1. 05 Mar, 2014 2 commits
  2. 04 Mar, 2014 1 commit
  3. 28 Feb, 2014 2 commits
    • net: move net_device priv_flags out from UAPI · 7aa98047
      Committed by Luis R. Rodriguez
      These are private to the kernel, and they're unstable
      anyway and can be shuffled at will (see 080e4130)
      so any userspace application relying on them is on crack.
      
      Test compiled with allyesconfig.
      
      mcgrof@drvbp1 /pub/mem/mcgrof/net-next (git::master)$ make allyesconfig
      mcgrof@drvbp1 /pub/mem/mcgrof/net-next (git::master)$ time make -j 20
      ...
        BUILD   arch/x86/boot/bzImage
      Setup is 16992 bytes (padded to 17408 bytes).
      System is 56153 kB
      CRC 721d2751
      Kernel: arch/x86/boot/bzImage is ready  (#1)
      real    19m35.744s
      user    280m37.984s
      sys     27m54.104s
      
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: kdoc struct net_device flags and priv_flags · 589f5816
      Committed by Luis R. Rodriguez
      We have documentation for these flags, but it is scattered
      all over the place. #defines don't allow documentation to be
      written easily, so to start bringing the documentation
      together, use the kdoc enum practice but keep the defines so
      that userspace can still #ifdef them.
      
      I've verified the same values are assigned before and after
      with a simple userspace test program [0] and checksumming the
      output.
      
      [0] http://drvbp1.linux-foundation.org/~mcgrof/kdoc/netdev_flags/
      
      mcgrof@gnat ~/tmp $ ./check-flags | sha1sum
      0ec5b6b1840aa3bb9ce464e61c564820871c92c3  -
      
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 27 Feb, 2014 5 commits
    • tcp: switch rtt estimations to usec resolution · 740b0f18
      Committed by Eric Dumazet
      Upcoming congestion controls for TCP require usec resolution for RTT
      estimations. Millisecond resolution is simply not enough these days.
      
      FQ/pacing in DC environments also requires this change for finer control
      and removal of bimodal behavior due to the current hack in
      tcp_update_pacing_rate() for 'small rtt'.
      
      TCP_CONG_RTT_STAMP is no longer needed.
      
      As Julian Anastasov pointed out, we need to keep user compatibility :
      tcp_metrics used to export RTT and RTTVAR in msec resolution,
      so we added RTT_US and RTTVAR_US. An iproute2 patch is needed
      to use the new attributes if provided by the kernel.
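
To illustrate why millisecond resolution is not enough, here is a minimal sketch of what happens to the 32 usec srtt quoted in this message when it is squeezed into a whole-millisecond field. The helper names are hypothetical and the integer-division conversion is an assumption made for illustration; it is not taken from the kernel or from iproute2.

```python
def usec_to_msec_truncated(rtt_us: int) -> int:
    """Whole milliseconds, as a millisecond-resolution field would hold them.

    Assumption: plain integer division; the kernel's real conversion may differ.
    """
    return rtt_us // 1000


def format_rtt_ms(rtt_us: int) -> str:
    """Render a microsecond RTT as fractional milliseconds with 3 decimals."""
    return f"{rtt_us / 1000:.3f}"


if __name__ == "__main__":
    srtt_us = 32  # srtt quoted in the commit message for a 10Gbit link
    print("millisecond field:", usec_to_msec_truncated(srtt_us))
    print("usec-based display (ms):", format_rtt_ms(srtt_us))
```

With a millisecond field, every sub-millisecond RTT collapses to the same value, which is the loss of information that the RTT_US and RTTVAR_US attributes are meant to avoid.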
      
      In this example ss command displays a srtt of 32 usecs (10Gbit link)
      
      lpk51:~# ./ss -i dst lpk52
      Netid  State      Recv-Q Send-Q   Local Address:Port       Peer Address:Port
      tcp    ESTAB      0      1         10.246.11.51:42959      10.246.11.52:64614
               cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448 cwnd:10 send 3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559
      
      Updated iproute2 ip command displays :
      
      lpk51:~# ./ip tcp_metrics | grep 10.246.11.52
      10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source 10.246.11.51
      
      Old binary displays :
      
      lpk51:~# ip tcp_metrics | grep 10.246.11.52
      10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source 10.246.11.51
      
      With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Larry Brakmo <brakmo@google.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: yet another new IPV6_MTU_DISCOVER option IPV6_PMTUDISC_OMIT · 0b95227a
      Committed by Hannes Frederic Sowa
      This option has the same semantics as IP_PMTUDISC_OMIT for IPv4, which
      was recently introduced. It doesn't honor the path mtu discovered by the
      host but, in contrast to IPV6_PMTUDISC_INTERFACE, allows the generation of
      fragments if the packet size exceeds the MTU of the outgoing interface.
      
      Fixes: 93b36cf3 ("ipv6: support IPV6_PMTU_INTERFACE on sockets")
      Cc: Florian Weimer <fweimer@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMIT · 1b346576
      Committed by Hannes Frederic Sowa
      IP_PMTUDISC_INTERFACE has a design error: because it does not allow the
      generation of fragments if the interface mtu is exceeded, it is very
      hard to make use of this option in already deployed name server software
      for which I introduced this option.
      
      This patch adds yet another new IP_MTU_DISCOVER option that neither honors
      any path mtu information nor accepts new icmp notifications destined for
      the socket this option is enabled on. Outgoing fragmentation is still
      allowed in case the packet size exceeds the outgoing interface mtu.
      
      As such this new option can be used as a drop-in replacement for
      IP_PMTUDISC_DONT, which is currently in use by most name server software
      making the adoption of this option very smooth and easy.
      
      The original advantage of IP_PMTUDISC_INTERFACE is still maintained:
      ignoring incoming path MTU updates and not honoring discovered path MTUs
      in the output path.
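
As a rough illustration of the "drop-in replacement for IP_PMTUDISC_DONT" idea, the sketch below tries IP_PMTUDISC_OMIT on a socket and falls back to IP_PMTUDISC_DONT when the kernel rejects it. The numeric constants are copied from Linux's `<linux/in.h>` as an assumption, because Python's `socket` module may not export them; the helper name `set_pmtudisc` is hypothetical.

```python
import socket
import sys

# Assumed values from Linux's <linux/in.h>; verify against your headers.
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DONT = 0
IP_PMTUDISC_OMIT = 5


def set_pmtudisc(sock):
    """Prefer IP_PMTUDISC_OMIT, fall back to IP_PMTUDISC_DONT.

    Returns the mode that was applied, or None if nothing was set
    (for example on a non-Linux platform, where option 10 means
    something else and must not be touched).
    """
    if not sys.platform.startswith("linux"):
        return None
    for mode in (IP_PMTUDISC_OMIT, IP_PMTUDISC_DONT):
        try:
            sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, mode)
            return mode
        except OSError:
            # Older kernels reject the unknown mode; try the next one.
            continue
    return None
```

A name server could call `set_pmtudisc()` on its UDP socket right after creating it; on kernels without the new option the behavior is the same as before.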
      
      Fixes: 482fc609 ("ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE")
      Cc: Florian Weimer <fweimer@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: add mib counters to track zero window transitions · 8e165e20
      Committed by Florian Westphal
      Three counters are added:
      - one to track when we went from non-zero to zero window
      - one to track the reverse
      - one counter incremented when we want to announce zero window,
        but can't because we would shrink current window.
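
A minimal sketch of how such counters could be read from user space follows. It parses text in the `/proc/net/netstat` layout (a header line of names followed by a line of values, both sharing a prefix). The three counter names are assumed to be the ones exported under `TcpExt`; they are not stated in this commit message, so check them against your kernel.

```python
# Assumed counter names; not given in the commit message above.
ZERO_WINDOW_COUNTERS = (
    "TCPToZeroWindowAdv",    # non-zero -> zero window
    "TCPFromZeroWindowAdv",  # zero -> non-zero window
    "TCPWantZeroWindowAdv",  # wanted zero window but could not shrink
)


def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {prefix: {name: value}}."""
    lines = [line for line in text.splitlines() if line.strip()]
    tables = {}
    # Lines come in pairs: "Prefix: name name ..." then "Prefix: 1 2 ...".
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if prefix != value_prefix:
            continue  # malformed pair, skip it
        tables[prefix] = dict(zip(names.split(), map(int, numbers.split())))
    return tables


def zero_window_counters(text):
    """Return the three zero-window counters (None when a counter is absent)."""
    tcpext = parse_netstat(text).get("TcpExt", {})
    return {name: tcpext.get(name) for name in ZERO_WINDOW_COUNTERS}
```

On a Linux host this could be used as `zero_window_counters(open("/proc/net/netstat").read())`.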
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: order MPLS ethertypes numerically · 2ebe21fd
      Committed by Neil Jerram
      All ethertypes other than ETH_P_MPLS_UC, ETH_P_MPLS_MC and
      ETH_P_ATMMPOA were already ordered numerically.  This commit moves
      those three ETH_P_... values into correct numerical order too.
      Signed-off-by: Neil Jerram <Neil.Jerram@metaswitch.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 17 Feb, 2014 1 commit
    • ipsec: add support of limited SA dump · d3623099
      Committed by Nicolas Dichtel
      The goal of this patch is to allow userland to dump only a subset of SAs by
      specifying a filter during the dump.
      The kernel is in charge of filtering SAs, which avoids generating useless
      netlink traffic (and also saves some CPU cycles). This is particularly useful
      when a large number of SAs are set on the system.
      
      Note that I removed the union in struct xfrm_state_walk to fix a problem on arm.
      struct netlink_callback->args is defined as an array of 6 longs, and the first long
      is used in xfrm code to flag the cb as initialized. Hence, we must have:
      sizeof(struct xfrm_state_walk) <= sizeof(long) * 5.
      With the union, this was false on arm (sizeof(struct xfrm_state_walk) was
      sizeof(long) * 7), due to the padding.
      In fact, whatever the arch is, this union seems useless: there will always be
      padding after it. Removing it will not increase the size of this struct (and
      will reduce it on arm).
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  6. 15 Feb, 2014 2 commits
  7. 13 Feb, 2014 13 commits
  8. 08 Feb, 2014 1 commit
  9. 07 Feb, 2014 1 commit
    • inet: defines IPPROTO_* needed for module alias generation · ee262ad8
      Committed by Jan Moskyto Matejka
      Commit cfd280c9 ("net: sync some IP headers with glibc") changed a set of
      defines to an enum (with no explanation why), which introduced a bug
      in module mip6, where aliases are generated using the IPPROTO_* defines;
      mip6 doesn't load if require_module is called with the aliases from
      xfrm_get_type().
      
      Revert this change back to defines to fix the aliases.
      
      modinfo mip6 (before this change)
      alias:          xfrm-type-10-IPPROTO_DSTOPTS
      alias:          xfrm-type-10-IPPROTO_ROUTING
      
      modinfo mip6 (after this change)
      alias:          xfrm-type-10-43
      alias:          xfrm-type-10-60
      Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 05 Feb, 2014 5 commits
    • cfg80211: regulatory introduce maximum bandwidth calculation · 97524820
      Committed by Janusz Dziedzic
      If we get a regulatory request with a rule where
      max_bandwidth_khz is set to 0, handle this case
      as a special one.
      
      If max_bandwidth_khz == 0, we should calculate the maximum
      available bandwidth based on all frequency-contiguous rules.
      To request auto calculation we just have to set:
      
      country PL: DFS-ETSI
              (2402 - 2482 @ 40), (N/A, 20)
              (5170 - 5250 @ AUTO), (N/A, 20)
              (5250 - 5330 @ AUTO), (N/A, 20), DFS
              (5490 - 5710 @ 80), (N/A, 27), DFS
      
      This means we will calculate the maximum bandwidth for rules
      where AUTO (N/A) was set: 160MHz (5330 - 5170) in the example above.
      So we will get:
              (5170 - 5250 @ 160), (N/A, 20)
              (5250 - 5330 @ 160), (N/A, 20), DFS
      
      In another case:
      country FR: DFS-ETSI
              (2402 - 2482 @ 40), (N/A, 20)
              (5170 - 5250 @ AUTO), (N/A, 20)
              (5250 - 5330 @ 80), (N/A, 20), DFS
              (5490 - 5710 @ 80), (N/A, 27), DFS
      
      We will get 80MHz (5250 - 5170):
              (5170 - 5250 @ 80), (N/A, 20)
              (5250 - 5330 @ 80), (N/A, 20), DFS
      
      Based on these calculations we will set the correct channel
      bandwidth flags (e.g. IEEE80211_CHAN_NO_80MHZ).
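
The two examples above can be reproduced with a small sketch. It is not the cfg80211 implementation: rules are modelled as hypothetical `(start_mhz, end_mhz, bandwidth)` tuples sorted by frequency, and the choice to extend the contiguous span only across neighbouring AUTO rules is an assumption made so that the FR example yields 80MHz.

```python
AUTO = None  # stands in for max_bandwidth_khz == 0


def resolve_auto_bandwidth(rules):
    """Replace AUTO bandwidths with the width of the contiguous AUTO span.

    rules: list of (start_mhz, end_mhz, bw_mhz_or_AUTO), sorted by frequency.
    Assumption: the span stops at any neighbour with an explicit bandwidth.
    """
    resolved = []
    for i, (start, end, bw) in enumerate(rules):
        if bw is not AUTO:
            resolved.append((start, end, bw))
            continue
        lo, hi = start, end
        # Extend to the left over directly adjacent AUTO rules.
        j = i - 1
        while j >= 0 and rules[j][2] is AUTO and rules[j][1] == lo:
            lo = rules[j][0]
            j -= 1
        # Extend to the right over directly adjacent AUTO rules.
        j = i + 1
        while j < len(rules) and rules[j][2] is AUTO and rules[j][0] == hi:
            hi = rules[j][1]
            j += 1
        resolved.append((start, end, hi - lo))
    return resolved


if __name__ == "__main__":
    country_pl = [(2402, 2482, 40), (5170, 5250, AUTO),
                  (5250, 5330, AUTO), (5490, 5710, 80)]
    for rule in resolve_auto_bandwidth(country_pl):
        print(rule)
```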
      
      We don't need any changes in CRDA or internal regulatory.
      Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
      [extend nl80211 description a bit, fix typo]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211: fix scheduled scan RSSI matchset attribute confusion · ea73cbce
      Committed by Johannes Berg
      The scheduled scan matchsets were intended to be a list of filters,
      with the found BSS having to pass at least one of them to be passed
      to the host. When the RSSI attribute was added, however, this was
      broken and currently wpa_supplicant adds that attribute in its own
      matchset; however, it doesn't intend that to mean that anything
      that passes the RSSI filter should be passed to the host, instead
      it wants it to mean that everything needs to also have higher RSSI.
      
      This is semantically problematic because we have a list of filters
      like [ SSID1, SSID2, SSID3, RSSI ] with no real indication which
      one should be OR'ed and which one AND'ed.
      
      To fix this, move the RSSI filter attribute into each matchset. As
      we need to stay backward compatible, treat a matchset with only the
      RSSI attribute as a "default RSSI filter" for all other matchsets,
      but only if there are other matchsets (an RSSI-only matchset by
      itself is still desirable.)
      
      To make driver implementation easier, keep a global min_rssi_thold
      for the entire request as well. The only affected driver is ath6kl.
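
The backward-compatible interpretation described above can be sketched as follows. Matchsets are modelled as plain dicts with optional `ssid` and `rssi` keys, which is a hypothetical representation rather than the nl80211 attribute layout. How the global minimum is derived (the lowest threshold, or no threshold when any matchset lacks one) is also an assumption.

```python
def is_rssi_only(matchset):
    """True for a matchset that carries an RSSI threshold and nothing else."""
    return "rssi" in matchset and "ssid" not in matchset


def normalize_matchsets(matchsets):
    """Return (normalized_matchsets, min_rssi_thold).

    An RSSI-only matchset becomes the default RSSI filter for the other
    matchsets, but only when other matchsets exist; on its own it is kept.
    min_rssi_thold is None when at least one matchset has no threshold.
    """
    rssi_only = [m for m in matchsets if is_rssi_only(m)]
    others = [m for m in matchsets if not is_rssi_only(m)]
    if rssi_only and others:
        default_rssi = rssi_only[0]["rssi"]
        normalized = [{**m, "rssi": m.get("rssi", default_rssi)} for m in others]
    else:
        normalized = [dict(m) for m in matchsets]
    thresholds = [m.get("rssi") for m in normalized]
    if not thresholds or any(t is None for t in thresholds):
        min_rssi_thold = None
    else:
        min_rssi_thold = min(thresholds)
    return normalized, min_rssi_thold
```

With this shape, a legacy request such as `[SSID1, SSID2, RSSI]` turns into two matchsets that each carry their own RSSI threshold, so there is no longer any ambiguity about what is OR'ed and what is AND'ed.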
      
      I found this when I looked into the code after Raja Mani submitted
      a patch fixing the n_match_sets calculation to disregard the RSSI,
      but that patch didn't address the semantic issue.
      Reported-by: Raja Mani <rmani@qti.qualcomm.com>
      Acked-by: Luciano Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211: add Guard Interval support for set_bitrate_mask · 0b9323f6
      Committed by Janusz Dziedzic
      Allow forcing SGI or LGI, mainly for testing purposes.
      Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • cfg80211: Advertise maximum associated STAs in AP mode · b43504cf
      Committed by Jouni Malinen
      This allows drivers to advertise the maximum number of associated
      stations they support in AP mode (including P2P GO). User space
      applications can use this for a cleaner way of handling the limit (e.g.,
      hostapd rejecting IEEE 802.11 authentication without manual
      configuration of the limit) or to figure out what type of use cases can
      be executed with multiple devices before trying and failing.
      Signed-off-by: Jouni Malinen <j@w1.fi>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • cfg80211: Allow BSS hint to be provided for connect · 1df4a510
      Committed by Jouni Malinen
      This clarifies the expected driver behavior on the older
      NL80211_ATTR_MAC and NL80211_ATTR_WIPHY_FREQ attributes and adds a new
      set of similar attributes with a _HINT postfix to enable use of a
      recommendation of the initial BSS to choose. This can be helpful for
      some drivers that can avoid an additional full scan on connection
      request if the information is provided to them (user space tools like
      wpa_supplicant already have that information available based on earlier
      scans).
      
      In addition, this can be used to get more expected behavior for cases
      where a specific BSS should be picked first based on operations like
      Interworking network selection or WPS. These cases were already easily
      addressed with drivers that leave BSS selection to user space, but there
      was no convenient way to do this with drivers that take care of BSS
      selection internally without using the NL80211_ATTR_MAC which is not
      really desired since it is needed for other purposes to force the
      association to remain with the same BSS.
      Signed-off-by: Jouni Malinen <j@w1.fi>
      [add const, fix policy]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  11. 04 Feb, 2014 1 commit
  12. 29 Jan, 2014 3 commits
    • Btrfs: add support for inode properties · 63541927
      Committed by Filipe David Borba Manana
      This change adds infrastructure to allow for generic properties for
      inodes. Properties are name/value pairs that can be associated with
      inodes for different purposes. They are stored as xattrs with the
      prefix "btrfs."
      
      Properties can be inherited - this means when a directory inode has
      inheritable properties set, these are added to new inodes created
      under that directory. Further, subvolumes can also have properties
      associated with them, and they can be inherited from their parent
      subvolume. Naturally, directory properties have priority over subvolume
      properties (in practice a subvolume property is just a regular
      property associated with the root inode, objectid 256, of the
      subvolume's fs tree).
      
      This change also adds one specific property implementation, named
      "compression", whose values can be "lzo" or "zlib" and it's an
      inheritable property.
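
Because properties are stored as xattrs with the "btrfs." prefix, the mapping from a property to its xattr can be sketched in a few lines. The helper names are hypothetical, and `set_property` only works on Linux against a btrfs filesystem that supports properties; treat it as an illustration, not as the btrfs-progs implementation.

```python
import os

XATTR_PREFIX = "btrfs."
COMPRESSION_VALUES = ("lzo", "zlib")  # values named in the commit message


def property_xattr(name, value):
    """Map a property name/value pair to the (xattr name, xattr bytes) pair."""
    if name == "compression" and value not in COMPRESSION_VALUES:
        raise ValueError(f"unsupported compression value: {value!r}")
    return XATTR_PREFIX + name, value.encode()


def set_property(path, name, value):
    """Set a property on an inode by writing the corresponding xattr.

    Assumption: Linux only (os.setxattr), and 'path' lives on btrfs.
    """
    key, raw = property_xattr(name, value)
    os.setxattr(path, key, raw)
```

For example, `set_property("/mnt/testdir", "compression", "lzo")` on a directory would, given the inheritance rules above, cause new files created under it to carry the same property (the path here is a placeholder).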
      
      The corresponding changes to btrfs-progs were also implemented.
      A patch with xfstests for this feature will follow once there's
      agreement on this change/feature.
      
      Further, the script at the bottom of this commit message was used to
      do some benchmarks to measure any performance penalties of this feature.
      
      Basically the tests correspond to:
      
      Test 1 - create a filesystem and mount it with compress-force=lzo,
      then sequentially create N files of 64Kb each, measure how long it took
      to create the files, unmount the filesystem, mount the filesystem and
      perform an 'ls -lha' against the test directory holding the N files, and
      report the time the command took.
      
      Test 2 - create a filesystem and don't use any compression option when
      mounting it - instead set the compression property of the subvolume's
      root to 'lzo'. Then create N files of 64Kb, and report the time it took.
      Then unmount the filesystem, mount it again and perform an 'ls -lha' like
      in the former test. This means every single file ends up with a property
      (xattr) associated to it.
      
      Test 3 - same as test 2, but uses 4 properties - 3 are duplicates of the
      compression property, have no real effect other than adding more work
      when inheriting properties and taking more btree leaf space.
      
      Test 4 - same as test 3 but with 10 properties per file.
      
      Results (in seconds, and averages of 5 runs each), for different N
      numbers of files follow.
      
      * Without properties (test 1)
      
                          file creation time        ls -lha time
      10 000 files              3.49                   0.76
      100 000 files            47.19                   8.37
      1 000 000 files         518.51                 107.06
      
      * With 1 property (compression property set to lzo - test 2)
      
                          file creation time        ls -lha time
      10 000 files              3.63                    0.93
      100 000 files            48.56                    9.74
      1 000 000 files         537.72                  125.11
      
      * With 4 properties (test 3)
      
                          file creation time        ls -lha time
      10 000 files              3.94                    1.20
      100 000 files            52.14                   11.48
      1 000 000 files         572.70                  142.13
      
      * With 10 properties (test 4)
      
                          file creation time        ls -lha time
      10 000 files              4.61                    1.35
      100 000 files            58.86                   13.83
      1 000 000 files         656.01                  177.61
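
To put the tables in perspective, the relative overhead against the no-properties baseline can be computed directly from the reported numbers. The sketch below does this for the 1 000 000 files rows; the function and variable names are hypothetical, and the figures are copied from the tables above.

```python
# (file creation time, ls -lha time) in seconds for 1 000 000 files,
# keyed by the number of properties per file (0 = test 1, no properties).
RESULTS_1M = {
    0: (518.51, 107.06),
    1: (537.72, 125.11),
    4: (572.70, 142.13),
    10: (656.01, 177.61),
}


def overhead_percent(baseline, measured):
    """Relative slowdown of 'measured' versus 'baseline', in percent."""
    return (measured - baseline) / baseline * 100.0


def overhead_table(results):
    """Return {n_properties: (creation overhead %, ls overhead %)}."""
    base = results[0]
    return {
        n: tuple(overhead_percent(b, m) for b, m in zip(base, values))
        for n, values in results.items()
        if n != 0
    }


if __name__ == "__main__":
    for n, (create, ls) in sorted(overhead_table(RESULTS_1M).items()):
        print(f"{n:2d} properties: creation +{create:.1f}%, ls -lha +{ls:.1f}%")
```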
      
      The increased latencies with properties are essentially because of:
      
      *) When creating an inode, we now synchronously write 1 more item
         (an xattr item) for each property inherited from the parent dir
         (or subvolume). This could be done in an asynchronous way such
         as we do for dir index items (delayed-inode.c), which could help
         reduce the file creation latency;
      
      *) With properties, we now have larger fs trees. For this particular
         test each xattr item uses 75 bytes of leaf space in the fs tree.
         This could be less by using a new item for xattr items, instead of
         the current btrfs_dir_item, since we could cut the 'location' and
         'type' fields (saving 18 bytes) and maybe 'transid' too (saving a
         total of 26 bytes per xattr item) from the btrfs_dir_item type.
      
      Also tried batching the xattr insertions (ignoring proper hash
      collision handling, since it didn't exist) when creating files that
      inherit properties from their parent inode/subvolume, but the end
      results were (surprisingly) essentially the same.
      
      Test script:
      
      $ cat test.pl
        #!/usr/bin/perl -w
      
        use strict;
        use Time::HiRes qw(time);
        use constant NUM_FILES => 10_000;
        use constant FILE_SIZES => (64 * 1024);
        use constant DEV => '/dev/sdb4';
        use constant MNT_POINT => '/home/fdmanana/btrfs-tests/dev';
        use constant TEST_DIR => (MNT_POINT . '/testdir');
      
        system("mkfs.btrfs", "-l", "16384", "-f", DEV) == 0 or die "mkfs.btrfs failed!";
      
        # following line for testing without properties
        #system("mount", "-o", "compress-force=lzo", DEV, MNT_POINT) == 0 or die "mount failed!";
      
        # following 2 lines for testing with properties
        system("mount", DEV, MNT_POINT) == 0 or die "mount failed!";
        system("btrfs", "prop", "set", MNT_POINT, "compression", "lzo") == 0 or die "set prop failed!";
      
        system("mkdir", TEST_DIR) == 0 or die "mkdir failed!";
        my ($t1, $t2);
      
        $t1 = time();
        for (my $i = 1; $i <= NUM_FILES; $i++) {
            my $p = TEST_DIR . '/file_' . $i;
            open(my $f, '>', $p) or die "Error opening file!";
            $f->autoflush(1);
            for (my $j = 0; $j < FILE_SIZES; $j += 4096) {
                print $f ('A' x 4096) or die "Error writing to file!";
            }
            close($f);
        }
        $t2 = time();
        print "Time to create " . NUM_FILES . ": " . ($t2 - $t1) . " seconds.\n";
        system("umount", DEV) == 0 or die "umount failed!";
        system("mount", DEV, MNT_POINT) == 0 or die "mount failed!";
      
        $t1 = time();
        system("bash -c 'ls -lha " . TEST_DIR . " > /dev/null'") == 0 or die "ls failed!";
        $t2 = time();
        print "Time to ls -lha all files: " . ($t2 - $t1) . " seconds.\n";
        system("umount", DEV) == 0 or die "umount failed!";
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: add ioctl to export size of global metadata reservation · 01e219e8
      Committed by Jeff Mahoney
      btrfs filesystem df output will show the size of the metadata space
      and how much of it is used, and the user assumes that the difference
      is all usable space. Since that's not actually the case due to the
      global metadata reservation, we should provide the full picture to the
      user.
      
      This patch adds an ioctl that exports the size of the global metadata
      reservation so that btrfs filesystem df can report it.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: add ioctls to query/change feature bits online · 2eaa055f
      Committed by Jeff Mahoney
      There are some feature bits that require no offline setup and can
      be enabled online. I've only reviewed extended irefs, but there will
      probably be more.
      
      We introduce three new ioctls:
      - BTRFS_IOC_GET_SUPPORTED_FEATURES: query the kernel for supported features.
      - BTRFS_IOC_GET_FEATURES: query the kernel for enabled features on a per-fs
        basis, as well as for which features are changeable while mounted.
      - BTRFS_IOC_SET_FEATURES: change features on a per-fs basis.
      
      We introduce two new masks per feature set (_SAFE_SET and _SAFE_CLEAR) that
      allow us to define which features are safe to change at runtime.
      
      The failure modes for BTRFS_IOC_SET_FEATURES are as follows:
      - Enabling a completely unsupported feature: warns and returns -ENOTSUPP
      - Enabling a feature that can only be done offline: warns and returns -EPERM
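
The failure modes listed above amount to two mask checks, sketched here with plain integers. The feature bit values and the function name are hypothetical, the error names are returned as strings rather than real errno values, and treating an unsafe clear like an unsafe set (-EPERM) is an assumption based on the _SAFE_CLEAR mask; the commit message only lists the enabling cases.

```python
# Hypothetical feature bits, for illustration only.
FEATURE_A = 1 << 0   # supported, safe to set online
FEATURE_B = 1 << 1   # supported, offline-only
FEATURE_C = 1 << 2   # not supported by this kernel


def validate_feature_change(to_set, to_clear, supported, safe_set, safe_clear):
    """Return None if the change is allowed, else the error name as a string."""
    if to_set & ~supported:
        return "ENOTSUPP"   # enabling a completely unsupported feature
    if (to_set & ~safe_set) or (to_clear & ~safe_clear):
        return "EPERM"      # change can only be done offline
    return None


if __name__ == "__main__":
    supported = FEATURE_A | FEATURE_B
    print(validate_feature_change(FEATURE_A, 0, supported, FEATURE_A, 0))
    print(validate_feature_change(FEATURE_B, 0, supported, FEATURE_A, 0))
    print(validate_feature_change(FEATURE_C, 0, supported, FEATURE_A, 0))
```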
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <clm@fb.com>
  13. 28 Jan, 2014 1 commit
  14. 24 Jan, 2014 2 commits