1. 29 10月, 2009 2 次提交
  2. 28 10月, 2009 3 次提交
  3. 24 10月, 2009 2 次提交
  4. 23 10月, 2009 1 次提交
  5. 21 10月, 2009 2 次提交
  6. 20 10月, 2009 2 次提交
  7. 19 10月, 2009 4 次提交
  8. 15 10月, 2009 3 次提交
  9. 13 10月, 2009 4 次提交
    • E
      tcp: replace ehash_size by ehash_mask · f373b53b
      Eric Dumazet 提交于
      Storing the mask (size - 1) instead of the size allows fast path to be
      a bit faster.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f373b53b
    • A
      net: Introduce recvmmsg socket syscall · a2e27255
      Arnaldo Carvalho de Melo 提交于
      Meaning receive multiple messages, reducing the number of syscalls and
      net stack entry/exit operations.
      
      Next patches will introduce mechanisms where protocols that want to
      optimize this operation will provide an unlocked_recvmsg operation.
      
      This takes into account comments made by:
      
      . Paul Moore: sock_recvmsg is called only for the first datagram,
        sock_recvmsg_nosec is used for the rest.
      
      . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
        works in the same fashion as the ppoll one.
      
        If the underlying protocol returns a datagram with MSG_OOB set, this
        will make recvmmsg return right away with as many datagrams (+ the OOB
        one) it has received so far.
      
      . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
        datagrams and then recvmsg returns an error, recvmmsg will return
        the successfully received datagrams, store the error and return it
        in the next call.
      
      This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
      where we will be able to acquire the lock only at batch start and end, not at
      every underlying recvmsg call.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a2e27255
    • N
      net: Generalize socket rx gap / receive queue overflow cmsg · 3b885787
      Neil Horman 提交于
      Create a new socket level option to report number of queue overflows
      
      Recently I augmented the AF_PACKET protocol to report the number of frames lost
      on the socket receive queue between any two enqueued frames.  This value was
      exported via a SOL_PACKET level cmsg.  AFter I completed that work it was
      requested that this feature be generalized so that any datagram oriented socket
      could make use of this option.  As such I've created this patch, It creates a
      new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a
      SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue
      overflowed between any two given frames.  It also augments the AF_PACKET
      protocol to take advantage of this new feature (as it previously did not touch
      sk->sk_drops, which this patch uses to record the overflow count).  Tested
      successfully by me.
      
      Notes:
      
      1) Unlike my previous patch, this patch simply records the sk_drops value, which
      is not a number of drops between packets, but rather a total number of drops.
      Deltas must be computed in user space.
      
      2) While this patch currently works with datagram oriented protocols, it will
      also be accepted by non-datagram oriented protocols. I'm not sure if thats
      agreeable to everyone, but my argument in favor of doing so is that, for those
      protocols which aren't applicable to this option, sk_drops will always be zero,
      and reporting no drops on a receive queue that isn't used for those
      non-participating protocols seems reasonable to me.  This also saves us having
      to code in a per-protocol opt in mechanism.
      
      3) This applies cleanly to net-next assuming that commit
      97775007 (my af packet cmsg patch) is reverted
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b885787
    • J
      mac80211: document ieee80211_rx() context requirement · d20ef63d
      Johannes Berg 提交于
      ieee80211_rx() must be called with softirqs disabled
      since the networking stack requires this for netif_rx()
      and some code in mac80211 can assume that it can not
      be processing its own tasklet and this call at the same
      time.
      
      It may be possible to remove this requirement after a
      careful audit of mac80211 and doing any needed locking
      improvements in it along with disabling softirqs around
      netif_rx(). An alternative might be to push all packet
      processing to process context in mac80211, instead of
      to the tasklet, and add other synchronisation.
      Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      d20ef63d
  10. 12 10月, 2009 1 次提交
  11. 08 10月, 2009 3 次提交
    • E
      udp: dynamically size hash tables at boot time · f86dcc5a
      Eric Dumazet 提交于
      UDP_HTABLE_SIZE was initialy defined to 128, which is a bit small for
      several setups.
      
      4000 active UDP sockets -> 32 sockets per chain in average. An
      incoming frame has to lookup all sockets to find best match, so long
      chains hurt latency.
      
      Instead of a fixed size hash table that cant be perfect for every
      needs, let UDP stack choose its table size at boot time like tcp/ip
      route, using alloc_large_system_hash() helper
      
      Add an optional boot parameter, uhash_entries=x so that an admin can
      force a size between 256 and 65536 if needed, like thash_entries and
      rhash_entries.
      
      dmesg logs two new lines :
      [    0.647039] UDP hash table entries: 512 (order: 0, 4096 bytes)
      [    0.647099] UDP Lite hash table entries: 512 (order: 0, 4096 bytes)
      
      Maximal size on 64bit arches would be 65536 slots, ie 1 MBytes for non
      debugging spinlocks.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f86dcc5a
    • K
      cfg80211: add firmware and hardware version to wiphy · dfce95f5
      Kalle Valo 提交于
      It's useful to provide firmware and hardware version to user space and have a
      generic interface to retrieve them. Users can provide the version information
      in bug reports etc.
      
      Add fields for firmware and hardware version to struct wiphy.
      
      (Dropped nl80211 bits for now and modified remaining bits in favor of
      ethtool. -- JWL)
      
      Cc: Kalle Valo <kalle.valo@nokia.com>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      dfce95f5
    • J
      wext: refactor · 3d23e349
      Johannes Berg 提交于
      Refactor wext to
       * split out iwpriv handling
       * split out iwspy handling
       * split out procfs support
       * allow cfg80211 to have wireless extensions compat code
         w/o CONFIG_WIRELESS_EXT
      
      After this, drivers need to
       - select WIRELESS_EXT	- for wext support
       - select WEXT_PRIV	- for iwpriv support
       - select WEXT_SPY	- for iwspy support
      
      except cfg80211 -- which gets new hooks in wext-core.c
      and can then get wext handlers without CONFIG_WIRELESS_EXT.
      
      Wireless extensions procfs support is auto-selected
      based on PROC_FS and anything that requires the wext core
      (i.e. WIRELESS_EXT or CFG80211_WEXT).
      Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      3d23e349
  12. 07 10月, 2009 4 次提交
    • S
      net: mark net_proto_ops as const · ec1b4cf7
      Stephen Hemminger 提交于
      All usages of structure net_proto_ops should be declared const.
      Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec1b4cf7
    • E
      pkt_sched: gen_estimator: Dont report fake rate estimators · d250a5f9
      Eric Dumazet 提交于
      Jarek Poplawski a écrit :
      >
      >
      > Hmm... So you made me to do some "real" work here, and guess what?:
      > there is one serious checkpatch warning! ;-) Plus, this new parameter
      > should be added to the function description. Otherwise:
      > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      >
      > Thanks,
      > Jarek P.
      >
      > PS: I guess full "Don't" would show we really mean it...
      
      Okay :) Here is the last round, before the night !
      
      Thanks again
      
      [RFC] pkt_sched: gen_estimator: Don't report fake rate estimators
      
      We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
      is running.
      
      # tc -s -d qdisc
      qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
       Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
       rate 0bit 0pps backlog 0b 0p requeues 0
      
      User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
      one (because no estimator is active)
      
      After this patch, tc command output is :
      $ tc -s -d qdisc
      qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
       Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
       backlog 0b 0p requeues 0
      
      We add a parameter to gnet_stats_copy_rate_est() function so that
      it can use gen_estimator_active(bstats, r), as suggested by Jarek.
      
      This parameter can be NULL if check is not necessary, (htb for
      example has a mandatory rate estimator)
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d250a5f9
    • Y
      ipv6 sit: 6rd (IPv6 Rapid Deployment) Support. · fa857afc
      YOSHIFUJI Hideaki / 吉藤英明 提交于
      IPv6 Rapid Deployment (6rd; draft-ietf-softwire-ipv6-6rd) builds upon
      mechanisms of 6to4 (RFC3056) to enable a service provider to rapidly
      deploy IPv6 unicast service to IPv4 sites to which it provides
      customer premise equipment.  Like 6to4, it utilizes stateless IPv6 in
      IPv4 encapsulation in order to transit IPv4-only network
      infrastructure.  Unlike 6to4, a 6rd service provider uses an IPv6
      prefix of its own in place of the fixed 6to4 prefix.
      
      With this option enabled, the SIT driver offers 6rd functionality by
      providing additional ioctl API to configure the IPv6 Prefix for in
      stead of static 2002::/16 for 6to4.
      
      Original patch was done by Alexandre Cassen <acassen@freebox.fr>
      based on old Internet-Draft.
      Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fa857afc
    • E
      net: speedup sk_wake_async() · bcdce719
      Eric Dumazet 提交于
      An incoming datagram must bring into cpu cache *lot* of cache lines,
      in particular : (other parts omitted (hash chains, ip route cache...))
      
      On 32bit arches :
      
      offsetof(struct sock, sk_rcvbuf)       =0x30    (read)
      offsetof(struct sock, sk_lock)         =0x34   (rw)
      
      offsetof(struct sock, sk_sleep)        =0x50 (read)
      offsetof(struct sock, sk_rmem_alloc)   =0x64   (rw)
      offsetof(struct sock, sk_receive_queue)=0x74   (rw)
      
      offsetof(struct sock, sk_forward_alloc)=0x98   (rw)
      
      offsetof(struct sock, sk_callback_lock)=0xcc    (rw)
      offsetof(struct sock, sk_drops)        =0xd8 (read if we add dropcount support, rw if frame dropped)
      offsetof(struct sock, sk_filter)       =0xf8    (read)
      
      offsetof(struct sock, sk_socket)       =0x138 (read)
      
      offsetof(struct sock, sk_data_ready)   =0x15c   (read)
      
      
      We can avoid sk->sk_socket and socket->fasync_list referencing on sockets
      with no fasync() structures. (socket->fasync_list ptr is probably already in cache
      because it shares a cache line with socket->wait, ie location pointed by sk->sk_sleep)
      
      This avoids one cache line load per incoming packet for common cases (no fasync())
      
      We can leave (or even move in a future patch) sk->sk_socket in a cold location
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bcdce719
  13. 05 10月, 2009 2 次提交
  14. 01 10月, 2009 1 次提交
  15. 29 9月, 2009 1 次提交
  16. 27 9月, 2009 1 次提交
  17. 25 9月, 2009 1 次提交
  18. 24 9月, 2009 2 次提交
    • A
      sysctl: remove "struct file *" argument of ->proc_handler · 8d65af78
      Alexey Dobriyan 提交于
      It's unused.
      
      It isn't needed -- read or write flag is already passed and sysctl
      shouldn't care about the rest.
      
      It _was_ used in two places at arch/frv for some reason.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8d65af78
    • A
      9p: Add fscache support to 9p · 60e78d2c
      Abhishek Kulkarni 提交于
      This patch adds a persistent, read-only caching facility for
      9p clients using the FS-Cache caching backend.
      
      When the fscache facility is enabled, each inode is associated
      with a corresponding vcookie which is an index into the FS-Cache
      indexing tree. The FS-Cache indexing tree is indexed at 3 levels:
      - session object associated with each mount.
      - inode/vcookie
      - actual data (pages)
      
      A cache tag is chosen randomly for each session. These tags can
      be read off /sys/fs/9p/caches and can be passed as a mount-time
      parameter to re-attach to the specified caching session.
      Signed-off-by: NAbhishek Kulkarni <adkulkar@umail.iu.edu>
      Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>
      60e78d2c
  19. 15 9月, 2009 1 次提交
    • J
      pkt_sched: Fix tx queue selection in tc_modify_qdisc · 926e61b7
      Jarek Poplawski 提交于
      After the recent mq change there is the new select_queue qdisc class
      method used in tc_modify_qdisc, but it works OK only for direct child
      qdiscs of mq qdisc. Grandchildren always get the first tx queue, which
      would give wrong qdisc_root etc. results (e.g. for sch_htb as child of
      sch_prio). This patch fixes it by using parent's dev_queue for such
      grandchildren qdiscs. The select_queue method's return type is changed
      BTW.
      
      With feedback from: Patrick McHardy <kaber@trash.net>
      Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      926e61b7