1. 03 8月, 2006 7 次提交
  2. 22 7月, 2006 1 次提交
  3. 18 7月, 2006 1 次提交
  4. 14 7月, 2006 1 次提交
    • H
      [NET]: Update frag_list in pskb_trim · 27b437c8
      Herbert Xu 提交于
      When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of
      the packet, the frag_list is not updated to reflect the trimming.  This
      will usually work fine until you hit something that uses the packet length
      or tail from the frag_list.
      
      Examples include esp_output and ip_fragment.
      
      Another problem caused by this is that you can end up with a linear packet
      with a frag_list attached.
      
      It is possible to get away with this if we audit everything to make sure
      that they always consult skb->len before going down onto frag_list.  In
      fact we can do the samething for the paged part as well to avoid copying
      the data area of the skb.  For now though, let's do the conservative fix
      and update frag_list.
      
      Many thanks to Marco Berizzi for helping me to track down this bug.
      
      This 4-year old bug took 3 months to track down.  Marco was very patient
      indeed :)
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      27b437c8
  5. 13 7月, 2006 1 次提交
  6. 09 7月, 2006 1 次提交
  7. 08 7月, 2006 1 次提交
  8. 04 7月, 2006 3 次提交
  9. 01 7月, 2006 1 次提交
  10. 30 6月, 2006 5 次提交
    • A
      [NET]: make skb_release_data() static · 5bba1712
      Adrian Bunk 提交于
      skb_release_data() no longer has any users in other files.
      Signed-off-by: NAdrian Bunk <bunk@stusta.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5bba1712
    • C
      [AF_UNIX]: Datagram getpeersec · 877ce7c1
      Catherine Zhang 提交于
      This patch implements an API whereby an application can determine the
      label of its peer's Unix datagram sockets via the auxiliary data mechanism of
      recvmsg.
      
      Patch purpose:
      
      This patch enables a security-aware application to retrieve the
      security context of the peer of a Unix datagram socket.  The application
      can then use this security context to determine the security context for
      processing on behalf of the peer who sent the packet.
      
      Patch design and implementation:
      
      The design and implementation is very similar to the UDP case for INET
      sockets.  Basically we build upon the existing Unix domain socket API for
      retrieving user credentials.  Linux offers the API for obtaining user
      credentials via ancillary messages (i.e., out of band/control messages
      that are bundled together with a normal message).  To retrieve the security
      context, the application first indicates to the kernel such desire by
      setting the SO_PASSSEC option via getsockopt.  Then the application
      retrieves the security context using the auxiliary data mechanism.
      
      An example server application for Unix datagram socket should look like this:
      
      toggle = 1;
      toggle_len = sizeof(toggle);
      
      setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len);
      recvmsg(sockfd, &msg_hdr, 0);
      if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
          cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
          if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
              cmsg_hdr->cmsg_level == SOL_SOCKET &&
              cmsg_hdr->cmsg_type == SCM_SECURITY) {
              memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
          }
      }
      
      sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow
      a server socket to receive security context of the peer.
      
      Testing:
      
      We have tested the patch by setting up Unix datagram client and server
      applications.  We verified that the server can retrieve the security context
      using the auxiliary data mechanism of recvmsg.
      Signed-off-by: NCatherine Zhang <cxzhang@watson.ibm.com>
      Acked-by: NAcked-by: James Morris <jmorris@namei.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      877ce7c1
    • H
      [NET]: Make illegal_highdma more anal · 3d3a8533
      Herbert Xu 提交于
      Rather than having illegal_highdma as a macro when HIGHMEM is off, we
      can turn it into an inline function that returns zero.  This will catch
      callers that give it bad arguments.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3d3a8533
    • D
      [NETLINK]: Encapsulate eff_cap usage within security framework. · c7bdb545
      Darrel Goeddel 提交于
      This patch encapsulates the usage of eff_cap (in netlink_skb_params) within
      the security framework by extending security_netlink_recv to include a required
      capability parameter and converting all direct usage of eff_caps outside
      of the lsm modules to use the interface.  It also updates the SELinux
      implementation of the security_netlink_send and security_netlink_recv
      hooks to take advantage of the sid in the netlink_skb_params struct.
      This also enables SELinux to perform auditing of netlink capability checks.
      Please apply, for 2.6.18 if possible.
      Signed-off-by: NDarrel Goeddel <dgoeddel@trustedcs.com>
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Acked-by: NJames Morris <jmorris@namei.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7bdb545
    • H
      [NET]: Added GSO header verification · 576a30eb
      Herbert Xu 提交于
      When GSO packets come from an untrusted source (e.g., a Xen guest domain),
      we need to verify the header integrity before passing it to the hardware.
      
      Since the first step in GSO is to verify the header, we can reuse that
      code by adding a new bit to gso_type: SKB_GSO_DODGY.  Packets with this
      bit set can only be fed directly to devices with the corresponding bit
      NETIF_F_GSO_ROBUST.  If the device doesn't have that bit, then the skb
      is fed to the GSO engine which will allow the packet to be sent to the
      hardware if it passes the header check.
      
      This patch changes the sg flag to a full features flag.  The same method
      can be used to implement TSO ECN support.  We simply have to mark packets
      with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
      NETIF_F_TSO_ECN can accept them.  The GSO engine can either fully segment
      the packet, or segment the first MTU and pass the rest to the hardware for
      further segmentation.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      576a30eb
  11. 26 6月, 2006 5 次提交
  12. 23 6月, 2006 8 次提交
    • O
      [PATCH] list: use list_replace_init() instead of list_splice_init() · 626ab0e6
      Oleg Nesterov 提交于
      list_splice_init(list, head) does unneeded job if it is known that
      list_empty(head) == 1.  We can use list_replace_init() instead.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      626ab0e6
    • R
      [NET]: fix net-core kernel-doc · f4b8ea78
      Randy Dunlap 提交于
      Warning(/var/linsrc/linux-2617-g4//include/linux/skbuff.h:304): No description found for parameter 'dma_cookie'
      Warning(/var/linsrc/linux-2617-g4//include/net/sock.h:1274): No description found for parameter 'copied_early'
      Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'chan'
      Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'event'
      Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f4b8ea78
    • H
      [NET]: Added GSO toggle · 37c3185a
      Herbert Xu 提交于
      This patch adds a generic segmentation offload toggle that can be turned
      on/off for each net device.  For now it only supports in TCPv4.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      37c3185a
    • H
      [NET]: Add software TSOv4 · f4c50d99
      Herbert Xu 提交于
      This patch adds the GSO implementation for IPv4 TCP.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f4c50d99
    • H
      [NET]: Add generic segmentation offload · f6a78bfc
      Herbert Xu 提交于
      This patch adds the infrastructure for generic segmentation offload.
      The idea is to tap into the potential savings of TSO without hardware
      support by postponing the allocation of segmented skb's until just
      before the entry point into the NIC driver.
      
      The same structure can be used to support software IPv6 TSO, as well as
      UFO and segmentation offload for other relevant protocols, e.g., DCCP.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6a78bfc
    • H
      [NET]: Merge TSO/UFO fields in sk_buff · 7967168c
      Herbert Xu 提交于
      Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
      going to scale if we add any more segmentation methods (e.g., DCCP).  So
      let's merge them.
      
      They were used to tell the protocol of a packet.  This function has been
      subsumed by the new gso_type field.  This is essentially a set of netdev
      feature bits (shifted by 16 bits) that are required to process a specific
      skb.  As such it's easy to tell whether a given device can process a GSO
      skb: you just have to and the gso_type field and the netdev's features
      field.
      
      I've made gso_type a conjunction.  The idea is that you have a base type
      (e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
      For example, if we add a hardware TSO type that supports ECN, they would
      declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
      have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
      packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
      to be emulated in software.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7967168c
    • H
      [NET]: Prevent transmission after dev_deactivate · d4828d85
      Herbert Xu 提交于
      The dev_deactivate function has bit-rotted since the introduction of
      lockless drivers.  In particular, the spin_unlock_wait call at the end
      has no effect on the xmit routine of lockless drivers.
      
      With a little bit of work, we can make it much more useful by providing
      the guarantee that when it returns, no more calls to the xmit routine
      of the underlying driver will be made.
      
      The idea is simple.  There are two entry points in to the xmit routine.
      The first comes from dev_queue_xmit.  That one is easily stopped by
      using synchronize_rcu.  This works because we set the qdisc to noop_qdisc
      before the synchronize_rcu call.  That in turn causes all subsequent
      packets sent to dev_queue_xmit to be dropped.  The synchronize_rcu call
      also ensures all outstanding calls leave their critical section.
      
      The other entry point is from qdisc_run.  Since we now have a bit that
      indicates whether it's running, all we have to do is to wait until the
      bit is off.
      
      I've removed the loop to wait for __LINK_STATE_SCHED to clear.  This is
      useless because netif_wake_queue can cause it to be set again.  It is
      also harmless because we've disarmed qdisc_run.
      
      I've also removed the spin_unlock_wait on xmit_lock because its only
      purpose of making sure that all outstanding xmit_lock holders have
      exited is also given by dev_watchdog_down.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d4828d85
    • H
      [NET]: Avoid allocating skb in skb_pad · 5b057c6b
      Herbert Xu 提交于
      First of all it is unnecessary to allocate a new skb in skb_pad since
      the existing one is not shared.  More importantly, our hard_start_xmit
      interface does not allow a new skb to be allocated since that breaks
      requeueing.
      
      This patch uses pskb_expand_head to expand the existing skb and linearize
      it if needed.  Actually, someone should sift through every instance of
      skb_pad on a non-linear skb as they do not fit the reasons why this was
      originally created.
      
      Incidentally, this fixes a minor bug when the skb is cloned (tcpdump,
      TCP, etc.).  As it is skb_pad will simply write over a cloned skb.  Because
      of the position of the write it is unlikely to cause problems but still
      it's best if we don't do it.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5b057c6b
  13. 18 6月, 2006 5 次提交
    • H
      [ETHTOOL]: Fix UFO typo · 47552c4e
      Herbert Xu 提交于
      The function ethtool_get_ufo was referring to ETHTOOL_GTSO instead of
      ETHTOOL_GUFO.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      47552c4e
    • H
      [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM · 8648b305
      Herbert Xu 提交于
      The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
      identically so we test for them in quite a few places.  For the sake
      of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two.  We
      also test the disjunct of NETIF_F_IP_CSUM and the other two in various
      places, for that purpose I've added NETIF_F_ALL_CSUM.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8648b305
    • H
      [NET]: Warn in __skb_trim if skb is paged · 3cc0e873
      Herbert Xu 提交于
      It's better to warn and fail rather than rarely triggering BUG on paths
      that incorrectly call skb_trim/__skb_trim on a non-linear skb.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3cc0e873
    • H
      [NET]: Clean up skb_linearize · 364c6bad
      Herbert Xu 提交于
      The linearisation operation doesn't need to be super-optimised.  So we can
      replace __skb_linearize with __pskb_pull_tail which does the same thing but
      is more general.
      
      Also, most users of skb_linearize end up testing whether the skb is linear
      or not so it helps to make skb_linearize do just that.
      
      Some callers of skb_linearize also use it to copy cloned data, so it's
      useful to have a new function skb_linearize_cow to copy the data if it's
      either non-linear or cloned.
      
      Last but not least, I've removed the gfp argument since nobody uses it
      anymore.  If it's ever needed we can easily add it back.
      
      Misc bugs fixed by this patch:
      
      * via-velocity error handling (also, no SG => no frags)
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      364c6bad
    • H
      [NET]: Add netif_tx_lock · 932ff279
      Herbert Xu 提交于
      Various drivers use xmit_lock internally to synchronise with their
      transmission routines.  They do so without setting xmit_lock_owner.
      This is fine as long as netpoll is not in use.
      
      With netpoll it is possible for deadlocks to occur if xmit_lock_owner
      isn't set.  This is because if a printk occurs while xmit_lock is held
      and xmit_lock_owner is not set can cause netpoll to attempt to take
      xmit_lock recursively.
      
      While it is possible to resolve this by getting netpoll to use
      trylock, it is suboptimal because netpoll's sole objective is to
      maximise the chance of getting the printk out on the wire.  So
      delaying or dropping the message is to be avoided as much as possible.
      
      So the only alternative is to always set xmit_lock_owner.  The
      following patch does this by introducing the netif_tx_lock family of
      functions that take care of setting/unsetting xmit_lock_owner.
      
      I renamed xmit_lock to _xmit_lock to indicate that it should not be
      used directly.  I didn't provide irq versions of the netif_tx_lock
      functions since xmit_lock is meant to be a BH-disabling lock.
      
      This is pretty much a straight text substitution except for a small
      bug fix in winbond.  It currently uses
      netif_stop_queue/spin_unlock_wait to stop transmission.  This is
      unsafe as an IRQ can potentially wake up the queue.  So it is safer to
      use netif_tx_disable.
      
      The hamradio bits used spin_lock_irq but it is unnecessary as
      xmit_lock must never be taken in an IRQ handler.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      932ff279