1. 21 Jun, 2008 3 commits
    • E
      netns: Don't receive new packets in a dead network namespace. · b9f75f45
      Eric W. Biederman committed
      Alexey Dobriyan <adobriyan@gmail.com> writes:
      > Subject: ICMP sockets destruction vs ICMP packets oops
      
      > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
      > icmp_reply() wants ICMP socket.
      >
      > Steps to reproduce:
      >
      > 	launch shell in new netns
      > 	move real NIC to netns
      > 	setup routing
      > 	ping -i 0
      > 	exit from shell
      >
      > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      > IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > PGD 17f3cd067 PUD 17f3ce067 PMD 0 
      > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
      > CPU 0 
      > Modules linked in: usblp usbcore
      > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
      > RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
      > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
      > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
      > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
      > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
      > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
      > FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
      > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
      > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
      > Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
      >  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
      >  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
      > Call Trace:
      >  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
      >  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
      >  [<ffffffff803fd645>] icmp_echo+0x65/0x70
      >  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
      >  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
      >  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
      >  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
      >  [<ffffffff803be57d>] process_backlog+0x9d/0x100
      >  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
      >  [<ffffffff80237be5>] __do_softirq+0x75/0x100
      >  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
      >  [<ffffffff8020f085>] do_softirq+0x65/0xa0
      >  [<ffffffff80237af7>] irq_exit+0x97/0xa0
      >  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
      >  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
      >  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
      >  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
      >  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
      >  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
      >  [<ffffffff8040f380>] ? rest_init+0x70/0x80
      > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
      > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
      > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
      > RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      >  RSP <ffffffff8057fc30>
      > CR2: 0000000000000000
      > ---[ end trace ea161157b76b33e8 ]---
      > Kernel panic - not syncing: Aiee, killing interrupt handler!
      
      Receiving packets while we are cleaning up a network namespace is a
      racy proposition. It is possible when the packet arrives that we have
      removed some but not all of the state we need to fully process it.  We
      have the choice of either playing whack-a-mole with the cleanup routines
      or simply dropping packets when we don't have a network namespace to
      handle them.
      
      Since the check looks inexpensive in netif_receive_skb let's just
      drop the incoming packets.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
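The idea of dropping packets for a namespace that is being torn down can be modelled in user space. The sketch below is only an illustration: `toy_net`, `toy_dev`, `toy_pkt` and `toy_receive` are hypothetical names, not the kernel's structures or the code of this commit.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for a namespace, a device and a packet. */
struct toy_net { bool alive; };            /* false once cleanup has started */
struct toy_dev { struct toy_net *net; };   /* device belongs to one namespace */
struct toy_pkt { struct toy_dev *dev; };   /* packet arrived on this device */

enum toy_verdict { TOY_DROP = 0, TOY_DELIVER = 1 };

/* Decide early in the receive path whether the packet may be processed. */
enum toy_verdict toy_receive(const struct toy_pkt *pkt)
{
    /* A dead namespace may already have freed per-namespace state
     * (for example its ICMP sockets), so do not process the packet. */
    if (!pkt->dev->net->alive)
        return TOY_DROP;
    return TOY_DELIVER;
}
```

A single test at the entry point avoids adding a separate guard to every protocol handler that touches per-namespace state.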
    • D
      sctp: Make sure N * sizeof(union sctp_addr) does not overflow. · 735ce972
      David S. Miller committed
      As noticed by Gabriel Campana, the kmalloc() length arg
      passed in by sctp_getsockopt_local_addrs_old() can overflow
      if ->addr_num is large enough.
      
      Therefore, enforce an appropriate limit.
      Signed-off-by: David S. Miller <davem@davemloft.net>
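A length computed as N * sizeof(element) should be bounded before the multiplication. The following is a minimal user-space sketch of that pattern; `union toy_addr`, `toy_alloc_len` and the `INT_MAX`-based bound are assumptions made for illustration, not the code of this commit.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for union sctp_addr; only its size matters here. */
union toy_addr {
    unsigned char v4[16];
    unsigned char v6[28];
};

/* Compute addr_num * sizeof(union toy_addr) only if it cannot overflow.
 * Returns false (and leaves *len untouched) for non-positive or too-large
 * counts. */
bool toy_alloc_len(int addr_num, size_t *len)
{
    if (addr_num <= 0)
        return false;
    /* Reject counts whose byte length would exceed INT_MAX (assumed limit). */
    if ((size_t)addr_num > INT_MAX / sizeof(union toy_addr))
        return false;
    *len = (size_t)addr_num * sizeof(union toy_addr);
    return true;
}
```

Dividing the limit by the element size keeps the check itself free of overflow.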
    • S
      pppoe: warning fix · 2645a3c3
      Stephen Hemminger committed
      Fix warning:
      drivers/net/pppoe.c: In function 'pppoe_recvmsg':
      drivers/net/pppoe.c:945: warning: comparison of distinct pointer types lacks a cast
      because skb->len is unsigned int and total_len is size_t
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
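The kernel's type-checking min() macro warns when its two arguments have different types, as here with unsigned int and size_t. One common remedy is to do the comparison in a single explicit type. The sketch below shows that idea in plain C; `toy_copy_len` is a hypothetical helper and may not match what the patch itself does.

```c
#include <stddef.h>

/* Return the smaller of a packet length (unsigned int) and a buffer
 * length (size_t), comparing both as size_t. */
size_t toy_copy_len(unsigned int skb_len, size_t total_len)
{
    size_t have = (size_t)skb_len;   /* widen explicitly before comparing */

    return have < total_len ? have : total_len;
}
```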
  2. 20 Jun, 2008 2 commits
  3. 19 Jun, 2008 1 commit
  4. 18 Jun, 2008 18 commits
    • P
      netlink: genl: fix circular locking · 6d1a3fb5
      Patrick McHardy committed
      genetlink has a circular locking dependency when dumping the registered
      families:
      
      - dump start:
      genl_rcv()            : take genl_mutex
      genl_rcv_msg()        : call netlink_dump_start() while holding genl_mutex
      netlink_dump_start(),
      netlink_dump()        : take nlk->cb_mutex
      ctrl_dumpfamily()     : try to detect this case and not take genl_mutex a
                              second time
      
      - dump continuance:
      netlink_rcv()         : call netlink_dump
      netlink_dump          : take nlk->cb_mutex
      ctrl_dumpfamily()     : take genl_mutex
      
      Register genl_lock as callback mutex with netlink to fix this. This slightly
      widens an already existing module unload race: the genl ops used during the
      dump might go away when the module is unloaded. Thomas Graf is working on a
      separate fix for this.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Revert "mac80211: Use skb_header_cloned() on TX path." · 3a5be7d4
      David S. Miller committed
      This reverts commit 608961a5.
      
      The problem is that the mac80211 stack not only needs to be able to
      muck with the link-level headers, it also might need to mangle all of
      the packet data if doing sw wireless encryption.
      
      This fixes kernel bugzilla #10903.  Thanks to Didier Raboud (for the
      bugzilla report), Andrew Prince (for bisecting), Johannes Berg (for
      bringing this bisection analysis to my attention), and Ilpo (for
      trying to analyze this purely from the TCP side).
      
      In 2.6.27 we can take another stab at this, by using something like
      skb_cow_data() when the TX path of mac80211 ends up with a non-NULL
      tx->key.  The ESP protocol code in the IPSEC stack can be used as a
      model for implementation.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • R
      af_unix: fix 'poll for write'/ connected DGRAM sockets · 3c73419c
      Rainer Weikusat committed
      The unix_dgram_sendmsg routine implements a (somewhat crude)
      form of receiver-imposed flow control by comparing the length of the
      receive queue of the 'peer socket' with the max_ack_backlog value
      stored in the corresponding sock structure, either blocking
      the thread which caused the send-routine to be called or returning
      EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET
      sockets. The poll-implementation for these socket types is
      datagram_poll from core/datagram.c. A socket is deemed to be writeable
      by this routine when the memory presently consumed by datagrams
      owned by it is less than the configured socket send buffer size. This
      is always wrong for connected PF_UNIX non-stream sockets when the
      abovementioned receive queue is currently considered to be full.
      'poll' will then return, indicating that the socket is writeable, but
      a subsequent write results in EAGAIN, effectively causing a
      (typical) application to 'poll for writeability by repeated send requests
      with O_NONBLOCK set' until it has consumed its time quantum.
      
      The change below uses a suitably modified variant of the datagram_poll
      routines for both types of PF_UNIX sockets, which tests if the
      recv-queue of the peer a socket is connected to is presently
      considered to be 'full' as part of the 'is this socket
      writeable'-checking code. The socket being polled is additionally
      put onto the peer_wait wait queue associated with its peer, because the
      unix_dgram_sendmsg routine does a wake up on this queue after a
      datagram was received and the 'other wakeup call' is done implicitly
      as part of skb destruction, meaning, a process blocked in poll
      because of a full peer receive queue could otherwise sleep forever
      if no datagram owned by its socket was already sitting on this queue.
      Included in this change is a small (inline) helper routine named
      'unix_recvq_full', which consolidates the actual testing code (in three
      different places) into a single location.
      Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
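The writeability rule described above can be sketched in user space as follows. `toy_sock`, `toy_recvq_full` and `toy_dgram_writable` are hypothetical names, and the exact thresholds are assumptions taken from the prose of the commit message rather than from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a datagram socket. */
struct toy_sock {
    size_t wmem_alloc;            /* memory held by datagrams it has sent */
    size_t sndbuf;                /* configured send buffer size */
    unsigned recvq_len;           /* datagrams waiting in its receive queue */
    unsigned max_ack_backlog;     /* receiver-imposed queue limit */
    struct toy_sock *peer;        /* connected peer, or NULL */
};

/* One place for the "is the receive queue full" test. */
bool toy_recvq_full(const struct toy_sock *sk)
{
    return sk->recvq_len > sk->max_ack_backlog;
}

/* Writable only if the send buffer has room AND the connected peer's
 * receive queue is not considered full. */
bool toy_dgram_writable(const struct toy_sock *sk)
{
    if (sk->wmem_alloc >= sk->sndbuf)
        return false;
    if (sk->peer != NULL && toy_recvq_full(sk->peer))
        return false;
    return true;
}
```

With the peer check in place, poll no longer reports a socket as writeable when the next send would fail with EAGAIN because of a full peer queue.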
    • D
    • A
      tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set · f09f7ee2
      Ang Way Chuang committed
      By default, tun.c running in TUN_TUN_DEV mode will set the protocol of
      packet to IPv4 if TUN_NO_PI is set. My program failed to work when I
      assumed that the driver will check the first nibble of packet,
      determine IP version and set the appropriate protocol.
      Signed-off-by: Ang Way Chuang <wcang@nav6.org>
      Acked-by: Max Krasnyansky <maxk@qualcomm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
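Checking the first nibble of a raw IP packet to pick the protocol can be sketched as below. The `TOY_ETH_P_*` constants mirror the well-known EtherType values for IPv4 and IPv6; `toy_proto_from_packet` is a hypothetical helper, not the driver's function.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_P_IP   0x0800   /* EtherType for IPv4 */
#define TOY_ETH_P_IPV6 0x86DD   /* EtherType for IPv6 */

/* Inspect the IP version field (high nibble of the first byte) and return
 * the matching EtherType, or -1 if the packet is empty or not IPv4/IPv6. */
int toy_proto_from_packet(const uint8_t *pkt, size_t len)
{
    if (len == 0)
        return -1;

    switch (pkt[0] >> 4) {
    case 4:
        return TOY_ETH_P_IP;
    case 6:
        return TOY_ETH_P_IPV6;
    default:
        return -1;
    }
}
```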
    • R
      atl1: relax eeprom mac address error check · 58c7821c
      Radu Cristescu committed
      The atl1 driver tries to determine the MAC address thusly:
      
      	- If an EEPROM exists, read the MAC address from EEPROM and
      	  validate it.
      	- If an EEPROM doesn't exist, try to read a MAC address from
      	  SPI flash.
      	- If that fails, try to read a MAC address directly from the
      	  MAC Station Address register.
      	- If that fails, assign a random MAC address provided by the
      	  kernel.
      
      We now have a report of a system fitted with an EEPROM containing all
      zeros where we expect the MAC address to be, and we currently handle
      this as an error condition.  Turns out, on this system the BIOS writes
      a valid MAC address to the NIC's MAC Station Address register, but we
      never try to read it because we return an error when we find the all-
      zeros address in EEPROM.
      
      This patch relaxes the error check and continues looking for a MAC
      address even if it finds an illegal one in EEPROM.
      Signed-off-by: Radu Cristescu <advantis@gmx.net>
      Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
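The relaxed lookup amounts to "try each source in order and keep going past an invalid address". Below is a minimal sketch under that reading. `toy_mac_valid` and `toy_pick_mac` are hypothetical names, and the validity rule used (not all zeros, not multicast) is an assumption for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAC_LEN 6

/* Treat an address as usable if it is neither all zeros nor multicast. */
bool toy_mac_valid(const uint8_t mac[TOY_MAC_LEN])
{
    static const uint8_t zero[TOY_MAC_LEN] = { 0 };

    if (mac[0] & 0x01)                       /* multicast/broadcast bit */
        return false;
    return memcmp(mac, zero, TOY_MAC_LEN) != 0;
}

/* Walk the candidate sources in priority order (for example EEPROM, SPI
 * flash, station address register). A NULL entry means the source is
 * absent. Returns the index of the source used, or -1 if none was valid,
 * in which case the caller would fall back to a random address. */
int toy_pick_mac(const uint8_t *sources[], int n, uint8_t out[TOY_MAC_LEN])
{
    for (int i = 0; i < n; i++) {
        if (sources[i] != NULL && toy_mac_valid(sources[i])) {
            memcpy(out, sources[i], TOY_MAC_LEN);
            return i;
        }
    }
    return -1;
}
```

An all-zeros EEPROM entry is skipped rather than treated as fatal, so a valid address in a later source is still found.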
    • D
      net/enc28j60: low power mode · 7dac6f8d
      David Brownell committed
      Keep enc28j60 chips in low-power mode when they're not in use.
      At typically 120 mA, these chips run hot even when idle; this
      low power mode cuts that power usage by a factor of around 100.
      
      This version provides a generic routine to poll a register until
      its masked value equals some value ... e.g. bit set or cleared.
      It's basically what the previous wait_phy_ready() did, but this
      version is generalized to support the handshaking needed to
      enter and exit low power mode.
      Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
      Signed-off-by: Claudio Lanconelli <lanconelli.claudio@eptar.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
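A generic "poll a register until its masked value equals some value" routine can be sketched as below. This is a user-space illustration, not the driver's code: `toy_poll_ready`, the read callback and the fake register `toy_fake_read` are all hypothetical, and a real driver would also sleep or delay between reads.

```c
#include <stdint.h>

/* Callback that reads the register; ctx is passed through unchanged. */
typedef uint8_t (*toy_read_fn)(void *ctx);

/* Poll until (register & mask) == val, giving up after max_tries reads.
 * Returns 0 on success and -1 on timeout. */
int toy_poll_ready(toy_read_fn read_reg, void *ctx,
                   uint8_t mask, uint8_t val, unsigned max_tries)
{
    for (unsigned i = 0; i < max_tries; i++) {
        if ((read_reg(ctx) & mask) == val)
            return 0;
    }
    return -1;
}

/* Fake register for demonstration: reads 0x00 a fixed number of times,
 * then reads 0x01 from then on. */
struct toy_fake { unsigned reads_until_set; };

uint8_t toy_fake_read(void *ctx)
{
    struct toy_fake *f = ctx;

    if (f->reads_until_set == 0)
        return 0x01;
    f->reads_until_set--;
    return 0x00;
}
```

Passing the mask and expected value as parameters lets one routine wait for a bit to become set (`val == mask`) or cleared (`val == 0`).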
    • D
      net/enc28j60: section fix · 6fd65882
      David Brownell committed
      Minor bugfixes to the enc28j60 driver ... wrong section marking,
      indentation, and bogus use of spi_bus_type.
      Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
      Acked-by: Claudio Lanconelli <lanconelli.claudio@eptar.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • S
      sky2: 88E8040T pci device id · a3b4fced
      Stephen Hemminger committed
      Missed one pci id for 88E8040T.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • D
      netxen: download firmware in pci probe · 439b454e
      Dhananjay Phadke committed
      Downloading firmware in pci probe allows recovery in case of
      firmware failure by reloading the driver.
      
      Also reduced delays in firmware load.
      Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • D
      netxen: cleanup debug messages · dcd56fdb
      Dhananjay Phadke committed
      o Remove unnecessary debug prints and functions.
      o Explicitly specify pci class (0x020000) to avoid enabling
        management function.
      Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • D
      netxen: remove global physical_port array · 3276fbad
      Dhananjay Phadke committed
      Store physical port number in netxen_adapter structure.
      Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • D
      netxen: fix portnum for hp mezz cards · dc515f2e
      Dhananjay Phadke committed
      This fixes the issue where the logical port number is set incorrectly
      for HP blade mezz cards.
      Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • J
      ibm_newemac: select CRC32 in Kconfig · 8b8091fb
      Josh Boyer committed
      The ibm_newemac driver requires ether_crc to be defined.  Apparently it is
      possible to generate a .config without CONFIG_CRC32 set which causes the
      following link errors if IBM_NEW_EMAC is selected:
      
        LD      .tmp_vmlinux1
      drivers/built-in.o: In function `emac_hash_mc':
      core.c:(.text+0x2f524): undefined reference to `crc32_le'
      core.c:(.text+0x2f528): undefined reference to `bitrev32'
      make: *** [.tmp_vmlinux1] Error 1
      
      This patch has IBM_NEW_EMAC select CRC32 so we don't hit this error.
      Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
    • S
      xfrm: fix fragmentation for ipv4 xfrm tunnel · fe833fca
      Steffen Klassert committed
      When generating the ip header for the transformed packet we just copy
      the frag_off field of the ip header from the original packet to the ip
      header of the new generated packet. If we receive a packet as a chain
      of fragments, all but the last of the new generated packets have the
      IP_MF flag set. We have to mask the frag_off field to only keep the
      IP_DF flag from the original packet. This got lost with git commit
      36cf9acf ("[IPSEC]: Separate inner/outer mode processing on output").
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
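Masking the fragment field so that only the don't-fragment bit survives can be illustrated as follows. The flag values are the standard IPv4 ones, shown here in host byte order for simplicity (the on-wire field is big-endian); `toy_outer_frag_off` is a hypothetical helper, not the function changed by this commit.

```c
#include <stdint.h>

#define TOY_IP_DF     0x4000   /* don't fragment */
#define TOY_IP_MF     0x2000   /* more fragments */
#define TOY_IP_OFFSET 0x1FFF   /* fragment offset, in 8-byte units */

/* Derive the outer header's frag_off from the inner one: keep IP_DF,
 * drop IP_MF and the fragment offset, since the newly generated outer
 * packet is not itself a fragment. */
uint16_t toy_outer_frag_off(uint16_t inner_frag_off)
{
    return inner_frag_off & TOY_IP_DF;
}
```

Copying the field unmasked would mark every outer packet built from a non-final inner fragment as having more fragments to follow.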
    • P
      netfilter: nf_conntrack_h323: fix module unload crash · a56b8f81
      Patrick McHardy committed
      The H.245 helper is not registered/unregistered, but assigned to
      connections manually from the Q.931 helper. This means on unload
      existing expectations and connections using the helper are not
      cleaned up, leading to the following oops on module unload:
      
      CPU 0 Unable to handle kernel paging request at virtual address c00a6828, epc == 802224dc, ra == 801d4e7c
      Oops[#1]:
      Cpu 0
      $ 0   : 00000000 00000000 00000004 c00a67f0
      $ 4   : 802a5ad0 81657e00 00000000 00000000
      $ 8   : 00000008 801461c8 00000000 80570050
      $12   : 819b0280 819b04b0 00000006 00000000
      $16   : 802a5a60 80000000 80b46000 80321010
      $20   : 00000000 00000004 802a5ad0 00000001
      $24   : 00000000 802257a8
      $28   : 802a4000 802a59e8 00000004 801d4e7c
      Hi    : 0000000b
      Lo    : 00506320
      epc   : 802224dc ip_conntrack_help+0x38/0x74     Tainted: P
      ra    : 801d4e7c nf_iterate+0xbc/0x130
      Status: 1000f403    KERNEL EXL IE
      Cause : 00800008
      BadVA : c00a6828
      PrId  : 00019374
      Modules linked in: ip_nat_pptp ip_conntrack_pptp ath_pktlog wlan_acl wlan_wep wlan_tkip wlan_ccmp wlan_xauth ath_pci ath_dev ath_dfs ath_rate_atheros wlan ath_hal ip_nat_tftp ip_conntrack_tftp ip_nat_ftp ip_conntrack_ftp pppoe ppp_async ppp_deflate ppp_mppe pppox ppp_generic slhc
      Process swapper (pid: 0, threadinfo=802a4000, task=802a6000)
      Stack : 801e7d98 00000004 802a5a60 80000000 801d4e7c 801d4e7c 802a5ad0 00000004
              00000000 00000000 801e7d98 00000000 00000004 802a5ad0 00000000 00000010
              801e7d98 80b46000 802a5a60 80320000 80000000 801d4f8c 802a5b00 00000002
              80063834 00000000 80b46000 802a5a60 801e7d98 80000000 802ba854 00000000
              81a02180 80b7e260 81a021b0 819b0000 819b0000 80570056 00000000 00000001
              ...
      Call Trace:
       [<801e7d98>] ip_finish_output+0x0/0x23c
       [<801d4e7c>] nf_iterate+0xbc/0x130
       [<801d4e7c>] nf_iterate+0xbc/0x130
       [<801e7d98>] ip_finish_output+0x0/0x23c
       [<801e7d98>] ip_finish_output+0x0/0x23c
       [<801d4f8c>] nf_hook_slow+0x9c/0x1a4
      
      One way to fix this would be to split helper cleanup from the unregistration
      function and invoke it for the H.245 helper, but since ctnetlink needs to be
      able to find the helper for synchronization purposes, a better fix is to
      register it normally and make sure it's not assigned to connections during
      helper lookup. The missing l3num initialization is enough for this; this
      patch changes it to use AF_UNSPEC to make it more explicit, though.
      Reported-by: liannan <liannan@twsz.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      netfilter: nf_conntrack_h323: fix memory leak in module initialization error path · 8a548868
      Patrick McHardy committed
      Properly free h323_buffer when helper registration fails.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      netfilter: nf_nat: fix RCU races · 68b80f11
      Patrick McHardy committed
      Fix three ct_extend/NAT extension related races:
      
      - When cleaning up the extension area and removing it from the bysource hash,
        the nat->ct pointer must not be set to NULL since it may still be used in
        a RCU read side
      
      - When replacing a NAT extension area in the bysource hash, the nat->ct
        pointer must be assigned before performing the replacement
      
      - When reallocating extension storage in ct_extend, the old memory must
        not be freed immediately since it may still be used by a RCU read side
      
      Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315
      and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 17 Jun, 2008 16 commits