1. 28 Sep 2010 (5 commits)
  2. 27 Sep 2010 (5 commits)
  3. 25 Sep 2010 (2 commits)
    • net: fix a lockdep splat · f064af1e
      Committed by Eric Dumazet
      For each socket we have:
      
      One spinlock (sk_slock.slock)
      One rwlock (sk_callback_lock)
      
      Possible scenarios are:
      
      (A) (this is used in net/sunrpc/xprtsock.c)
      read_lock(&sk->sk_callback_lock) (without blocking BH)
      <BH>
      spin_lock(&sk->sk_slock.slock);
      ...
      read_lock(&sk->sk_callback_lock);
      ...
      
      (B)
      write_lock_bh(&sk->sk_callback_lock)
      stuff
      write_unlock_bh(&sk->sk_callback_lock)
      
      (C)
      spin_lock_bh(&sk->sk_slock)
      ...
      write_lock_bh(&sk->sk_callback_lock)
      stuff
      write_unlock_bh(&sk->sk_callback_lock)
      spin_unlock_bh(&sk->sk_slock)
      
      This (C) case conflicts with (A):
      
      CPU1 [A]                         CPU2 [C]
      read_lock(callback_lock)
      <BH>                             spin_lock_bh(slock)
      <wait to spin_lock(slock)>
                                       <wait to write_lock_bh(callback_lock)>
      
      We have one problematic (C) use case in inet_csk_listen_stop():
      
      local_bh_disable();
      bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock)
      WARN_ON(sock_owned_by_user(child));
      ...
      sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock)
      
      lockdep is not happy with this, as reported by Tetsuo Handa.
      
      It seems the only way to deal with this is to use
      read_lock_bh(sk_callback_lock) everywhere (a hedged sketch of the
      resulting reader pattern follows this entry).
      
      Thanks to Jarek for pointing out a bug in my first attempt and
      suggesting this solution.
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Jarek Poplawski <jarkao2@gmail.com>
      Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f064af1e
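      The commit message only states the direction of the fix, so here is a
      minimal, hedged sketch of what a sk_callback_lock reader looks like
      after such a change. The function name and the sk_user_data check are
      illustrative assumptions, not code from the actual patch.

      	#include <net/sock.h>

      	/*
      	 * Illustrative process-context reader of sk->sk_callback_lock.
      	 * Before the fix such readers used plain read_lock(); a softirq
      	 * could then interrupt this CPU while the read lock is held and
      	 * spin on sk_slock, while another CPU, already holding sk_slock,
      	 * spins on write_lock_bh(&sk->sk_callback_lock): the (A)/(C)
      	 * deadlock shown above. The _bh variants keep BH disabled for
      	 * the whole read-side critical section.
      	 */
      	static bool example_callback_reader(struct sock *sk)
      	{
      		bool has_user_data;

      		read_lock_bh(&sk->sk_callback_lock);	/* was: read_lock() */
      		has_user_data = sk->sk_user_data != NULL;
      		read_unlock_bh(&sk->sk_callback_lock);	/* was: read_unlock() */

      		return has_user_data;
      	}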
    • ip: take care of last fragment in ip_append_data · 59104f06
      Committed by Eric Dumazet
      While investigating a bit, I found that the ip_fragment() slow path
      was taken because ip_append_data() produces the following layout for
      a send(MTU + N*(MTU - 20)) syscall:
      
      - one skb with 1500 (mtu) bytes
      - N fragments of 1480 (mtu - 20) bytes (before adding the IP header)
      
      The last fragment gets 17 bytes of trailer data because of the
      following bit:
      
      	if (datalen == length + fraggap)
      		alloclen += rt->dst.trailer_len;
      
      Then esp4 adds 16 bytes of data (while trailer_len is 17... hmm,
      another bug?).
      
      In ip_fragment() we notice the last fragment is too big, (1496 + 20) > mtu,
      so we take the slow path and build another skb chain.
      
      To avoid the slow path, we should correct ip_append_data() to make
      sure the last fragment has real trailer space while staying under the
      MTU (the arithmetic is spelled out in the small program after this
      entry).
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      59104f06
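      A small userspace sketch of the arithmetic above, assuming the
      1500-byte Ethernet MTU, 20-byte IPv4 header, 17-byte trailer_len and
      16-byte esp4 trailer quoted in the commit message. It only shows why
      the last fragment overflows the MTU; it is not the actual
      ip_append_data() change.

      	#include <stdio.h>

      	int main(void)
      	{
      		int mtu = 1500;         /* link MTU */
      		int iph = 20;           /* IPv4 header size */
      		int trailer_len = 17;   /* space reserved via rt->dst.trailer_len */
      		int esp_trailer = 16;   /* what esp4 actually appends */

      		int payload = mtu - iph;                  /* 1480 per fragment */
      		int last_payload = payload + esp_trailer; /* 1496 */
      		int last_total = last_payload + iph;      /* 1516 > 1500 */

      		printf("regular fragment: %d + %d = %d (<= mtu %d)\n",
      		       payload, iph, payload + iph, mtu);
      		printf("last fragment:    %d + %d = %d (%s mtu %d)\n",
      		       last_payload, iph, last_total,
      		       last_total > mtu ? ">" : "<=", mtu);
      		printf("reserved trailer_len %d, esp4 used %d\n",
      		       trailer_len, esp_trailer);
      		return 0;
      	}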
  4. 24 Sep 2010 (1 commit)
  5. 23 Sep 2010 (9 commits)
  6. 22 Sep 2010 (9 commits)
  7. 21 Sep 2010 (9 commits)