1. 05 3月, 2011 1 次提交
    • E
      inetpeer: seqlock optimization · 65e8354e
      Eric Dumazet 提交于
      David noticed :
      
      ------------------
      Eric, I was profiling the non-routing-cache case and something that
      stuck out is the case of calling inet_getpeer() with create==0.
      
      If an entry is not found, we have to redo the lookup under a spinlock
      to make certain that a concurrent writer rebalancing the tree does
      not "hide" an existing entry from us.
      
      This makes the case of a create==0 lookup for a not-present entry
      really expensive.  It is on the order of 600 cpu cycles on my
      Niagara2.
      
      I added a hack to not do the relookup under the lock when create==0
      and it now costs less than 300 cycles.
      
      This is now a pretty common operation with the way we handle COW'd
      metrics, so I think it's definitely worth optimizing.
      -----------------
      
      One solution is to use a seqlock instead of a spinlock to protect struct
      inet_peer_base.
      
      After a failed avl tree lookup, we can easily detect if a writer did
      some changes during our lookup. Taking the lock and redo the lookup is
      only necessary in this case.
      
      Note: Add one private rcu_deref_locked() macro to place in one spot the
      access to spinlock included in seqlock.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      65e8354e
  2. 04 3月, 2011 7 次提交
    • D
      DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076] · 1362fa07
      David Howells 提交于
      When a DNS resolver key is instantiated with an error indication, attempts to
      read that key will result in an oops because user_read() is expecting there to
      be a payload - and there isn't one [CVE-2011-1076].
      
      Give the DNS resolver key its own read handler that returns the error cached in
      key->type_data.x[0] as an error rather than crashing.
      
      Also make the kenter() at the beginning of dns_resolver_instantiate() limit the
      amount of data it prints, since the data is not necessarily NUL-terminated.
      
      The buggy code was added in:
      
      	commit 4a2d7892
      	Author: Wang Lei <wang840925@gmail.com>
      	Date:   Wed Aug 11 09:37:58 2010 +0100
      	Subject: DNS: If the DNS server returns an error, allow that to be cached [ver #2]
      
      This can trivially be reproduced by any user with the following program
      compiled with -lkeyutils:
      
      	#include <stdlib.h>
      	#include <keyutils.h>
      	#include <err.h>
      	static char payload[] = "#dnserror=6";
      	int main()
      	{
      		key_serial_t key;
      		key = add_key("dns_resolver", "a", payload, sizeof(payload),
      			      KEY_SPEC_SESSION_KEYRING);
      		if (key == -1)
      			err(1, "add_key");
      		if (keyctl_read(key, NULL, 0) == -1)
      			err(1, "read_key");
      		return 0;
      	}
      
      What should happen is that keyctl_read() reports error 6 (ENXIO) to the user:
      
      	dns-break: read_key: No such device or address
      
      but instead the kernel oopses.
      
      This cannot be reproduced with the 'keyutils add' or 'keyutils padd' commands
      as both of those cut the data down below the NUL termination that must be
      included in the data.  Without this dns_resolver_instantiate() will return
      -EINVAL and the key will not be instantiated such that it can be read.
      
      The oops looks like:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
      IP: [<ffffffff811b99f7>] user_read+0x4f/0x8f
      PGD 3bdf8067 PUD 385b9067 PMD 0
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/irq
      CPU 0
      Modules linked in:
      
      Pid: 2150, comm: dns-break Not tainted 2.6.38-rc7-cachefs+ #468                  /DG965RY
      RIP: 0010:[<ffffffff811b99f7>]  [<ffffffff811b99f7>] user_read+0x4f/0x8f
      RSP: 0018:ffff88003bf47f08  EFLAGS: 00010246
      RAX: 0000000000000001 RBX: ffff88003b5ea378 RCX: ffffffff81972368
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003b5ea378
      RBP: ffff88003bf47f28 R08: ffff88003be56620 R09: 0000000000000000
      R10: 0000000000000395 R11: 0000000000000002 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffffffffa1
      FS:  00007feab5751700(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000010 CR3: 000000003de40000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process dns-break (pid: 2150, threadinfo ffff88003bf46000, task ffff88003be56090)
      Stack:
       ffff88003b5ea378 ffff88003b5ea3a0 0000000000000000 0000000000000000
       ffff88003bf47f68 ffffffff811b708e ffff88003c442bc8 0000000000000000
       00000000004005a0 00007fffba368060 0000000000000000 0000000000000000
      Call Trace:
       [<ffffffff811b708e>] keyctl_read_key+0xac/0xcf
       [<ffffffff811b7c07>] sys_keyctl+0x75/0xb6
       [<ffffffff81001f7b>] system_call_fastpath+0x16/0x1b
      Code: 75 1f 48 83 7b 28 00 75 18 c6 05 58 2b fb 00 01 be bb 00 00 00 48 c7 c7 76 1c 75 81 e8 13 c2 e9 ff 4c 8b b3 e0 00 00 00 4d 85 ed <41> 0f b7 5e 10 74 2d 4d 85 e4 74 28 e8 98 79 ee ff 49 39 dd 48
      RIP  [<ffffffff811b99f7>] user_read+0x4f/0x8f
       RSP <ffff88003bf47f08>
      CR2: 0000000000000010
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NJeff Layton <jlayton@redhat.com>
      cc: Wang Lei <wang840925@gmail.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      1362fa07
    • P
      netlink: kill eff_cap from struct netlink_skb_parms · 01a16b21
      Patrick McHardy 提交于
      Netlink message processing in the kernel is synchronous these days,
      capabilities can be checked directly in security_netlink_recv() from
      the current process.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Reviewed-by: NJames Morris <jmorris@namei.org>
      [chrisw: update to include pohmelfs and uvesafb]
      Signed-off-by: NChris Wright <chrisw@sous-sol.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      01a16b21
    • D
      ipv6: Use ERR_CAST in addrconf_dst_alloc. · 29546a64
      David S. Miller 提交于
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      29546a64
    • D
    • E
      net_sched: reduce fifo qdisc size · d276055c
      Eric Dumazet 提交于
      Because of various alignements [SLUB / qdisc], we use 512 bytes of
      memory for one {p|b}fifo qdisc, instead of 256 bytes on 64bit arches and
      192 bytes on 32bit ones.
      
      Move the "u32 limit" inside "struct Qdisc" (no impact on other qdiscs)
      
      Change qdisc_alloc(), first trying a regular allocation before an
      oversized one.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d276055c
    • P
      netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms · c53fa1ed
      Patrick McHardy 提交于
      Netlink message processing in the kernel is synchronous these days, the
      session information can be collected when needed.
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c53fa1ed
    • D
      ipv4: Fix crash in dst_release when udp_sendmsg route lookup fails. · 06dc94b1
      David S. Miller 提交于
      As reported by Eric:
      
      [11483.697233] IP: [<c12b0638>] dst_release+0x18/0x60
       ...
      [11483.697741] Call Trace:
      [11483.697764]  [<c12fc9d2>] udp_sendmsg+0x282/0x6e0
      [11483.697790]  [<c12a1c01>] ? memcpy_toiovec+0x51/0x70
      [11483.697818]  [<c12dbd90>] ? ip_generic_getfrag+0x0/0xb0
      
      The pointer passed to dst_release() is -EINVAL, that's because
      we leave an error pointer in the local variable "rt" by accident.
      
      NULL it out to fix the bug.
      Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      06dc94b1
  3. 03 3月, 2011 7 次提交
  4. 02 3月, 2011 24 次提交
  5. 01 3月, 2011 1 次提交