1. 20 11月, 2012 1 次提交
  2. 10 10月, 2012 1 次提交
    • J
      RDS: fix rds-ping spinlock recursion · 5175a5e7
      jeff.liu 提交于
      This is the revised patch for fixing rds-ping spinlock recursion
      according to Venkat's suggestions.
      
      RDS ping/pong over TCP feature has been broken for years(2.6.39 to
      3.6.0) since we have to set TCP cork and call kernel_sendmsg() between
      ping/pong which both need to lock "struct sock *sk". However, this
      lock has already been hold before rds_tcp_data_ready() callback is
      triggerred. As a result, we always facing spinlock resursion which
      would resulting in system panic.
      
      Given that RDS ping is only used to test the connectivity and not for
      serious performance measurements, we can queue the pong transmit to
      rds_wq as a delayed response.
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      CC: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: James Morris <james.l.morris@oracle.com>
      Signed-off-by: NJie Liu <jeff.liu@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5175a5e7
  3. 23 8月, 2012 1 次提交
  4. 23 7月, 2012 1 次提交
    • W
      rds: set correct msg_namelen · 06b6a1cf
      Weiping Pan 提交于
      Jay Fenlason (fenlason@redhat.com) found a bug,
      that recvfrom() on an RDS socket can return the contents of random kernel
      memory to userspace if it was called with a address length larger than
      sizeof(struct sockaddr_in).
      rds_recvmsg() also fails to set the addr_len paramater properly before
      returning, but that's just a bug.
      There are also a number of cases wher recvfrom() can return an entirely bogus
      address. Anything in rds_recvmsg() that returns a non-negative value but does
      not go through the "sin = (struct sockaddr_in *)msg->msg_name;" code path
      at the end of the while(1) loop will return up to 128 bytes of kernel memory
      to userspace.
      
      And I write two test programs to reproduce this bug, you will see that in
      rds_server, fromAddr will be overwritten and the following sock_fd will be
      destroyed.
      Yes, it is the programmer's fault to set msg_namelen incorrectly, but it is
      better to make the kernel copy the real length of address to user space in
      such case.
      
      How to run the test programs ?
      I test them on 32bit x86 system, 3.5.0-rc7.
      
      1 compile
      gcc -o rds_client rds_client.c
      gcc -o rds_server rds_server.c
      
      2 run ./rds_server on one console
      
      3 run ./rds_client on another console
      
      4 you will see something like:
      server is waiting to receive data...
      old socket fd=3
      server received data from client:data from client
      msg.msg_namelen=32
      new socket fd=-1067277685
      sendmsg()
      : Bad file descriptor
      
      /***************** rds_client.c ********************/
      
      int main(void)
      {
      	int sock_fd;
      	struct sockaddr_in serverAddr;
      	struct sockaddr_in toAddr;
      	char recvBuffer[128] = "data from client";
      	struct msghdr msg;
      	struct iovec iov;
      
      	sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
      	if (sock_fd < 0) {
      		perror("create socket error\n");
      		exit(1);
      	}
      
      	memset(&serverAddr, 0, sizeof(serverAddr));
      	serverAddr.sin_family = AF_INET;
      	serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
      	serverAddr.sin_port = htons(4001);
      
      	if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
      		perror("bind() error\n");
      		close(sock_fd);
      		exit(1);
      	}
      
      	memset(&toAddr, 0, sizeof(toAddr));
      	toAddr.sin_family = AF_INET;
      	toAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
      	toAddr.sin_port = htons(4000);
      	msg.msg_name = &toAddr;
      	msg.msg_namelen = sizeof(toAddr);
      	msg.msg_iov = &iov;
      	msg.msg_iovlen = 1;
      	msg.msg_iov->iov_base = recvBuffer;
      	msg.msg_iov->iov_len = strlen(recvBuffer) + 1;
      	msg.msg_control = 0;
      	msg.msg_controllen = 0;
      	msg.msg_flags = 0;
      
      	if (sendmsg(sock_fd, &msg, 0) == -1) {
      		perror("sendto() error\n");
      		close(sock_fd);
      		exit(1);
      	}
      
      	printf("client send data:%s\n", recvBuffer);
      
      	memset(recvBuffer, '\0', 128);
      
      	msg.msg_name = &toAddr;
      	msg.msg_namelen = sizeof(toAddr);
      	msg.msg_iov = &iov;
      	msg.msg_iovlen = 1;
      	msg.msg_iov->iov_base = recvBuffer;
      	msg.msg_iov->iov_len = 128;
      	msg.msg_control = 0;
      	msg.msg_controllen = 0;
      	msg.msg_flags = 0;
      	if (recvmsg(sock_fd, &msg, 0) == -1) {
      		perror("recvmsg() error\n");
      		close(sock_fd);
      		exit(1);
      	}
      
      	printf("receive data from server:%s\n", recvBuffer);
      
      	close(sock_fd);
      
      	return 0;
      }
      
      /***************** rds_server.c ********************/
      
      int main(void)
      {
      	struct sockaddr_in fromAddr;
      	int sock_fd;
      	struct sockaddr_in serverAddr;
      	unsigned int addrLen;
      	char recvBuffer[128];
      	struct msghdr msg;
      	struct iovec iov;
      
      	sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
      	if(sock_fd < 0) {
      		perror("create socket error\n");
      		exit(0);
      	}
      
      	memset(&serverAddr, 0, sizeof(serverAddr));
      	serverAddr.sin_family = AF_INET;
      	serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
      	serverAddr.sin_port = htons(4000);
      	if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
      		perror("bind error\n");
      		close(sock_fd);
      		exit(1);
      	}
      
      	printf("server is waiting to receive data...\n");
      	msg.msg_name = &fromAddr;
      
      	/*
      	 * I add 16 to sizeof(fromAddr), ie 32,
      	 * and pay attention to the definition of fromAddr,
      	 * recvmsg() will overwrite sock_fd,
      	 * since kernel will copy 32 bytes to userspace.
      	 *
      	 * If you just use sizeof(fromAddr), it works fine.
      	 * */
      	msg.msg_namelen = sizeof(fromAddr) + 16;
      	/* msg.msg_namelen = sizeof(fromAddr); */
      	msg.msg_iov = &iov;
      	msg.msg_iovlen = 1;
      	msg.msg_iov->iov_base = recvBuffer;
      	msg.msg_iov->iov_len = 128;
      	msg.msg_control = 0;
      	msg.msg_controllen = 0;
      	msg.msg_flags = 0;
      
      	while (1) {
      		printf("old socket fd=%d\n", sock_fd);
      		if (recvmsg(sock_fd, &msg, 0) == -1) {
      			perror("recvmsg() error\n");
      			close(sock_fd);
      			exit(1);
      		}
      		printf("server received data from client:%s\n", recvBuffer);
      		printf("msg.msg_namelen=%d\n", msg.msg_namelen);
      		printf("new socket fd=%d\n", sock_fd);
      		strcat(recvBuffer, "--data from server");
      		if (sendmsg(sock_fd, &msg, 0) == -1) {
      			perror("sendmsg()\n");
      			close(sock_fd);
      			exit(1);
      		}
      	}
      
      	close(sock_fd);
      	return 0;
      }
      Signed-off-by: NWeiping Pan <wpan@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      06b6a1cf
  5. 11 7月, 2012 1 次提交
  6. 30 5月, 2012 1 次提交
  7. 22 4月, 2012 1 次提交
  8. 21 4月, 2012 2 次提交
  9. 23 3月, 2012 1 次提交
  10. 21 3月, 2012 1 次提交
  11. 20 3月, 2012 1 次提交
  12. 10 2月, 2012 1 次提交
  13. 25 1月, 2012 1 次提交
  14. 13 1月, 2012 1 次提交
  15. 14 11月, 2011 1 次提交
    • P
      rds: drop "select LLIST" · 77c1c7c4
      Paul Bolle 提交于
      Commit 1bc144b6 ("net, rds, Replace xlist in net/rds/xlist.h with
      llist") added "select LLIST" to the RDS_RDMA Kconfig entry. But there is
      no Kconfig symbol named LLIST. The select statement for that symbol is a
      nop. Drop it.
      
      lib/llist.o is builtin, so all that's needed to use the llist
      functionality is to include linux/llist.h, which this commit also did.
      Signed-off-by: NPaul Bolle <pebolle@tiscali.nl>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      77c1c7c4
  16. 01 11月, 2011 4 次提交
  17. 30 9月, 2011 1 次提交
  18. 16 9月, 2011 1 次提交
  19. 15 9月, 2011 1 次提交
  20. 26 7月, 2011 1 次提交
    • A
      notifiers: cpu: move cpu notifiers into cpu.h · 80f1ff97
      Amerigo Wang 提交于
      We presently define all kinds of notifiers in notifier.h.  This is not
      necessary at all, since different subsystems use different notifiers, they
      are almost non-related with each other.
      
      This can also save much build time.  Suppose I add a new netdevice event,
      really I don't have to recompile all the source, just network related.
      Without this patch, all the source will be recompiled.
      
      I move the notify events near to their subsystem notifier registers, so
      that they can be found more easily.
      
      This patch:
      
      It is not necessary to share the same notifier.h.
      Signed-off-by: NWANG Cong <amwang@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      80f1ff97
  21. 02 7月, 2011 1 次提交
  22. 17 6月, 2011 1 次提交
  23. 07 6月, 2011 1 次提交
  24. 26 5月, 2011 1 次提交
    • S
      RDMA/cma: Pass QP type into rdma_create_id() · b26f9b99
      Sean Hefty 提交于
      The RDMA CM currently infers the QP type from the port space selected
      by the user.  In the future (eg with RDMA_PS_IB or XRC), there may not
      be a 1-1 correspondence between port space and QP type.  For netlink
      export of RDMA CM state, we want to export the QP type to userspace,
      so it is cleaner to explicitly associate a QP type to an ID.
      
      Modify rdma_create_id() to allow the user to specify the QP type, and
      use it to make our selections of datagram versus connected mode.
      Signed-off-by: NSean Hefty <sean.hefty@intel.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      b26f9b99
  25. 31 3月, 2011 1 次提交
  26. 24 3月, 2011 2 次提交
  27. 09 3月, 2011 1 次提交
    • N
      rds: prevent BUG_ON triggering on congestion map updates · 6094628b
      Neil Horman 提交于
      Recently had this bug halt reported to me:
      
      kernel BUG at net/rds/send.c:329!
      Oops: Exception in kernel mode, sig: 5 [#1]
      SMP NR_CPUS=1024 NUMA pSeries
      Modules linked in: rds sunrpc ipv6 dm_mirror dm_region_hash dm_log ibmveth sg
      ext4 jbd2 mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt
      dm_mod [last unloaded: scsi_wait_scan]
      NIP: d000000003ca68f4 LR: d000000003ca67fc CTR: d000000003ca8770
      REGS: c000000175cab980 TRAP: 0700   Not tainted  (2.6.32-118.el6.ppc64)
      MSR: 8000000000029032 <EE,ME,CE,IR,DR>  CR: 44000022  XER: 00000000
      TASK = c00000017586ec90[1896] 'krdsd' THREAD: c000000175ca8000 CPU: 0
      GPR00: 0000000000000150 c000000175cabc00 d000000003cb7340 0000000000002030
      GPR04: ffffffffffffffff 0000000000000030 0000000000000000 0000000000000030
      GPR08: 0000000000000001 0000000000000001 c0000001756b1e30 0000000000010000
      GPR12: d000000003caac90 c000000000fa2500 c0000001742b2858 c0000001742b2a00
      GPR16: c0000001742b2a08 c0000001742b2820 0000000000000001 0000000000000001
      GPR20: 0000000000000040 c0000001742b2814 c000000175cabc70 0800000000000000
      GPR24: 0000000000000004 0200000000000000 0000000000000000 c0000001742b2860
      GPR28: 0000000000000000 c0000001756b1c80 d000000003cb68e8 c0000001742b27b8
      NIP [d000000003ca68f4] .rds_send_xmit+0x4c4/0x8a0 [rds]
      LR [d000000003ca67fc] .rds_send_xmit+0x3cc/0x8a0 [rds]
      Call Trace:
      [c000000175cabc00] [d000000003ca67fc] .rds_send_xmit+0x3cc/0x8a0 [rds]
      (unreliable)
      [c000000175cabd30] [d000000003ca7e64] .rds_send_worker+0x54/0x100 [rds]
      [c000000175cabdb0] [c0000000000b475c] .worker_thread+0x1dc/0x3c0
      [c000000175cabed0] [c0000000000baa9c] .kthread+0xbc/0xd0
      [c000000175cabf90] [c000000000032114] .kernel_thread+0x54/0x70
      Instruction dump:
      4bfffd50 60000000 60000000 39080001 935f004c f91f0040 41820024 813d017c
      7d094a78 7d290074 7929d182 394a0020 <0b090000> 40e2ff68 4bffffa4 39200000
      Kernel panic - not syncing: Fatal exception
      Call Trace:
      [c000000175cab560] [c000000000012e04] .show_stack+0x74/0x1c0 (unreliable)
      [c000000175cab610] [c0000000005a365c] .panic+0x80/0x1b4
      [c000000175cab6a0] [c00000000002fbcc] .die+0x21c/0x2a0
      [c000000175cab750] [c000000000030000] ._exception+0x110/0x220
      [c000000175cab910] [c000000000004b9c] program_check_common+0x11c/0x180
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6094628b
  28. 01 2月, 2011 1 次提交
    • T
      rds/ib: use system_wq instead of rds_ib_fmr_wq · c534a107
      Tejun Heo 提交于
      With cmwq, there's no reason to use dedicated rds_ib_fmr_wq - it's not
      in the memory reclaim path and the maximum number of concurrent work
      items is bound by the number of devices.  Drop it and use system_wq
      instead.  This rds_ib_fmr_init/exit() noops.  Both removed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Andy Grover <andy.grover@oracle.com>
      c534a107
  29. 20 1月, 2011 1 次提交
  30. 23 11月, 2010 1 次提交
  31. 18 11月, 2010 1 次提交
    • D
      rds: Integer overflow in RDS cmsg handling · 218854af
      Dan Rosenberg 提交于
      In rds_cmsg_rdma_args(), the user-provided args->nr_local value is
      restricted to less than UINT_MAX.  This seems to need a tighter upper
      bound, since the calculation of total iov_size can overflow, resulting
      in a small sock_kmalloc() allocation.  This would probably just result
      in walking off the heap and crashing when calling rds_rdma_pages() with
      a high count value.  If it somehow doesn't crash here, then memory
      corruption could occur soon after.
      Signed-off-by: NDan Rosenberg <drosenberg@vsecurity.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      218854af
  32. 09 11月, 2010 1 次提交
  33. 04 11月, 2010 2 次提交
  34. 31 10月, 2010 1 次提交
    • A
      RDS: Let rds_message_alloc_sgs() return NULL · d139ff09
      Andy Grover 提交于
      Even with the previous fix, we still are reading the iovecs once
      to determine SGs needed, and then again later on. Preallocating
      space for sg lists as part of rds_message seemed like a good idea
      but it might be better to not do this. While working to redo that
      code, this patch attempts to protect against userspace rewriting
      the rds_iovec array between the first and second accesses.
      
      The consequences of this would be either a too-small or too-large
      sg list array. Too large is not an issue. This patch changes all
      callers of message_alloc_sgs to handle running out of preallocated
      sgs, and fail gracefully.
      Signed-off-by: NAndy Grover <andy.grover@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d139ff09