Commit 77ef7e01, author: kohsuke

Troubleshooting one of the production Hudson servers turned up a thread that hangs forever. Making the code defensive to avoid this problem.

"pool-10-thread-6 / waiting for hudson.remoting.Channel@39a5b0:winxpie7.sfbay.sun.com" Id=579896 WAITING on hudson.remoting.UserRequest@3b4670
	at java.lang.Object.wait(Native Method)
	-  waiting on hudson.remoting.UserRequest@3b4670
	at java.lang.Object.wait(Object.java:485)
	at hudson.remoting.Request.call(Request.java:116)
	at hudson.remoting.Channel.call(Channel.java:536)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:305)
	at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:220)
	at hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:180)
	-  locked hudson.plugins.sshslaves.SSHLauncher@985bc4
	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:175)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)

	Number of locked synchronizers = 1
	- java.util.concurrent.locks.ReentrantLock$NonfairSync@14f0686


git-svn-id: https://hudson.dev.java.net/svn/hudson/trunk/hudson/main@21665 71c3de6d-444a-0410-be80-ed276b4c234a
Parent 8a73c250
......@@ -631,6 +631,14 @@ public class Channel implements VirtualChannel, IChannel {
 wait();
 }
+/**
+* If the receiving end of the channel is closed (that is, if we are guaranteed to receive nothing further),
+* this method returns true.
+*/
+/*package*/ boolean isInClosed() {
+return inClosed;
+}
 /**
 * Waits for this {@link Channel} to be closed down, but only up the given milliseconds.
 *
......
......@@ -112,8 +112,15 @@ abstract class Request<RSP extends Serializable,EXC extends Throwable> extends C
 final String name = t.getName();
 try {
 t.setName(name+" / waiting for "+channel);
-while(response==null)
-wait(); // wait until the response arrives
+while(response==null && !channel.isInClosed())
+// I don't know exactly when this can happen, as pendingCalls are cleaned up by Channel,
+// but in production I've observed that in rare occasion it can block forever, even after a channel
+// is gone. So be defensive against that.
+wait(30*1000); // wait until the response arrives
+if (response==null)
+// channel is closed and we still don't have a response
+throw new RequestAbortedException(null);
 } catch (InterruptedException e) {
 // if we are cancelled, abort the remote computation, too
 channel.send(new Cancel(id));
......
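
The defensive loop in the hunk above can be tried in isolation. The following is a minimal sketch, not Hudson's code: `FakeChannel`, `PendingRequest`, `AbortedException`, and the helper methods are hypothetical stand-ins for `Channel`, `Request`, and `RequestAbortedException`, and the poll interval is shortened from 30 seconds to 50 ms so the demo finishes quickly. It assumes the failure mode is a channel that closes without ever notifying the waiting thread.

```java
class BoundedWaitSketch {

    /** Hypothetical stand-in for the channel: only the "receiving side closed" flag matters here. */
    static final class FakeChannel {
        private volatile boolean inClosed;
        boolean isInClosed() { return inClosed; }
        // Deliberately notifies nobody, to mimic the lost wake-up seen in production.
        void closeIn() { inClosed = true; }
    }

    /** Hypothetical stand-in for RequestAbortedException. */
    static final class AbortedException extends RuntimeException {
        AbortedException(String message) { super(message); }
    }

    /** Hypothetical stand-in for a request waiting on its response. */
    static final class PendingRequest {
        private final FakeChannel channel;
        private final long pollMillis;
        private Object response; // guarded by "this"

        PendingRequest(FakeChannel channel, long pollMillis) {
            this.channel = channel;
            this.pollMillis = pollMillis;
        }

        synchronized void onResponse(Object rsp) {
            response = rsp;
            notifyAll();
        }

        synchronized Object await() throws InterruptedException {
            // Bounded wait: wake up periodically and re-check whether the channel is gone.
            while (response == null && !channel.isInClosed()) {
                wait(pollMillis);
            }
            if (response == null) {
                // The channel is closed and no response arrived.
                throw new AbortedException("channel closed before a response arrived");
            }
            return response;
        }
    }

    /** Runs the action on a daemon thread after roughly 100 ms. */
    private static void runLater(Runnable action) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                action.run();
            }
        });
        t.setDaemon(true);
        t.start();
    }

    /** Returns true if the waiter gives up once the channel closes silently. */
    static boolean abortsWhenClosed() {
        FakeChannel channel = new FakeChannel();
        PendingRequest request = new PendingRequest(channel, 50);
        runLater(channel::closeIn);
        try {
            request.await();
            return false;
        } catch (AbortedException expected) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns the response delivered by another thread, or null if interrupted. */
    static Object returnsResponse() {
        FakeChannel channel = new FakeChannel();
        PendingRequest request = new PendingRequest(channel, 50);
        runLater(() -> request.onResponse("pong"));
        try {
            return request.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("aborted after silent close: " + abortsWhenClosed());
        System.out.println("response when delivered: " + returnsResponse());
    }
}
```

With an unbounded `wait()`, the first scenario would block forever because nothing ever calls `notifyAll()`. The timeout only bounds how long the waiter sleeps before it re-checks the closed flag, so a longer interval such as the commit's 30 seconds means fewer wake-ups at the cost of slower detection.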