Submitting a Paddle cluster training job fails with socket.error: [Errno 104] Connection reset by peer
Created by: zhouksh
connect to receiver: [reciever-host]:8090
compressing thirdparty files
finished to pack request
starting to submit to server
Traceback (most recent call last):
  File "/home/zhoukunsheng/paddle/paddle_mpi/submit.py", line 303, in <module>
    no_prefix_train_args_dict, ).run()
  File "/home/zhoukunsheng/paddle/paddle_mpi/submit.py", line 77, in run
    self._do_poster_request()
  File "/home/zhoukunsheng/paddle/paddle_mpi/submit.py", line 168, in _do_poster_request
    connection = urllib2.urlopen(request)
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/site-packages/poster/streaminghttp.py", line 142, in http_open
    return self.do_open(StreamingHTTPConnection, req)
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/urllib2.py", line 1201, in do_open
    r = h.getresponse(buffering=True)
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/httplib.py", line 1121, in getresponse
    response.begin()
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/httplib.py", line 438, in begin
    version, status, reason = self._read_status()
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/httplib.py", line 394, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/home/zhoukunsheng/software/anaconda2/lib/python2.7/socket.py", line 480, in readline
    data = self._sock.recv(self._rbufsize)
socket.error: [Errno 104] Connection reset by peer