adam learning algorithm error
Created by: PseudoProgrammer
The adam learning method gets an error, but training works fine when I set adagrad instead.
adam settings:
Settings(
    algorithm='sgd',
    learning_rate=0.01,
    learning_method='adam',
    adam_beta1=0.9,
    adam_beta2=0.999,
    ada_epsilon=1e-6,
    ada_rou=0.95,
    batch_size=789,
    learning_rate_decay_a=0,
    learning_rate_decay_b=0,
    num_batches_per_send_parameter=1,
    num_batches_per_get_parameter=1,
)
adagrad settings:
Settings(
    algorithm='sgd',
    learning_rate=0.01,
    learning_method='adagrad',
    ada_epsilon=1e-6,
    ada_rou=0.95,
    batch_size=789,
    learning_rate_decay_a=0,
    learning_rate_decay_b=0,
    num_batches_per_send_parameter=1,
    num_batches_per_get_parameter=1,
)
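For reference, the same optimizer choices are usually expressed through the trainer_config_helpers settings() wrapper rather than a raw Settings() call. The sketch below is only an illustration and assumes the legacy paddle.trainer_config_helpers API (settings(), AdamOptimizer, AdaGradOptimizer); the helper names and constructor arguments are assumptions on my part, not taken from the job config above.

from paddle.trainer_config_helpers import *  # assumed legacy v1 helper API

# Hypothetical equivalent of the 'adam' Settings above via the settings() helper;
# argument names for the optimizer are assumed, not copied from the report.
settings(
    batch_size=789,
    learning_rate=0.01,
    learning_method=AdamOptimizer(beta1=0.9, beta2=0.999),
)

# For the 'adagrad' run, the learning_method would instead be AdaGradOptimizer().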
train.log:
Fri Dec 30 01:25:25 2016[1,17]<stderr>:+ ./paddle_trainer --num_gradient_servers=40 --trainer_id=17 --pservers=10.90.165.41,10.90.165.39,10.90.165.37,10.90.165.38,10.90.165.35,10.90.165.36,10.90.165.33,10.90.165.34,10.90.165.31,10.90.165.32,10.90.165.30,10.90.168.20,10.90.168.21,10.90.168.22,10.90.168.23,10.90.168.24,10.90.168.25,10.90.168.26,10.90.168.27,10.90.168.28,10.90.168.29,10.90.168.32,10.90.168.33,10.90.168.30,10.90.168.31,10.90.168.36,10.90.168.37,10.90.168.34,10.90.168.35,10.90.168.38,10.90.168.39,10.90.168.40,10.90.168.41,10.90.168.42,10.90.168.43,10.90.168.44,10.90.102.42,10.90.102.41,10.90.102.44,10.90.102.43 --rdma_tcp=tcp --nics=xgbe0 --saving_period=5 --port=7164 --ports_num=1 --local=0 --comment=_job.16597.instances --dot_period=1000 --log_period=1000 --num_passes=5000 --trainer_count=10 --load_missing_parameter_strategy=rand --config=conf/trainer_config.conf --save_dir=./output --python_path=./python-gcc345 --python_bin=python2.7 --use_gpu=0
Fri Dec 30 01:25:41 2016[1,18]<stderr>:F1230 01:25:41.543866 7156 BaseClient.cpp:25] Check failed: numPorts > 0 (0 vs. 0)
Fri Dec 30 01:25:41 2016[1,18]<stderr>:*** Check failure stack trace: ***
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x8d7788 google::LogMessage::Fail()
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x8d76e0 google::LogMessage::SendToLog()
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x8d7175 google::LogMessage::Flush()
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x8d9f36 google::LogMessageFatal::~LogMessageFatal()
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x766304 paddle::BaseClient::BaseClient()
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x76c909 paddle::ParameterClient2::ParameterClient2()
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x75538f paddle::SparseRemoteParameterUpdater::init()
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x74b437 _ZNSt6thread5_ImplISt12_Bind_resultIvFZN6paddle14SyncThreadPool5startEvEUliE_mEEE6_M_runEv
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x7fcc36ea0462 execute_native_thread_routine
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x7fcc37529d30 start_thread
Fri Dec 30 01:25:41 2016[1,18]<stderr>: @ 0x7fcc366f0afd clone