FLAGS_conv_workspace_size_limit=1024 sets the workspace size limit for choosing cuDNN convolution algorithms to 1024 MB.
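These FLAGS_* values are typically read from the process environment when the framework initializes, so they must be set before import. A minimal sketch (assuming the environment-variable mechanism described here; the value is a plain string):

```python
import os

# Must be set before the framework that reads FLAGS_* is imported
# (assumption: flags are read at import/initialization time).
os.environ["FLAGS_conv_workspace_size_limit"] = "1024"  # in MB

print(os.environ["FLAGS_conv_workspace_size_limit"])
```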
cudnn_batchnorm_spatial_persistent
FLAGS_cudnn_batchnorm_spatial_persistent
*******************************************
(since 1.4.0)
...
...
@@ -37,7 +37,7 @@ Note
This mode can be faster in some tasks because an optimized path will be selected for the CUDNN_DATA_FLOAT and CUDNN_DATA_HALF data types. We set it to False by default because this mode may use scaled atomic integer reduction, which may cause numerical overflow for some input data ranges.
cudnn_deterministic
FLAGS_cudnn_deterministic
*******************************************
(since 0.13.0)
...
...
@@ -56,7 +56,7 @@ Note
Currently this flag takes effect in the cuDNN convolution and pooling operators. The deterministic algorithms may be slower, so this flag is generally used for debugging.
This flag is only for PaddlePaddle developers; users should not set it.
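A typical debugging setup combines this flag with a fixed RNG seed so a failing run can be reproduced exactly. A sketch (the helper below is hypothetical, not part of PaddlePaddle):

```python
import os
import random

def enable_deterministic_debug(seed=0):
    # Hypothetical helper: force deterministic cuDNN algorithm selection
    # (may be slower, per the note above) and fix the Python-side RNG.
    os.environ["FLAGS_cudnn_deterministic"] = "True"
    random.seed(seed)

enable_deterministic_debug()
print(os.environ["FLAGS_cudnn_deterministic"])
```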
communicator_independent_recv_thread
FLAGS_communicator_independent_recv_thread
**************************************
(since 1.5.0)
...
...
@@ -40,7 +40,7 @@ Note
This flag is for developers to debug and optimize the framework. Users should not set it.
communicator_max_merge_var_num
FLAGS_communicator_max_merge_var_num
**************************************
(since 1.5.0)
...
...
@@ -59,7 +59,7 @@ Note
This flag is closely related to the trainer thread number. The default value should be the same as the thread number.
communicator_merge_sparse_grad
FLAGS_communicator_merge_sparse_grad
*******************************
(since 1.5.0)
...
...
@@ -78,11 +78,11 @@ Note
Merging sparse gradients can be time-consuming. If a sparse gradient contains many duplicated ids, merging saves memory and can make communication much faster; otherwise it does not save memory.
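To illustrate why duplication matters, a sketch of what "merging" a sparse gradient means (hypothetical helper, not the communicator's actual code): rows sharing an id are summed into one row, so heavy duplication shrinks the payload.

```python
from collections import defaultdict

def merge_sparse_grad(ids, rows):
    # Sum all gradient rows that share the same id into a single row.
    merged = defaultdict(float)
    for i, row in zip(ids, rows):
        merged[i] += row
    return dict(merged)

# Many duplicated ids: 5 rows collapse to 2, saving memory and bandwidth.
print(merge_sparse_grad([3, 3, 3, 7, 7], [1.0, 1.0, 1.0, 2.0, 2.0]))
```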
communicator_min_send_grad_num_before_recv
FLAGS_communicator_min_send_grad_num_before_recv
*******************************************
(since 1.5.0)
In the communicator there is one send thread that sends gradients to the parameter server and one receive thread that receives parameters from it. They work independently. This flag controls the frequency of the receive thread: only after the send thread has sent at least communicator_min_send_grad_num_before_recv gradients will the receive thread fetch parameters from the parameter server.
In the communicator there is one send thread that sends gradients to the parameter server and one receive thread that receives parameters from it. They work independently. This flag controls the frequency of the receive thread: only after the send thread has sent at least FLAGS_communicator_min_send_grad_num_before_recv gradients will the receive thread fetch parameters from the parameter server.
Values accepted
---------------
...
...
@@ -97,7 +97,7 @@ Note
This flag is closely related to the trainer's training threads, because each training thread sends its gradient. So the default value should be the training thread number.
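Since the note above ties this flag to the training thread count, one way to keep them consistent is to derive both from a single value. A sketch (assuming the environment-variable mechanism; the helper is hypothetical):

```python
import os

def configure_communicator(trainer_threads):
    # Keep FLAGS_communicator_min_send_grad_num_before_recv in step with the
    # trainer thread count, as each training thread sends its own gradient.
    os.environ["FLAGS_communicator_min_send_grad_num_before_recv"] = str(trainer_threads)

configure_communicator(20)
print(os.environ["FLAGS_communicator_min_send_grad_num_before_recv"])
```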
communicator_send_queue_size
FLAGS_communicator_send_queue_size
*******************************************
(since 1.5.0)
...
...
@@ -116,7 +116,7 @@ Note
This flag affects training speed: a larger queue size may make training faster, but it may also make the result worse.
communicator_send_wait_times
FLAGS_communicator_send_wait_times
*******************************************
(since 1.5.0)
...
...
@@ -131,7 +131,7 @@ Example
FLAGS_communicator_send_wait_times=5 makes the send thread wait up to 5 times when the merged-gradient number does not reach max_merge_var_num.
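The policy described above can be sketched as a polling loop (a hypothetical simplification, not the communicator's real implementation): the send thread polls a bounded number of times, and sends a partial batch only if the merge count never reaches the threshold.

```python
def wait_for_merge(get_merged_count, max_merge_var_num, send_wait_times):
    # Poll up to send_wait_times times; send the full batch as soon as
    # enough gradients have been merged, otherwise send what we have.
    for _ in range(send_wait_times):
        if get_merged_count() >= max_merge_var_num:
            return "send_full_batch"
    return "send_partial_batch"

# The merge count reaches 3 on the third poll, within the 5 allowed waits.
counts = iter([1, 2, 3])
print(wait_for_merge(lambda: next(counts), 3, 5))  # → send_full_batch
```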
communicator_thread_pool_size
FLAGS_communicator_thread_pool_size
*******************************************
(since 1.5.0)
...
...
@@ -150,7 +150,7 @@ Note
Most of the time users do not need to set this flag.
dist_threadpool_size
FLAGS_dist_threadpool_size
*******************************************
(Since 1.0.0)
...
...
@@ -165,7 +165,7 @@ Example
FLAGS_dist_threadpool_size=10 sets the maximum number of threads used by the distributed module to 10.
rpc_deadline
FLAGS_rpc_deadline
*******************************************
(Since 1.0.0)
...
...
@@ -180,11 +180,11 @@ Example
FLAGS_rpc_deadline=180000 sets the deadline timeout to 3 minutes.
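The flag is in milliseconds, so a human-readable duration needs a small conversion (assumption: the unit is milliseconds, as the 180000 = 3 minutes example implies):

```python
def deadline_ms(minutes):
    # Convert minutes to the millisecond value FLAGS_rpc_deadline expects.
    return minutes * 60 * 1000

print(deadline_ms(3))  # → 180000
```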
rpc_disable_reuse_port
FLAGS_rpc_disable_reuse_port
*******************************************
(since 1.2.0)
When rpc_disable_reuse_port is true, the grpc option GRPC_ARG_ALLOW_REUSEPORT will be set to false to
When FLAGS_rpc_disable_reuse_port is true, the grpc option GRPC_ARG_ALLOW_REUSEPORT will be set to false to
disable the use of SO_REUSEPORT if it's available.
Values accepted
...
...
@@ -196,7 +196,7 @@ Example
FLAGS_rpc_disable_reuse_port=True will disable the use of SO_REUSEPORT.
rpc_get_thread_num
FLAGS_rpc_get_thread_num
*******************************************
(Since 1.0.0)
...
...
@@ -211,7 +211,7 @@ Example
FLAGS_rpc_get_thread_num=6 will use 6 threads to get parameters from the parameter server.
rpc_send_thread_num
FLAGS_rpc_send_thread_num
*******************************************
(Since 1.0.0)
...
...
@@ -226,11 +226,11 @@ Example
FLAGS_rpc_send_thread_num=6 sets the number of threads used for sending to 6.
rpc_server_profile_path
FLAGS_rpc_server_profile_path
*******************************************
(since 0.15.0)
Set the profiler output log file path prefix. The complete path will be rpc_server_profile_path_listener_id, where listener_id is a random number.
Set the profiler output log file path prefix. The complete path will be FLAGS_rpc_server_profile_path_listener_id, where listener_id is a random number.
@@ -21,7 +21,7 @@ FLAGS_allocator_strategy=naive_best_fit would use the new-designed allocator.
eager_delete_scope
FLAGS_eager_delete_scope
*******************************************
(since 0.12.0)
...
...
@@ -36,7 +36,7 @@ Example
FLAGS_eager_delete_scope=True will make scope delete synchronously.
eager_delete_tensor_gb
FLAGS_eager_delete_tensor_gb
*******************************************
(since 1.0.0)
...
...
@@ -60,7 +60,7 @@ It is recommended that users enable garbage collection strategy by setting FLAGS
enable_inplace_whitelist
FLAGS_enable_inplace_whitelist
*******************************************
(since 1.4)
...
...
@@ -76,7 +76,7 @@ FLAGS_enable_inplace_whitelist=True would disable memory in-place optimization o
fast_eager_deletion_mode
FLAGS_fast_eager_deletion_mode
*******************************************
(since 1.3)
...
...
@@ -93,7 +93,7 @@ FLAGS_fast_eager_deletion_mode=True would turn on fast garbage collection strate
FLAGS_fast_eager_deletion_mode=False would turn off fast garbage collection strategy.
fraction_of_gpu_memory_to_use
FLAGS_fraction_of_gpu_memory_to_use
*******************************************
(since 1.2.0)
...
...
@@ -113,7 +113,7 @@ Windows series platform will set FLAGS_fraction_of_gpu_memory_to_use to 0.5 by d
Linux will set FLAGS_fraction_of_gpu_memory_to_use to 0.92 by default.
free_idle_memory
FLAGS_free_idle_memory
*******************************************
(since 0.15.0)
...
...
@@ -130,7 +130,7 @@ FLAGS_free_idle_memory=True will free idle memory when there is too much of it.
FLAGS_free_idle_memory=False will not free idle memory.
fuse_parameter_groups_size
FLAGS_fuse_parameter_groups_size
*******************************************
(since 1.4.0)
...
...
@@ -146,7 +146,7 @@ FLAGS_fuse_parameter_groups_size=3 will set the size of one group parameters' gr
fuse_parameter_memory_size
FLAGS_fuse_parameter_memory_size
*******************************************
(since 1.5.0)
...
...
@@ -161,7 +161,7 @@ Example
FLAGS_fuse_parameter_memory_size=16 sets the upper memory limit of one group of parameters' gradients to 16 megabytes.
init_allocated_mem
FLAGS_init_allocated_mem
*******************************************
(since 0.15.0)
...
...
@@ -178,7 +178,7 @@ FLAGS_init_allocated_mem=True will make the allocated memory initialize as a non
FLAGS_init_allocated_mem=False will not initialize the allocated memory.
initial_cpu_memory_in_mb
FLAGS_initial_cpu_memory_in_mb
*******************************************
(since 0.14.0)
...
...
@@ -193,7 +193,7 @@ Example
FLAGS_initial_cpu_memory_in_mb=100: if FLAGS_fraction_of_cpu_memory_to_use*(total physical memory) > 100MB, the allocator will pre-allocate 100MB when the first allocation request arrives, and re-allocate 100MB again when the pre-allocated memory is exhausted.
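The pre-allocation rule in the example can be sketched as taking the smaller of the fraction-based budget and the initial-memory flag (a simplification under that reading of the rule; the helper is hypothetical):

```python
def first_chunk_mb(total_physical_mb, fraction_of_cpu_memory_to_use,
                   initial_cpu_memory_in_mb):
    # The first pre-allocated chunk is capped both by the fraction-based
    # budget and by FLAGS_initial_cpu_memory_in_mb.
    budget = total_physical_mb * fraction_of_cpu_memory_to_use
    return min(budget, initial_cpu_memory_in_mb)

# 16 GB machine, full fraction, 100 MB initial flag -> 100 MB chunks.
print(first_chunk_mb(16384, 1.0, 100))  # → 100
```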
initial_gpu_memory_in_mb
FLAGS_initial_gpu_memory_in_mb
*******************************************
(since 1.4.0)
...
...
@@ -213,7 +213,7 @@ If you set this flag, the memory size set by FLAGS_fraction_of_gpu_memory_to_use
If you don't set this flag, PaddlePaddle will use FLAGS_fraction_of_gpu_memory_to_use to allocate gpu memory.
limit_of_tmp_allocation
FLAGS_limit_of_tmp_allocation
*******************************************
(since 1.3)
...
...
@@ -228,7 +228,7 @@ Example
FLAGS_limit_of_tmp_allocation=1024 sets the upper limit of the temporary_allocation size to 1024 bytes.
memory_fraction_of_eager_deletion
FLAGS_memory_fraction_of_eager_deletion
*******************************************
(since 1.4)
...
...
@@ -248,7 +248,7 @@ FLAGS_memory_fraction_of_eager_deletion=1 would release all temporary variables.
FLAGS_memory_fraction_of_eager_deletion=0.5 would only release 50% of variables with largest memory size.
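The "50% of variables with largest memory size" rule can be sketched as sorting temporaries by size and releasing the top fraction (hypothetical helper illustrating the selection, not the framework's code):

```python
def vars_to_release(var_sizes, fraction):
    # Sort variables by memory size, descending, and pick the given
    # fraction with the largest footprint for eager deletion.
    ordered = sorted(var_sizes, key=var_sizes.get, reverse=True)
    k = int(len(ordered) * fraction)
    return ordered[:k]

# With fraction 0.5, only the two largest of four variables are released.
print(vars_to_release({"a": 10, "b": 40, "c": 20, "d": 30}, 0.5))
```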
reallocate_gpu_memory_in_mb
FLAGS_reallocate_gpu_memory_in_mb
*******************************************
(since 1.4.0)
...
...
@@ -268,12 +268,12 @@ If this flag is set, PaddlePaddle will reallocate the gpu memory with size speci
Else PaddlePaddle will reallocate with size set by FLAGS_fraction_of_gpu_memory_to_use.
times_excess_than_required_tmp_allocation
FLAGS_times_excess_than_required_tmp_allocation
*******************************************
(since 1.3)
The FLAGS_times_excess_than_required_tmp_allocation flag indicates the maximum size the TemporaryAllocator can return. For example,
if the required memory size is N and times_excess_than_required_tmp_allocation is 2.0, the TemporaryAllocator will return an available allocation whose size is in the range N ~ 2*N.
if the required memory size is N and FLAGS_times_excess_than_required_tmp_allocation is 2.0, the TemporaryAllocator will return an available allocation whose size is in the range N ~ 2*N.
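The acceptable-size range from the example above, as a small sketch (hypothetical helper mirroring the N ~ times_excess*N rule):

```python
def acceptable_allocation(required_n, times_excess):
    # Range of cached allocation sizes the TemporaryAllocator may hand
    # back for a request of size N: anything in [N, times_excess * N].
    return (required_n, times_excess * required_n)

print(acceptable_allocation(256, 2.0))  # → (256, 512.0)
```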
Values accepted
---------------
...
...
@@ -284,7 +284,7 @@ Example
FLAGS_times_excess_than_required_tmp_allocation=1024 sets the maximum size the TemporaryAllocator can return to 1024*N.