Unverified commit 2267c4e8 authored by: J Jeff Wang, committed by: GitHub

Adjust sync cycle (#328)

* Adjust the sync_cycle automatically.

* Change the multiplier type to double.

* Adjust the sync_period.

* Fix comment.

* Add the pre-commit

* Run pre-commit with other files.
Parent b9fa3ba8
@@ -79,7 +79,7 @@ VisualDL provides both Python SDK and C++ SDK in order to fit more use cases.
### Python SDK
VisualDL now supports both Python 2 and Python 3.
Below is an example of creating a simple Scalar component and inserting data from different timestamps:
```python
@@ -162,7 +162,7 @@ pip install --upgrade dist/visualdl-*.whl
### Run a demo from scratch
```
# vdl_create_scratch_log is a helper command that creates mock data.
vdl_create_scratch_log
visualDL --logdir=scratch_log --port=8080
```
that will start a server locally on port 8080, then
......
@@ -87,4 +87,4 @@ The histograms of the training parameters are as follows:
<img width="70%" src="https://github.com/daming-lu/large_files/blob/master/keras_demo_figs/keras_histogram.png?raw=true" />
</p>
The full demonstration code can be downloaded [here](https://github.com/PaddlePaddle/VisualDL/blob/develop/demo/keras/keras_mnist_demo.py).
\ No newline at end of file
@@ -4,7 +4,7 @@ Here we will show you how to use VisualDL in MXNet so that you can visualize the
We will use an MXNet Convolutional Neural Network to train on the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset as an example.
## Install MXNet
Please install MXNet according to MXNet's [official website](https://mxnet.incubator.apache.org/install/index.html)
and verify that the installation is successful.
>>> import mxnet as mx
@@ -58,7 +58,7 @@ lenet_model.fit(train_iter,
```
That's all. During MXNet training, our callback function is called to record the accuracy at the end of each training batch.
The accuracy keeps rising until it exceeds 95%.
The following shows the accuracy over two epochs:
<p align=center><img width="50%" src="https://github.com/PaddlePaddle/VisualDL/blob/develop/demo/mxnet/epoch2_small.png?raw=true" /></p>
......
@@ -167,7 +167,7 @@ for epoch in range(5):  # loop over the dataset multiple times
print('Finished Training')
```
PyTorch supports the ONNX standard and can export its models to ONNX.
PyTorch runs a single round of inference to trace the graph, so we use a dummy input to run the model and produce the ONNX model.
```python
......
@@ -27,10 +27,10 @@ logw = LogWriter("./random_log", sync_cycle=10000)
```
The first parameter points to a folder; the second parameter `sync_cycle` specifies how many memory operations should happen before the data is flushed to the hard drive.
### sync_cycle
Writing to disk is a heavy operation, so a `sync_cycle` that is too small might slow down your training.
A good starting point is to set `sync_cycle` to at least twice the number of data points you would like to capture.
There are different modes for model training, such as training, validating and testing. All of these correspond to `mode` in VisualDL.
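The rule of thumb above can be sketched as a small helper. This is purely illustrative, not part of the VisualDL API; the function name is made up, and the floor of 100 mirrors the minimum cycle this commit introduces on the C++ side:

```python
def recommended_sync_cycle(expected_data_points, floor=100):
    """Rule of thumb: sync_cycle should be at least twice the number of
    data points you plan to capture, and never below a small floor
    (this commit clamps the cycle at 100 in the C++ writer)."""
    return max(2 * expected_data_points, floor)

# Capturing ~5000 records per run suggests the sample value used above:
print(recommended_sync_cycle(5000))  # 10000, as in LogWriter("./random_log", sync_cycle=10000)
```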
......
@@ -22,6 +22,15 @@ limitations under the License. */
namespace visualdl {
const int minimun_sync_cycle = 100;
// Expect sync happens every 15~25 seconds
const int sync_period = 20;
const int period_range = 5;
const double slower_multiplier = 1.4;
const double faster_multiplier = 0.5;
static time_t last_sync_time = time(NULL);
template <typename T>
void SimpleWriteSyncGuard<T>::Start() {
CHECK(data_);
@@ -33,6 +42,23 @@ void SimpleWriteSyncGuard<T>::End() {
CHECK(data_);
if (data_->parent()->meta.ToSync()) {
Sync();
time_t current_time = time(NULL);
time_t interval = current_time - last_sync_time;
// If last sync happens more than 25 seconds ago, the system needs to make
// the sync-up faster
if (interval > sync_period + period_range) {
data_->parent()->meta.cycle =
std::max(long(data_->parent()->meta.cycle * faster_multiplier),
long(minimun_sync_cycle));
} else if (interval < sync_period - period_range) {
// If the last sync happens less than 15 seconds ago, the system needs to
// make the sync-up slower.
data_->parent()->meta.cycle = std::min(
long(data_->parent()->meta.cycle * slower_multiplier), LONG_MAX);
}
last_sync_time = current_time;
}
}
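The adaptive adjustment in `End()` can be modeled in a short standalone Python sketch to see how the cycle evolves. This mirrors the C++ logic and constants above but is not repository code; it rounds where the C++ cast truncates, and omits the `LONG_MAX` cap since Python integers are unbounded:

```python
MINIMUM_SYNC_CYCLE = 100
SYNC_PERIOD = 20      # target: one sync roughly every 20 seconds
PERIOD_RANGE = 5      # tolerated drift: 15 to 25 seconds
SLOWER_MULTIPLIER = 1.4
FASTER_MULTIPLIER = 0.5

def adjust_cycle(cycle, seconds_since_last_sync):
    """Shrink the cycle (sync more often) when syncs are too far apart,
    grow it (sync less often) when they come too quickly."""
    if seconds_since_last_sync > SYNC_PERIOD + PERIOD_RANGE:
        return max(int(round(cycle * FASTER_MULTIPLIER)), MINIMUM_SYNC_CYCLE)
    if seconds_since_last_sync < SYNC_PERIOD - PERIOD_RANGE:
        return int(round(cycle * SLOWER_MULTIPLIER))
    return cycle

print(adjust_cycle(10000, 30))  # 5000: syncs too rare, halve the cycle
print(adjust_cycle(10000, 10))  # 14000: syncs too frequent, back off by 1.4x
print(adjust_cycle(150, 30))    # 100: the cycle never drops below the floor
print(adjust_cycle(10000, 20))  # 10000: inside the 15-25s window, unchanged
```

The asymmetric multipliers (halving quickly, growing by only 1.4x) let the writer react fast when data is being lost to slow syncs, while creeping back up cautiously.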
......
@@ -32,9 +32,7 @@ Storage::Storage(const Storage& other)
dir_ = other.dir_;
}
Storage::~Storage() { PersistToDisk(); }
void Storage::AddMode(const std::string& x) {
// avoid duplicate modes.
@@ -54,13 +52,9 @@ Tablet Storage::AddTablet(const std::string& x) {
return Tablet(&(*tablets_)[x], this);
}
void Storage::SetDir(const std::string& dir) { *dir_ = dir; }
std::string Storage::dir() const { return *dir_; }
void Storage::PersistToDisk() { PersistToDisk(*dir_); }
@@ -70,27 +64,25 @@ void Storage::PersistToDisk(const std::string& dir) {
fs::SerializeToFile(*data_, meta_path(dir));
for (auto tag : data_->tags()) {
if (modified_tablet_set_.count(tag) > 0) {
auto it = tablets_->find(tag);
CHECK(it != tablets_->end()) << "tag " << tag << " not exist.";
fs::SerializeToFile(it->second, tablet_path(dir, tag));
}
}
modified_tablet_set_.clear();
}
Storage* Storage::parent() { return this; }
void Storage::MarkTabletModified(const std::string tag) {
modified_tablet_set_.insert(tag);
}
void Storage::AddTag(const std::string& x) {
*data_->add_tags() = x;
WRITE_GUARD
}
// StorageReader
std::vector<std::string> StorageReader::all_tags() {
......
@@ -18,18 +18,18 @@ limitations under the License. */
namespace visualdl {
void Tablet::SetTag(const std::string& mode, const std::string& tag) {
auto internal_tag = mode + "/" + tag;
string::TagEncode(internal_tag);
internal_encoded_tag_ = internal_tag;
data_->set_tag(internal_tag);
WRITE_GUARD
}
Record Tablet::AddRecord() {
parent()->MarkTabletModified(internal_encoded_tag_);
IncTotalRecords();
WRITE_GUARD
return Record(data_->add_records(), parent());
}
TabletReader Tablet::reader() { return TabletReader(*data_); }
......
@@ -29,11 +29,12 @@ struct TabletReader;
* Tablet is a helper for operations on storage::Tablet.
*/
struct Tablet {
enum Type { kScalar = 0, kHistogram = 1, kImage = 2, kUnknown = -1 };
DECL_GUARD(Tablet);
Tablet(storage::Tablet* x, Storage* parent)
: data_(x), x_(parent), internal_encoded_tag_("") {}
static Type type(const std::string& name) {
if (name == "scalar") {
......