DataSources

Data sources are helpers for defining Paddle training and testing data.

paddle.trainer_config_helpers.data_sources.define_py_data_sources2(train_list, test_list, module, obj, args=None)

Define the Python train and test data sources in a single call. If train and test use the same data provider configuration, pass a single value for each of module, obj, and args; otherwise pass a list or tuple of values, one for train and one for test. For example:

define_py_data_sources2(train_list="train.list",
                        test_list="test.list",
                        module="data_provider",
                        # if train/test use different configurations,
                        # obj=["process_train", "process_test"]
                        obj="process",
                        args={"dictionary": dict_name})

For the related data provider, see the data provider documentation.

Parameters:
  • train_list (basestring) – Train list name.
  • test_list (basestring) – Test list name.
  • module (basestring or tuple or list) – Python module name. If train and test differ, pass a tuple or list to this argument.
  • obj (basestring or tuple or list) – Python object name. May be a function name if using PyDataProviderWrapper. If train and test differ, pass a tuple or list to this argument.
  • args (string or picklable object or list or tuple) – Arguments passed to the DataProvider. The best practice is to pass them as a dict() and receive them with @init_hook_wrapper. If train and test differ, pass a tuple or list to this argument.
Returns: None

Return type: None
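
The "single value or list/tuple" convention used by module, obj, and args can be sketched in plain Python. This is only an illustration of the rule described above, not Paddle's implementation: split_train_test is a hypothetical helper name, and the sketch assumes that any list or tuple is meant as a (train, test) pair of exactly two items.

```python
def split_train_test(value):
    """Return a (train, test) pair from one shared value or a two-item list/tuple.

    Hypothetical helper for illustration only; it is not part of Paddle.
    """
    if isinstance(value, (list, tuple)):
        # Assumption: a list or tuple always means (train, test).
        if len(value) != 2:
            raise ValueError("expected exactly two items: (train, test)")
        return value[0], value[1]
    # A single value is shared by the train and test configurations.
    return value, value


# Same configuration for train and test.
print(split_train_test("process"))

# Different configurations for train and test.
print(split_train_test(["process_train", "process_test"]))
```

Under this reading, obj="process" applies the same object to both data sources, while obj=["process_train", "process_test"] assigns one object to each.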