1. 01 Nov, 2020 · 1 commit
    • [tf.data] Minor cleanup · 6679bc94
      Authored by Jiri Simsa
      PiperOrigin-RevId: 340071266
      Change-Id: Ic21209a25a1f8efa1122c9cee4a8ab3b8043c308
  2. 25 Sep, 2020 · 1 commit
  3. 17 Sep, 2020 · 1 commit
    • [tf.data] Add dataset splitting mechanism. · 5703a4ee
      Authored by Andrew Audibert
      This CL introduces the concept of a SplitProvider. A SplitProvider produces a sequence of "split" tensors which are interpreted by source datasets to produce dataset elements.
      
      When we initialize an iterator, a SplitProvider can be passed through the IteratorContext to indicate that the iterator should only iterate through the splits provided by the SplitProvider.
      
      This CL adds an optional DatasetBase::MakeSplitIterator method which creates a SplitIterator to produce splits for the dataset. For non-source datasets, the proper implementation is generally just to call MakeSplitIterator on their input. To support this reasonable default, we add a `DatasetBase::InputDatasets` method, which produces the input datasets for a dataset. If a dataset implements InputDatasets and has a single input dataset, MakeSplitIterator will delegate to that input by default.
      
      This CL only implements splitting for range_dataset_op; other splitting implementations will come in later CLs. This CL also implements a `ShardingSplitProvider` to better test the range_dataset_op splitting implementation. `ShardingSplitProvider` will be useful in its own right for implementing an alternative to AutoShard which leverages splitting.
      
      PiperOrigin-RevId: 332056019
      Change-Id: I73b9b03cb91ae689c57a72fa6ba0acd092cf4cbe
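      A minimal Python sketch of the splitting mechanism described above. This is illustrative only: the real SplitProvider is a C++ class, and the names (`next_split`, `RangeSplitProvider`) are hypothetical stand-ins for the C++ API.

      ```python
      # Hypothetical Python stand-ins for the C++ SplitProvider mechanism.

      class RangeSplitProvider:
          """Produces integer "splits"; a range dataset would map each one
          to a dataset element."""
          def __init__(self, n):
              self._next, self._n = 0, n

          def next_split(self):
              if self._next >= self._n:
                  return None  # end of the split sequence
              split, self._next = self._next, self._next + 1
              return split

      class ShardingSplitProvider:
          """Wraps a base provider and keeps every num_shards-th split,
          mirroring the ShardingSplitProvider the commit describes."""
          def __init__(self, base, num_shards, shard_index):
              self._base, self._num_shards, self._shard_index = base, num_shards, shard_index
              self._count = 0

          def next_split(self):
              while (split := self._base.next_split()) is not None:
                  keep = self._count % self._num_shards == self._shard_index
                  self._count += 1
                  if keep:
                      return split
              return None

      provider = ShardingSplitProvider(RangeSplitProvider(10), num_shards=3, shard_index=1)
      splits = []
      while (s := provider.next_split()) is not None:
          splits.append(s)
      print(splits)  # [1, 4, 7]
      ```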
  4. 15 Sep, 2020 · 1 commit
  5. 03 Sep, 2020 · 1 commit
  6. 29 Jul, 2020 · 1 commit
  7. 23 Jul, 2020 · 1 commit
  8. 07 Apr, 2020 · 1 commit
    • [tf.data] Adding a metric for bytes produced and consumed by individual... · eabc157f
      Authored by Jiri Simsa
      [tf.data] Adding a metric for bytes produced and consumed by individual transformations, refactoring infrastructure for recording tf.data metrics, and moving the metrics API and implementation from `common_runtime` to `framework`.
      
      PiperOrigin-RevId: 305062865
      Change-Id: I63911f00154baf36aa225f66dbef0843239b7392
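      A rough, hypothetical sketch of what per-transformation byte accounting means conceptually; the real counters live in the C++ `framework` metrics layer, and the function names below are invented for illustration.

      ```python
      # Hypothetical sketch: counters keyed by transformation name, recording
      # bytes consumed (input) and produced (output) by each transformation.

      from collections import defaultdict

      _bytes_consumed = defaultdict(int)
      _bytes_produced = defaultdict(int)

      def record_bytes_consumed(transformation: str, num_bytes: int) -> None:
          _bytes_consumed[transformation] += num_bytes

      def record_bytes_produced(transformation: str, num_bytes: int) -> None:
          _bytes_produced[transformation] += num_bytes

      # e.g. a "Map" transformation that consumed 4 KB and produced 16 KB:
      record_bytes_consumed("Map", 4096)
      record_bytes_produced("Map", 16384)
      print(_bytes_produced["Map"])  # 16384
      ```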
  9. 18 Mar, 2020 · 1 commit
  10. 06 Mar, 2020 · 1 commit
  11. 07 Feb, 2020 · 1 commit
  12. 26 Nov, 2019 · 1 commit
  13. 17 Aug, 2019 · 1 commit
  14. 08 Aug, 2019 · 1 commit
    • [tf.data] Serialization and checkpointing related cleanup. · 6d8f05ac
      Authored by Jiri Simsa
      This CL:
      - removes unused `DatasetBase::Save()` and related tests
      - replaces `SerializationContext::optimization_only` with multiple functionality-specific flags (`check_external_state`, `fail_if_unimplemented`, and `serialize_data_tensors`)
      - introduces `DatasetBase::CheckExternalState` as an error-raising replacement for `DatasetBase::IsStateful`, making it possible to communicate the reason why serialization failed through the error status
      - adds `IteratorBase::SaveInternal` and `IteratorBase::RestoreInternal` in preparation for making these methods pure virtual
      
      PiperOrigin-RevId: 262235093
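      A small Python sketch of the error-raising pattern behind `CheckExternalState`: instead of a bare boolean `IsStateful`, the check fails with a status/exception that carries the reason serialization was refused. The class and messages below are illustrative, not TensorFlow API.

      ```python
      class ExternalStateError(Exception):
          """Stands in for a failed C++ Status explaining the external state."""

      class FileBackedDataset:
          def __init__(self, path):
              self._path = path

          # Old style: a bare boolean cannot say *why* serialization must fail.
          def is_stateful(self):
              return True

          # New style: the raised error carries the reason in its message.
          def check_external_state(self):
              raise ExternalStateError(
                  f"Cannot serialize dataset: it depends on external file {self._path!r}")

      try:
          FileBackedDataset("/tmp/input.txt").check_external_state()
      except ExternalStateError as err:
          print(err)  # the caller can surface the precise reason to the user
      ```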
  15. 26 Jul, 2019 · 1 commit
    • [tf.data] Changing the implementation of iterator checkpointing to not store the dataset graph. · 1f878734
      Authored by Jiri Simsa
      After this change, restoring an iterator from a checkpoint requires initializing it with a dataset that matches the one used to initialize the iterator from which the checkpoint was created. In other words, if the Python definition of the input pipeline changes, restoring the iterator will fail.
      
      The motivation for this change is to make it possible to save (and restore) datasets whose graph cannot be serialized (e.g. because it contains ops with resource inputs). This will in turn allow tf.data to implement "reshuffle each iteration" or in-memory caching shared between different Python iterators over the same dataset.
      
      PiperOrigin-RevId: 260144783
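      A sketch of how this behavior surfaces in the modern Python API (via `tf.train.Checkpoint`, whose exact surface post-dates this commit): the checkpoint stores iterator state rather than the dataset graph, so restoring requires an iterator built from a matching dataset definition.

      ```python
      import tensorflow as tf

      dataset = tf.data.Dataset.range(10)
      iterator = iter(dataset)
      ckpt = tf.train.Checkpoint(iterator=iterator)

      print(next(iterator).numpy())  # 0
      print(next(iterator).numpy())  # 1
      path = ckpt.save("/tmp/tf_data_ckpt")  # saves iterator position, not the graph

      # Restoring requires re-creating an iterator from a matching dataset; if the
      # Python pipeline definition changed, the restore would fail.
      restored = iter(tf.data.Dataset.range(10))
      tf.train.Checkpoint(iterator=restored).restore(path)
      print(next(restored).numpy())  # 2 -- resumes where the checkpoint left off
      ```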
  16. 25 Jun, 2019 · 1 commit
  17. 04 Apr, 2019 · 1 commit
  18. 14 Mar, 2019 · 1 commit
  19. 02 Mar, 2019 · 1 commit
    • [tf.data] Add an unbounded thread pool to iterator resources. · 70da1fe2
      Authored by Derek Murray
      The previous implementation of many core `tf.data` transformations
      (e.g. `Dataset.prefetch()`) would create one or more threads each time
      an iterator over those datasets is created
      (e.g. `ds.prefetch(N).repeat(100)` would create and destroy 100
      threads). In addition to the overhead of thread creation, this
      interacts poorly with some malloc implementations, and can contribute
      to memory fragmentation.
      
      The new implementation maintains an unbounded pool of physical threads
      in each iterator (or `MultiDeviceIterator`) resource, and returns logical
      "threads" to that pool when their work is complete instead of exiting
      from them.
      
      PiperOrigin-RevId: 236413014
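      A self-contained sketch of the pattern described above: logical units of work are handed to long-lived physical threads, which return to an idle pool instead of exiting. This is an illustration in Python, not the C++ pool the commit adds.

      ```python
      import queue
      import threading

      class UnboundedThreadPool:
          def __init__(self):
              self._work = queue.SimpleQueue()
              self._idle = 0                 # physical threads waiting for work
              self._lock = threading.Lock()

          def schedule(self, fn):
              """Run fn on an idle physical thread; spawn one only if none is free."""
              with self._lock:
                  if self._idle == 0:
                      threading.Thread(target=self._worker, daemon=True).start()
                  else:
                      self._idle -= 1
              self._work.put(fn)

          def _worker(self):
              while True:
                  fn = self._work.get()      # block until work arrives
                  fn()
                  with self._lock:
                      self._idle += 1        # return this thread to the pool

      pool = UnboundedThreadPool()
      done = threading.Event()
      pool.schedule(done.set)                # reuses or spawns a physical thread
      done.wait()
      print("work completed on a pooled thread")
      ```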
  20. 15 Jan, 2019 · 1 commit
    • [tf.data] Add counters for tf.data elements, autotuning, and optimizations. · b74605a9
      Authored by Jiri Simsa
      This CL:
      - adds counters for tf.data elements, autotuning and optimizations
      - sets the number of iterations of the `tf_data_meta_optimizer` to one -- the iteration of tf.data optimizations is handled by the tf.data meta optimizer itself
      - adds the `alwayslink` attribute to all tf.data optimization BUILD targets to make sure they are always registered (without this, they would not be registered in the TensorFlow server binary I was using for local testing), and further cleans up visibility and dependencies of //third_party/tensorflow/core/grappler/optimizers/data/BUILD
      - introduces TFDataOptimizerBase as a base class for tf.data optimizations
      - moves TensorFlow metrics into tensorflow::metrics namespace
      
      PiperOrigin-RevId: 229302097
  21. 21 Dec, 2018 · 1 commit
  22. 05 Dec, 2018 · 2 commits
  23. 09 Nov, 2018 · 1 commit
  24. 06 Nov, 2018 · 1 commit
  25. 31 Oct, 2018 · 2 commits
  26. 26 Oct, 2018 · 1 commit
  27. 09 Oct, 2018 · 2 commits
  28. 04 Oct, 2018 · 2 commits
  29. 21 Sep, 2018 · 1 commit
  30. 18 Sep, 2018 · 1 commit
    • [tf.data] Adding support for `tf.data.AUTOTUNE` as a special value for the... · c8a0dfc7
      Authored by Jiri Simsa
      [tf.data] Adding support for `tf.data.AUTOTUNE` as a special value for the `num_parallel_calls` argument of `tf.data.Dataset.map()`, `tf.data.Dataset.interleave()`, and `tf.contrib.data.map_and_batch()`.
      
      When `tf.data.AUTOTUNE` is specified, the level of parallelism is determined at runtime. The underlying mechanism instruments the input pipeline to build a performance model and then uses the model to find the optimal values for the parallelism knobs.
      
      PiperOrigin-RevId: 213283297
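      Usage of the feature described above, in present-day spelling (at the time of this commit the constant lived under `tf.contrib.data`; it later moved to `tf.data.experimental.AUTOTUNE` and, in TF 2.4+, `tf.data.AUTOTUNE`):

      ```python
      import tensorflow as tf

      dataset = (
          tf.data.Dataset.range(1000)
          # Parallelism is chosen at runtime by tf.data's performance model.
          .map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE)
          .batch(32)
          .prefetch(tf.data.AUTOTUNE)
      )

      for batch in dataset.take(1):
          print(batch.shape)  # (32,)
      ```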
  31. 12 Sep, 2018 · 1 commit
  32. 06 Sep, 2018 · 1 commit
  33. 14 Aug, 2018 · 1 commit
    • [tf.data] Internal refactoring of C++ classes and APIs. · 83f1458e
      Authored by Jiri Simsa
      - replacing `OpKernelContext` with newly introduced `DatasetContext` in `DatasetBase` constructor to make it possible to instantiate `DatasetBase` in places where an instance of `OpKernelContext` is not available
      
      - replacing the `dataset::MakeIteratorContext(OpKernelContext* ctx)` factory with an `IteratorContext(OpKernelContext* ctx)` constructor.
      
      - folding `GraphDatasetBase` into `DatasetBase` and removing the default implementation of `AsGraphDefInternal`, making derived classes responsible for implementing it, to encourage developers to provide serialization logic
      
      PiperOrigin-RevId: 208560010
  34. 11 Aug, 2018 · 2 commits
    • [tf.data] Optimization checkpointing improvements. · 8d532ac4
      Authored by Jiri Simsa
      This CL:
      - changes the `OptimizeDataset` checkpointing logic to checkpoint the optimized dataset (as opposed to checkpointing the original dataset plus the optimizations and re-running optimization every time a checkpoint is restored)
      - replaces `OpKernelContext` with newly introduced `SerializationContext` in the signature of `AsGraphDefInternal` to reduce the scope of the context and also simplify the logic for overriding the `FunctionLibraryDefinition` when optimizations take place
      
      PiperOrigin-RevId: 208282562
    • [tf.data] Minor API refactoring. · 0d1b1448
      Authored by Jiri Simsa
      Renaming `AddParentDataset`, `SaveParent`, and `RestoreParent` to `AddInputDataset`, `SaveInput`, and `RestoreInput`.
      
      PiperOrigin-RevId: 208272695
  35. 01 Jun, 2018 · 1 commit
    • [tf.data] Mark DebugString() as const. · 3e3dd647
      Authored by Brennan Saeta
      By marking DebugString() as const we can make some error messages more descriptive. Because DatasetIterator exposes its dataset() accessor with a const return value, a non-const DebugString() could not previously be called on the returned dataset; marking it const makes that possible.
      
      PiperOrigin-RevId: 198796894