- Feb 26, 2016 (9 commits)
-
-
Committed by A. Unique TensorFlower
Fix an error message in tf.sparse_to_dense to include the possibility that indices are invalid because they are out of bounds. Change: 115522264
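The improved message covers the case where an index falls outside the output shape. A minimal plain-Python sketch of that check (illustrative only, not the TensorFlow kernel; names are hypothetical):

```python
def sparse_to_dense(indices, output_size, sparse_value, default_value=0):
    """Scatter sparse_value into a dense 1-D list, validating indices."""
    dense = [default_value] * output_size
    for pos, i in enumerate(indices):
        # The error message now names out-of-bounds as a possible cause.
        if not 0 <= i < output_size:
            raise ValueError(
                "indices[%d] = %d is invalid: out of bounds for output of "
                "size %d" % (pos, i, output_size))
        dense[i] = sparse_value
    return dense
```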
-
Committed by Eugene Brevdo
These tools are meant to allow recording of benchmark & unit test structured output to pbtxt files in a directory only when the environment variable TEST_REPORT_FILE_PREFIX is set. For now, only saving of C++ microbenchmark output is supported. Change: 115518303
-
Committed by Sherry Moore
Change: 115516426
-
Committed by Kiril Gorovoy
Change: 115515678
-
Committed by Josh Levenberg
Change: 115511835
-
Committed by Josh Levenberg
Change: 115511794
-
Committed by Vincent Vanhoucke
Helps with: https://github.com/tensorflow/tensorflow/issues/917. Also fixes https://github.com/tensorflow/tensorflow/issues/1162. The main benefit is that the computation of the sufficient statistics is now decoupled from the aggregation of the moments, which means that if you want to perform the accumulation incrementally, you don't have to keep all the inputs around, and can instead keep the much more compact sum and sum-of-squares. Accumulation could also be performed locally if you aggregate across multiple devices. Computing the sum and sum-of-squares can also theoretically be performed in parallel now. Tested running inception: same performance, same step time. The batch normalization benchmark is a bit faster on CPU, a bit slower on GPU:

Before:
  cpu shape:4/3 #layers:10 mode:py scale:True train:False - 1.139310 secs
  gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.021970 secs
  cpu shape:4/3 #layers:10 mode:py scale:True train:True - 2.767147 secs
  gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.074531 secs
  cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.742835 secs
  gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.013473 secs
  cpu shape:4/3 #layers:10 mode:py scale:True train:True - 1.738806 secs
  gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.052777 secs
  cpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.119180 secs
  gpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.011201 secs
  cpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.218297 secs
  gpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.048526 secs

After:
  cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.998944 secs
  gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.025828 secs
  cpu shape:4/3 #layers:10 mode:py scale:True train:True - 2.657428 secs
  gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.086614 secs
  cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.603137 secs
  gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.017668 secs
  cpu shape:4/3 #layers:10 mode:py scale:True train:True - 1.519533 secs
  gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.055214 secs
  cpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.071344 secs
  gpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.016440 secs
  cpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.222093 secs
  gpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.039967 secs

Change: 115507032
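The decoupling described above can be sketched in a few lines of plain Python (illustrative names, not the TensorFlow API): each shard contributes only a (count, sum, sum-of-squares) triple, and the moments are derived after aggregation, so the inputs never need to be retained.

```python
def sufficient_statistics(xs):
    # Per-shard statistics: three compact numbers instead of the inputs.
    return len(xs), sum(xs), sum(x * x for x in xs)

def moments_from_statistics(count, total, total_sq):
    # Aggregation step: derive mean and variance from the accumulated
    # statistics, possibly gathered incrementally or across devices.
    mean = total / count
    variance = total_sq / count - mean * mean
    return mean, variance

# Incremental accumulation across two shards:
c1, s1, q1 = sufficient_statistics([1.0, 2.0])
c2, s2, q2 = sufficient_statistics([3.0, 4.0])
mean, var = moments_from_statistics(c1 + c2, s1 + s2, q1 + q2)
```

Because the statistics are simple sums, the two shards here could live on different devices and be reduced with a single add, which is the accumulation pattern the commit enables.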
-
Committed by Josh Levenberg
RemoveNewDefaultAttrsFromGraphDef(). Change: 115506523
-
Committed by A. Unique TensorFlower
Change: 115505008
-
- Feb 25, 2016 (18 commits)
-
-
Committed by Benoit Steiner
X1 crashes when attempting to compile them. Change: 115500414
-
Committed by Eugene Brevdo
Change: 115496194
-
Committed by Geoffrey Irving
Change: 115495726
-
Committed by Eugene Brevdo
Change: 115494526
-
Committed by A. Unique TensorFlower
Support leaving the offset (beta) parameter out in batch_normalization, in which case no offset will be added after normalization. Change: 115489328
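A hedged plain-Python sketch of what an optional offset means here (this mirrors the described behaviour, not the actual TensorFlow op): when `offset` is `None`, the normalized values are returned without any shift.

```python
import math

def batch_normalization(x, mean, variance, offset=None, scale=None,
                        variance_epsilon=1e-3):
    """Normalize a list of floats; offset (beta) and scale (gamma) are
    both optional, matching the behaviour described in the commit."""
    inv = 1.0 / math.sqrt(variance + variance_epsilon)
    if scale is not None:
        inv *= scale
    normalized = [(v - mean) * inv for v in x]
    if offset is not None:  # no offset added after normalization if None
        normalized = [v + offset for v in normalized]
    return normalized
```

With `offset=None` the output is zero-centered; passing a beta shifts every element by that amount.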
-
Committed by A. Unique TensorFlower
Change: 115472914
-
Committed by A. Unique TensorFlower
Rename map in control_flow_ops to map_fn, to avoid name conflict with Python's native 'map' function. This also fixes the bug with control_flow_ops.case. Change: 115472163
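The motivation for the rename in one small example (an eager, list-based stand-in, not the TensorFlow op itself): with the op called `map_fn`, it can coexist with Python's builtin `map` instead of shadowing it.

```python
def map_fn(fn, elems):
    # Stand-in for the renamed op: apply fn to each element and stack
    # the results.
    return [fn(e) for e in elems]

squares = map_fn(lambda x: x * x, [1, 2, 3])
# The builtin remains available under its usual name:
doubled = list(map(lambda x: 2 * x, [1, 2, 3]))
```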
-
Committed by David G. Andersen
Change: 115470945
-
Committed by A. Unique TensorFlower
Describe how to load many runs. Change: 115467346
-
Committed by Geoffrey Irving
Both gather and scatter now unconditionally validate indices in the inner loop, which prevents crashes if indices are changed asynchronously while the ops are running. For gather when validate_indices = true, the new code is within the noise of the old code speedwise or possibly slightly faster (unsurprising since the new code fuses two loops). Specifically, the geometric mean of int32 gather benchmarks goes from 4.05GB/s to 4.04-4.07GB/s. For gather when validate_indices = false, the old code and a version of the old code that supported validate_indices = false both get 1.5% slower. Xiaoqiang and I deem this difference insufficient to preserve the unsafe code path, so poof: it's gone. For scatter (which always validates), the new code is slightly faster than the old code: the geometric mean goes from 546-559M items/s to 573M items/s. Change: 115467091
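The key idea (validation fused into the gather loop rather than done in a separate up-front pass) can be sketched in plain Python; this is illustrative only, not the C++ kernel:

```python
def gather(params, indices):
    """Gather params[i] for each i, checking each index immediately
    before the read. An index mutated after a separate validation pass
    could slip through; a fused check cannot."""
    limit = len(params)
    out = []
    for pos, i in enumerate(indices):
        if not 0 <= i < limit:
            raise IndexError("indices[%d] = %d is not in [0, %d)"
                             % (pos, i, limit))
        out.append(params[i])
    return out
```

Fusing the check into the same loop is also why the validated path costs roughly nothing over the old unvalidated one.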
-
Committed by Yuan Yu
Change: 115464489
-
Committed by Eugene Brevdo
Change: 115464229
-
Committed by Vijay Vasudevan
to be used to colocate based on attributes rather than either names of ops or devices (op names and devices aren't portable). A follow up change will add an ops.colocate_with() to Python that adds this attribute to nodes, and will be used to replace calls to 'with tf.device(foo.device)' in TF library code, which assumes that devices have been specified. Change: 115463464
-
Committed by A. Unique TensorFlower
Change: 115462062
-
Committed by A. Unique TensorFlower
Change: 115419426
-
Committed by A. Unique TensorFlower
Change: 115408162
-
Committed by Josh Levenberg
Change: 115384748
-
Committed by Josh Levenberg
Change: 115379524
-
- Feb 24, 2016 (13 commits)
-
-
Committed by Manjunath Kudlur
-
Committed by James Wexler
Change: 115371065
-
Committed by A. Unique TensorFlower
Change: 115370821
-
Committed by A. Unique TensorFlower
Reason: tsd is deprecated (https://github.com/DefinitelyTyped/tsd/issues/269) and typings is the new standard. Also, tsd was behaving badly - running `tsd install` on a clean client was causing it to incorrectly depend on typing files from node_modules, which resulted in a broken build. This issue does not exist with typings. For convenience, and since typings is really fast when all deps are up-to-date, I made it a part of the standard gulp task. `npm install` so you have all the deps, and running `gulp` will keep the typing files synchronized - there no longer is a separate step for downloading them. The logical next step is to do the same for bower. I did wire that up, but I will not connect it to the gulp task until after the big bower dependency upgrade CL is through. If I add it right now, it will fail on unresolved dependency conflicts and make everyone sad. Change: 115370585
-
Committed by Vijay Vasudevan
Change: 115364038
-
Committed by A. Unique TensorFlower
The tutorial Python files are copied to a separate directory and run against Python installation on the system. The script performs basic quality checks, including timeout, accuracy / loss thresholding, and checks for generated checkpoint files and summary files. Change: 115362939
-
Committed by Yuan Yu
For gradient computation for loops, stacks are used to store the tensors that are computed in the forward but needed in backprop. This CL enables very long sequence training by swapping the stack tensors from GPU to CPU. Change: 115359847
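A toy model of the swap described above (plain Python lists stand in for device transfers; all names are hypothetical): pushed forward-pass values are parked in host memory instead of staying resident on the GPU, and brought back in reverse order during backprop.

```python
class SwappingStack:
    """Toy stack whose pushed values live in (large, cheap) host memory
    rather than on the device, modeling the GPU -> CPU swap."""

    def __init__(self):
        self._host_storage = []

    def push(self, value):
        # In the real system this would be an async device-to-host copy.
        self._host_storage.append(value)

    def pop(self):
        # And this a host-to-device copy back, just in time for backprop.
        return self._host_storage.pop()

# Forward pass pushes activations; backprop pops them in reverse order.
stack = SwappingStack()
for activation in [1, 2, 3]:
    stack.push(activation)
backprop_order = [stack.pop() for _ in range(3)]
```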
-
Committed by Derek Murray
Change: 115358623
-
Committed by James Wexler
Change: 115354844
-
Committed by Yuan Yu
Change: 115351830
-
Committed by Geoffrey Irving
Change: 115347996
-
Committed by Derek Murray
The PickUnusedPortOrDie implementation is based on a simplified version of `grpc_pick_unused_port_or_die()` in gRPC. This utility will be necessary for tests of the distributed runtime (issue #23). Change: 115345502
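The usual way such a utility is implemented (a simplified sketch, not the gRPC or TensorFlow code) is to bind to port 0 and let the OS pick a free port:

```python
import socket

def pick_unused_port():
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    port = s.getsockname()[1]
    s.close()
    return port
```

Note the inherent race: once the socket is closed, another process could grab the port before the test does, which is why such helpers are best-effort and suited to tests rather than production servers.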
-
Committed by Dan Smilkov
Also:
- rename sceneBehavior -> sceneElement to make it clearer it is a Polymer element.
- improve the info card by showing the actual op node in the successors/predecessors list when the metaedge only contains one base edge (one tensor).

Change: 115339805
-