Commits · 01a6f5e504d9299395888a786e52c589c16af529 · qq_38905368 / tensorflow

26 Feb, 2016 · 34 commits

Multiple layout support for pooling operations. · 01a6f5e5
Committed by Xiaoqiang Zheng on Feb 25, 2016
```
Change: 115611259
```
Fix compilation error in argv parsing code... whoops. · cdd0f2ee
Committed by Derek Murray on Feb 25, 2016
```
Change: 115610448
```
Add symbolic gradient functions for Conv2D and MaxPool · aeae4825
Committed by A. Unique TensorFlower on Feb 25, 2016
```
Change: 115608522
```
TensorFlow: conv_ops uses gpu_device_context, needs to depend on the lib. · 03fed366
Committed by Vijay Vasudevan on Feb 25, 2016
```
Change: 115607974
```
Correct handling of argv in test utility. · f3ead2df
Committed by Derek Murray on Feb 25, 2016
```
Change: 115607801
```
Rollback of "TestReporter is back in. Maybe also fixed the Android build." · c38bbf42
Committed by Vijay Vasudevan on Feb 25, 2016
```
Test fails.
Change: 115602477
```

Rollback of: Add native depthwise_convolution op (forward pass). · 90cf3e2e

Committed by Vijay Vasudevan on Feb 25, 2016

The current depthwise_conv is very inefficient by calling slice() on each
input channel on input and filters, followed by a conv() on each input channel,
after which is a concat().
Change: 115601904

TensorFlow: perl command in configure script was not properly replacing · 97f6b6fb
Committed by Vijay Vasudevan on Feb 25, 2016
```
the cuda/cudnn version strings in build_config.bzl

Addresses issue mentioned in #1052
Change: 115599314
```
Make gpu_lib for non-cuda deps that we use in public kernels. · a82f7e6b
Committed by Vijay Vasudevan on Feb 25, 2016
```
Change: 115598732
```
Clarify comments for max_to_keep. · a5f39790
Committed by Sherry Moore on Feb 25, 2016
```
Change: 115598592
```
Remove endl at the end of VLOG. · be64da94
Committed by Manjunath Kudlur on Feb 25, 2016
```
Change: 115594986
```
Execute TODO to rename io.* to save_restore_tensor.*. This will · eec5477a
Committed by Josh Levenberg on Feb 25, 2016
```
hopefully reduce confusion since io.* is not the implementation of the
".../kernels:io" build target.
Change: 115593814
```
TestReporter is back in. Maybe also fixed the Android build. · ad3ef4c0
Committed by Eugene Brevdo on Feb 25, 2016
```
Change: 115589642
```

Updated protobuf submodule to fb714b3 to bring in updates to grpc support · b97931cb
Committed by Manjunath Kudlur on Feb 25, 2016

TensorFlow: add missing header file to posix/test.cc · 356bf7f4
Committed by Vijay Vasudevan on Feb 25, 2016
```
Change: 115589382
```

Add native depthwise_convolution op (forward pass). · 7b47c8b4

Committed by Jianmin Chen on Feb 25, 2016

The current depthwise_conv is very inefficient by calling slice() on each
input channel on input and filters, followed by a conv() on each input channel,
after which is a concat().
Change: 115583330
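The slice / conv / concat composition this commit message calls inefficient can be sketched as follows. This is only an illustration in plain Python on nested lists, assuming a channel multiplier of 1, stride 1 and no padding; the helper names (`slice_channel`, `conv2d_valid`, `concat_channels`, `depthwise_conv_naive`) are hypothetical and are not TensorFlow APIs.

```python
def slice_channel(tensor, c):
    """Extract channel c of an [H][W][C] nested list as an [H][W] plane."""
    return [[cell[c] for cell in row] for row in tensor]


def conv2d_valid(plane, kernel):
    """Single-channel 2-D cross-correlation, stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(plane) - kh + 1
    out_w = len(plane[0]) - kw + 1
    return [[sum(plane[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def concat_channels(planes):
    """Stack a list of [H][W] planes back into an [H][W][C] nested list."""
    h, w = len(planes[0]), len(planes[0][0])
    return [[[p[i][j] for p in planes] for j in range(w)] for i in range(h)]


def depthwise_conv_naive(image, filters):
    """Depthwise convolution built from one slice + conv per channel, then a concat.

    image is [H][W][C]; filters is [kh][kw][C] (assumed channel multiplier of 1).
    """
    channels = len(image[0][0])
    outputs = [conv2d_valid(slice_channel(image, c), slice_channel(filters, c))
               for c in range(channels)]
    return concat_channels(outputs)
```

Each channel costs a separate slice and a separate convolution call before the final concat, which is the overhead a native op would presumably avoid by doing the work in a single kernel.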

Changed testing::SrcDir() to testing::TensorFlowSourceRoot() and fixed it. · 818644c2
Committed by Derek Murray on Feb 25, 2016
```
Also fixed some compiler warnings.
Change: 115582482
```
TensorFlow: make split_op not use internal header library for callback, · 13d7f520
Committed by Vijay Vasudevan on Feb 25, 2016
```
since this breaks the build on GPU.
Change: 115582331
```
TensorFlow: Fix scatter_op_test now that StringPiece::contains is fixed. · 86e93feb
Committed by Vijay Vasudevan on Feb 25, 2016
```
Change: 115580211
```
Add contrib/testing. · d1aed650
Committed by A. Unique TensorFlower on Feb 25, 2016
```
Change: 115578243
```
Avoid some over-inlined routines. Reduces code size of TensorFlow binaries · 9ccc4b6a
Committed by A. Unique TensorFlower on Feb 25, 2016
```
considerably.  Shrinks text size of example_trainer binary by ~1.5%.
Change: 115578002
```
Made sure that the tracking allocator always counts the allocated sizes. · 63bd3efc
Committed by Benoit Steiner on Feb 25, 2016
```
Made the corresponding unit test more robust.
Change: 115575179
```
TensorFlow: fix bug in StringPiece::contains which made it always · 5c9f4f89
Committed by Vijay Vasudevan on Feb 25, 2016
```
return true.  Add a unittest to catch this type of regression in
the future.
Change: 115573280
```
Fix for constant folding where nodes with no inputs doesn't get constant folded. · 82ecfff7
Committed by A. Unique TensorFlower on Feb 25, 2016
```
Change: 115568214
```
Fixes bug in accumulation of total-approximate-duality-gap. · e752109e
Committed by A. Unique TensorFlower on Feb 24, 2016
```
Change: 115528686
```

Fix an error message in tf.sparse_to_dense to include the possibility that... · 73d557cc

Committed by A. Unique TensorFlower on Feb 24, 2016

Fix an error message in tf.sparse_to_dense to include the possibility that indices are invalid because they are out of bounds.
Change: 115522264


Added TestReporter and test / benchmark reporting tools. · fcfa866d

Committed by Eugene Brevdo on Feb 24, 2016

These tools are meant to allow recording of benchmark & unit test
structured output to pbtxt files in a directory only when the
environment variable TEST_REPORT_FILE_PREFIX is set.  For now,
only saving of C++  microbenchmark output is supported.
Change: 115518303

Added unit test for max_to_keep being None. · 4ecd2a70
Committed by Sherry Moore on Feb 24, 2016
```
Change: 115516426
```
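For readers unfamiliar with `max_to_keep`, the retention behaviour these commits touch on can be sketched roughly as below. This is a minimal illustration, not the Saver implementation: the `CheckpointHistory` class and its `add` method are hypothetical names, and it assumes that `None` means "never prune old checkpoints".

```python
from collections import deque


class CheckpointHistory:
    """Hypothetical helper tracking which checkpoint paths are retained."""

    def __init__(self, max_to_keep=5):
        self.max_to_keep = max_to_keep
        self.kept = deque()

    def add(self, path):
        """Record a new checkpoint and return the paths that should be deleted."""
        self.kept.append(path)
        removed = []
        # Assumption: None disables pruning, so every checkpoint is kept.
        if self.max_to_keep is not None:
            while len(self.kept) > self.max_to_keep:
                removed.append(self.kept.popleft())
        return removed
```

With `max_to_keep=2`, adding a third checkpoint returns the oldest path for deletion; with `max_to_keep=None`, `add` always returns an empty list.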
Move all Tensorflow WORKSPACE rules to a skylark macro · 77da168d
Committed by Kiril Gorovoy on Feb 24, 2016
```
Change: 115515678
```
Remove no-longer-needed RequireDefaultOps(). · 9ba55d8a
Committed by Josh Levenberg on Feb 24, 2016
```
Change: 115511835
```
Remove no-longer-needed RequireDefaultOps(). · ab286e09
Committed by Josh Levenberg on Feb 24, 2016
```
Change: 115511794
```

Switch nn.moments() to using a one-pass stable algorithm. · bce62166

Committed by Vincent Vanhoucke on Feb 24, 2016

Helps with: https://github.com/tensorflow/tensorflow/issues/917
Also fixes https://github.com/tensorflow/tensorflow/issues/1162

The main benefit is that the computation of the sufficient statistics is now decoupled of the aggregation of the moments, which means that if you want to perform the accumulation incrementally, you don't have to keep all the inputs around, and can instead keep the much more compact sum and sum-of-squares. Accumulation could also be performed locally if you aggregate across multiple devices.
Computing sum and sum-of-squares can also theoretically be performed in parallel now.

Tested running inception: same performance, same step time.
Batch normalization benchmark is a bit faster on CPU, a bit slower on GPU:

Before:
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 1.139310 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.021970 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 2.767147 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.074531 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.742835 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.013473 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 1.738806 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.052777 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.119180 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.011201 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.218297 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.048526 secs

After:
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.998944 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.025828 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 2.657428 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.086614 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.603137 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.017668 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 1.519533 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.055214 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.071344 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.016440 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.222093 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.039967 secs
Change: 115507032
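
The decoupling described in this commit message, where sufficient statistics are computed first and the moments are derived from them afterwards, can be illustrated with a small sketch. It is plain Python rather than the actual `nn.moments()` code; the function names and the `shift` parameter (subtracted before squaring to reduce cancellation error) are assumptions made for the example.

```python
def sufficient_statistics(values, shift=0.0):
    """Return (count, sum, sum of squares) of the values after subtracting shift."""
    shifted = [v - shift for v in values]
    return len(shifted), sum(shifted), sum(d * d for d in shifted)


def merge_statistics(a, b):
    """Combine statistics from two batches (or devices) that used the same shift."""
    return tuple(x + y for x, y in zip(a, b))


def moments_from_statistics(stats, shift=0.0):
    """Recover the mean and population variance from accumulated statistics."""
    count, total, total_sq = stats
    shifted_mean = total / count
    variance = total_sq / count - shifted_mean * shifted_mean
    return shifted_mean + shift, variance
```

Only the three accumulated numbers need to be kept between batches, so the inputs themselves can be discarded once their statistics have been merged.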

Execute TODO to explain graph-consumer usage of · 2cc5ed87
Committed by Josh Levenberg on Feb 24, 2016
```
RemoveNewDefaultAttrsFromGraphDef().
Change: 115506523
```
Switch sdca_ops to use tf.load_library mechanism. · 8041c546
Committed by A. Unique TensorFlower on Feb 24, 2016
```
Change: 115505008
```

25 Feb, 2016 · 6 commits
- Avoid using initialization lists since the version of nvcc shipped with Tegra · 223794ee
  Committed by Benoit Steiner on Feb 24, 2016
```
X1 crashes when attempting to compile them
Change: 115500414
```
- Surface control_flow_ops.case to public. Update docs. Add unit tests. · 2861cc1d
  Committed by Eugene Brevdo on Feb 24, 2016
```
Change: 115496194
```
- Fix build issue with safety fix to gather and scatter · 49760690
  Committed by Geoffrey Irving on Feb 24, 2016
```
Change: 115495726
```
- Temporarily disable sdca_ops_test - it breaks the opensource build. · 746ccc84
  Committed by Eugene Brevdo on Feb 24, 2016
```
Change: 115494526
```
- Support leaving the offset (beta) parameter out in batch_normalization, in... · 4afef14f
  Committed by A. Unique TensorFlower on Feb 24, 2016
```
Support leaving the offset (beta) parameter out in batch_normalization, in which case no offset will be added after normalization.
Change: 115489328
```
- removing repeated hostcast lines · 87a28910
  Committed by A. Unique TensorFlower on Feb 24, 2016
```
Change: 115472914
```
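
The optional-offset behaviour described in the batch_normalization commit above (4afef14f) can be sketched as follows. This is a scalar, pure-Python illustration under the assumption that a missing offset simply skips the final addition; the function below is a hypothetical stand-in and does not reproduce the TensorFlow signature.

```python
import math


def batch_normalization(x, mean, variance, offset=None, scale=None, epsilon=1e-3):
    """Normalize a list of floats; offset (beta) and scale (gamma) are optional."""
    inv = 1.0 / math.sqrt(variance + epsilon)
    if scale is not None:
        inv *= scale
    out = [(v - mean) * inv for v in x]
    # Assumption: when offset is None, nothing is added after normalization.
    if offset is not None:
        out = [v + offset for v in out]
    return out
```

Treating `None` as "skip the step" avoids allocating and adding a tensor of zeros when no offset is wanted.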

qq_38905368 / tensorflow is up to date with its fork source project
