- 30 Mar, 2016 (9 commits)
-
-
Committed by Josh Levenberg
Also fix some warnings about unsafe conversions. Change: 118519927
-
Committed by A. Unique TensorFlower
Add a test that we actually check for array length (we were not). Change: 118518452
-
Committed by A. Unique TensorFlower
// OLD
Benchmark                             Time(ns)    CPU(ns)  Iterations
--------------------------------------------------------------------
BM_ConvFloatDepthwiseBkInCPU1_conv0  207770233  207338129  100  796.0M items/s  32_112_112_3_8_24_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv1  715403538  713939287  100  616.4M items/s  32_112_112_64_1_64_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv2  357349749  356594057  100  617.0M items/s  32_56_56_128_1_128_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv3  274697435  274160117  100  802.7M items/s  32_56_56_128_1_128_3_3_2_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv4   87072020   86874244  100  633.1M items/s  32_28_28_128_1_128_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv5   87172482   86948501  100  632.4M items/s  32_14_14_512_1_512_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv6   46763611   46620163  100  589.4M items/s  32_7_7_1024_1_1024_3_3_1_2_cpu1

// NEW 1-thread
Benchmark                             Time(ns)    CPU(ns)  Iterations
--------------------------------------------------------------------
BM_ConvFloatDepthwiseBkInCPU1_conv0   60173061   59839526  100    2.7G items/s  32_112_112_3_8_24_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv1   99396102   99143542  100    4.3G items/s  32_112_112_64_1_64_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv2   39376616   39226953  100    5.5G items/s  32_56_56_128_1_128_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv3   35987577   35843443  100    6.0G items/s  32_56_56_128_1_128_3_3_2_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv4    9665813    9600518  100    5.6G items/s  32_28_28_128_1_128_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv5   12498989   12427035  100    4.3G items/s  32_14_14_512_1_512_3_3_1_2_cpu1
BM_ConvFloatDepthwiseBkInCPU1_conv6    8459759    8397047  100    3.2G items/s  32_7_7_1024_1_1024_3_3_1_2_cpu1

// NEW 4-threads
Benchmark                             Time(ns)    CPU(ns)  Iterations
--------------------------------------------------------------------
BM_ConvFloatDepthwiseBkInCPU4_conv0   30696635  101663830  100    5.3G items/s  32_112_112_3_8_24_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkInCPU4_conv1   68884630  198616710  100    6.3G items/s  32_112_112_64_1_64_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkInCPU4_conv2   16948037   50360587  100   12.7G items/s  32_56_56_128_1_128_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkInCPU4_conv3   15834408   46873689  100   13.6G items/s  32_56_56_128_1_128_3_3_2_2_cpu4
BM_ConvFloatDepthwiseBkInCPU4_conv4    3904734   11659079  167   13.8G items/s  32_28_28_128_1_128_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkInCPU4_conv5    3482083   12555105  188   15.5G items/s  32_14_14_512_1_512_3_3_1_2_cpu4
BM_ConvFloatDepthwiseBkInCPU4_conv6    2330680    8593020  281   11.5G items/s  32_7_7_1024_1_1024_3_3_1_2_cpu4

Change: 118514706
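The benchmark figures in this commit message are easier to compare as speedup ratios. The sketch below is not part of the commit; it only divides the old wall-clock time by the new one for each benchmark, using the Time(ns) values quoted above. The dictionary and function names are illustrative.

```python
# Wall-clock Time(ns) values copied from the OLD, NEW 1-thread and
# NEW 4-threads benchmark listings in the commit message.
old_ns = {
    "conv0": 207770233, "conv1": 715403538, "conv2": 357349749,
    "conv3": 274697435, "conv4": 87072020, "conv5": 87172482,
    "conv6": 46763611,
}
new_1t_ns = {
    "conv0": 60173061, "conv1": 99396102, "conv2": 39376616,
    "conv3": 35987577, "conv4": 9665813, "conv5": 12498989,
    "conv6": 8459759,
}
new_4t_ns = {
    "conv0": 30696635, "conv1": 68884630, "conv2": 16948037,
    "conv3": 15834408, "conv4": 3904734, "conv5": 3482083,
    "conv6": 2330680,
}


def speedup(before_ns, after_ns):
    """Return how many times faster the new run is than the old one."""
    return before_ns / after_ns


for name in old_ns:
    one_thread = speedup(old_ns[name], new_1t_ns[name])
    four_threads = speedup(old_ns[name], new_4t_ns[name])
    print(f"{name}: {one_thread:.1f}x (1 thread), {four_threads:.1f}x (4 threads)")
```

Note that the 4-thread runs are compared against the old single-threaded baseline, so those ratios combine the kernel improvement with the parallel speedup.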
-
Committed by Vijay Vasudevan
instead of 48 (on my machine). Change: 118512022
-
Committed by Benoit Steiner
gradients by 1 to 10% depending on the size of the convolution kernel. Change: 118505660
-
Committed by Fangwei Li
Change: 118497433
-
Committed by A. Unique TensorFlower
  - Change the stride and in_stride template arguments of Eigen SpatialConvolution, SpatialConvolutionBackwardInput, and SpatialConvolutionBackwardKernel to row_stride, col_stride, row_in_stride, col_in_stride.
  - Change tensorflow kernels to pass the additional stride parameters.
  - Rationalize the place where we swap row/col: swap just before calling Eigen.
This just enables the plumbing. Non-square strides are still forbidden in the ops. Change: 118484322
-
Committed by A. Unique TensorFlower
Parse uploaded pbtxt in chunks to allow loading of large pbtxt files. This only affects files uploaded with the file chooser menu, not XHR requests. Change: 118479897
-
Committed by David G. Andersen
having the ops explicitly return a failure that they can't handle overly-large inputs). Most of these should never affect correct tf programs until people get a lot more memory in their machines. Change: 118476613
-
- 29 Mar, 2016 (31 commits)
-
-
Committed by A. Unique TensorFlower
float and int for all types T. (The assumption doesn't hold for Eigen::half.) These were all ops I could find in CPU implementation; probably, some remain for GPU (and some other problems remain before we can turn on Eigen::half for all of them). Change: 118464463
-
Committed by A. Unique TensorFlower
Adds a Timeline class which can convert a StepStats protobuf into a JSON-formatted trace. This trace can be loaded into any Chrome web browser via the chrome://tracing URL. Change: 118461709
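To illustrate the kind of output such a conversion produces, here is a minimal sketch that writes timing records in the Chrome trace event JSON format. It is not the Timeline class from this commit: the function name and the input record fields (`node_name`, `device`, `start_micros`, `duration_micros`) are assumptions made for the example, and real StepStats protobufs carry more information.

```python
import json


def to_chrome_trace(node_stats):
    """Convert a list of timing records into a Chrome trace JSON string.

    Each record is a dict with hypothetical keys: node_name, device,
    start_micros and duration_micros.
    """
    events = []
    device_pids = {}  # one trace "process" row per device
    for stat in node_stats:
        pid = device_pids.setdefault(stat["device"], len(device_pids))
        events.append({
            "name": stat["node_name"],
            "cat": "Op",
            "ph": "X",  # "complete" event: has both a start and a duration
            "ts": stat["start_micros"],
            "dur": stat["duration_micros"],
            "pid": pid,
            "tid": 0,
        })
    return json.dumps({"traceEvents": events}, indent=2)


example = [
    {"node_name": "MatMul", "device": "/cpu:0",
     "start_micros": 0, "duration_micros": 120},
    {"node_name": "Relu", "device": "/cpu:0",
     "start_micros": 130, "duration_micros": 15},
]
print(to_chrome_trace(example))
```

Saving the printed JSON to a file and loading it through chrome://tracing should show the two operations as bars on a single row.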
-
Committed by A. Unique TensorFlower
Change: 118445579
-
Committed by A. Unique TensorFlower
Change: 118445207
-
Committed by A. Unique TensorFlower
implicitly convertible from bool (like Eigen::half). Note: This does not enable half support for ReluOp in itself, although it is an important step of the way. Change: 118444029
-
Committed by A. Unique TensorFlower
Change: 118427688
-
Committed by Benoit Steiner
Change: 118414827
-
Committed by Benoit Steiner
Change: 118414762
-
Committed by A. Unique TensorFlower
Change: 118414301
-
Committed by Yuan Yu
Moved the higher order functions into functional_ops.py, to avoid some potential circular dependency. Change: 118412058
-
Committed by A. Unique TensorFlower
Change: 118410742
-
Committed by Sherry Moore
Change: 118409321
-
Committed by A. Unique TensorFlower
for Eigen. Change: 118408303
-
Committed by David G. Andersen
is_initialized is safe. Change: 118405243
-
Committed by Geoffrey Irving
This moves the "Other..." documentation of make_all from tf.image to tf.contrib.layers, since tf.contrib.layers has no __all__ and thus leaks all sorts of random things. Change: 118403996
-
Committed by Derek Murray
This change accounts for the fact that the `RequestCancelled` callback will still be called after a `RequestReceived(ok = false)` callback. In refactoring this code, this change also adds the ability for enqueued requests to enable cancellation selectively, which reduces the allocation count of completion queue tags for client-side cancellation that serve no useful purpose. Only methods that have a server-side implementation of cancellation should register a cancellation tag. Change: 118403144
-
Committed by A. Unique TensorFlower
Change: 118401805
-
Committed by A. Unique TensorFlower
Change: 118401706
-
Committed by Eugene Brevdo
Change: 118401267
-
Committed by A. Unique TensorFlower
Change: 118397198
-
Committed by A. Unique TensorFlower
collection list. The previous behavior was to return the list itself: If other items were later added to the collection the list returned initially would show them. Add get_collection_ref() to return the collection list itself. Change: 118396574
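The difference between returning a copy and returning the list itself can be shown with a toy stand-in. The class below is not TensorFlow's Graph implementation; it only mirrors the method names mentioned in the commit message and assumes that get_collection() now returns a copy, which is what the contrast with the previous behavior suggests.

```python
class ToyGraph:
    """Toy stand-in used to contrast copy and reference semantics."""

    def __init__(self):
        self._collections = {}

    def add_to_collection(self, name, value):
        self._collections.setdefault(name, []).append(value)

    def get_collection(self, name):
        # Return a copy: later additions do not show up in the result.
        return list(self._collections.get(name, []))

    def get_collection_ref(self, name):
        # Return the underlying list: later additions are visible.
        return self._collections.setdefault(name, [])


graph = ToyGraph()
graph.add_to_collection("variables", "w")

snapshot = graph.get_collection("variables")
live = graph.get_collection_ref("variables")

graph.add_to_collection("variables", "b")

print(snapshot)  # still holds only the item present when it was taken
print(live)      # also holds the item added afterwards
```

Callers that want a stable snapshot would use the copying accessor, while code that needs to mutate the collection in place would use the reference accessor.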
-
Committed by Josh Levenberg
Change: 118393590
-
Committed by Andrew Harp
Build android_tensorflow_lib_lite with -Os to reduce the size of libandroid_tensorflow_lib.so. Kernel code in android_tensorflow_lib is still built with -O2 for performance reasons. Change: 118389861
-
Committed by A. Unique TensorFlower
be built depending on these rules. Otherwise the ops and some other components are dropped because there are no link-time references to them. This change has no effect on .so files needed by normal Android NDK Java apps. Change: 118387009
-
Committed by A. Unique TensorFlower
instead of the filename. Change FindKernelDef to also return the class name, to help a tool that generates ops_to_register.h to find the set of class names. Change: 118381472
-
Committed by Jianmin Chen
Change: 118376780
-
Committed by Andrew Harp
Change: 118376252
-
Committed by Dan Smilkov
Change: 118375255
-
Committed by Dan Smilkov
Bring back fast_cpp_protos bazel configuration and remove 64MB limit of protobufs. This speeds up graph serialization by ~15x for users building TensorFlow from source. Note that you can now install a faster pip binary for protobuf using the instructions in https://github.com/tensorflow/tensorflow/commit/8ac009728db931ef3119a337bd23250c89bc7efe. This only affects building and running from within the bazel environment. Change: 118374862
-
Committed by Josh Levenberg
Change: 118369028
-
Committed by Eugene Brevdo
allows variable sized inputs and pads upon dequeue. Added unit tests. During testing, identified a small C++ bug where TensorMap slice fails if the slice shape has a zero length somewhere. For now, avoid this error by checking the slice's size and returning early if the slice is empty (this is the correct thing to do). Change: 118362981
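The idea of padding variable-sized inputs when a batch is dequeued can be sketched in plain Python. This is not the queue implementation from the commit; `pad_batch` and its `pad_value` parameter are hypothetical names, and the example works on flat lists rather than tensors. The early return for an empty batch loosely mirrors the empty-slice check described above.

```python
def pad_batch(sequences, pad_value=0):
    """Pad every sequence to the length of the longest one in the batch."""
    if not sequences:
        # Nothing to pad: return early instead of computing max() of nothing.
        return []
    max_len = max(len(seq) for seq in sequences)
    return [
        list(seq) + [pad_value] * (max_len - len(seq))
        for seq in sequences
    ]


batch = pad_batch([[1, 2, 3], [4], []])
print(batch)
```

Each element keeps its own length while queued and is only padded once the batch is assembled, so the padded width depends on the longest element in that particular batch.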
-