If you're a little confused about the buzzwords machine learning, deep learning, machine intelligence, and artificial intelligence (AI), here's a quick summary: machine intelligence and AI are really just the same thing; machine learning is a field of AI, and the most popular one; deep learning is one special type of machine learning, and is also the modern and most effective approach to solving complicated problems such as computer vision, speech recognition and synthesis, and natural language processing. So in this book, when we say AI, we primarily mean deep learning, the savior that took AI from the long winter to the summer. For more information about the AI winter and deep learning, you can check out [https://en.wikipedia.org/wiki/AI_winter](https://en.wikipedia.org/wiki/AI_winter) and [http://www.deeplearningbook.org](http://www.deeplearningbook.org).
TensorFlow can be installed on macOS, Ubuntu, or Windows. We'll cover the steps to install TensorFlow 1.4 from source on Mac OS X El Capitan (10.11.6), macOS Sierra (10.12.6), and Ubuntu 16.04. If you have a different OS or version, you can refer to the [TensorFlow Install documentation](https://www.tensorflow.org/install) for more information. By the time you read this book, a newer TensorFlow version will likely have come out. You should still be able to run the book's code with a newer version, but that's not guaranteed, which is why we use the TensorFlow 1.4 release source code to set up TensorFlow on Mac and Ubuntu; that way, you can easily test run and play with the apps in the book.
Since we wrote the paragraph above in December 2017, there have been four new official releases of TensorFlow (1.5, 1.6, 1.7, and 1.8), which you can download at [https://github.com/tensorflow/tensorflow/releases](https://github.com/tensorflow/tensorflow/releases) or from the TensorFlow source code repo ([https://github.com/tensorflow/tensorflow](https://github.com/tensorflow/tensorflow)), and a new version of Xcode (9.3) as of May 2018. Newer versions of TensorFlow, such as 1.8, support newer versions of NVIDIA CUDA and cuDNN by default (see the section *Setting up TensorFlow on GPU-powered Ubuntu* for details), so you should follow the official TensorFlow documentation to install the latest TensorFlow version with GPU support. In this and the following chapters, we may refer to a specific TensorFlow version as an example, but we will keep all iOS, Android, and Python code tested and, if needed, updated for the latest TensorFlow, Xcode, and Android Studio versions in the book's source code repo at [https://github.com/jeffxtang/mobiletfbook](https://github.com/jeffxtang/mobiletfbook).
...
@@ -49,7 +49,7 @@ Since we wrote the paragraph above in December 2017, there have been four new of
An alternative to setting up your own GPU-powered Ubuntu with TensorFlow is to use TensorFlow in a GPU-enabled cloud service such as Google Cloud Platform's Cloud ML Engine ([https://cloud.google.com/ml-engine/docs/using-gpus](https://cloud.google.com/ml-engine/docs/using-gpus)). Each option has pros and cons. Cloud services generally bill by the time you use them. If your goal is to train or retrain models to be deployed on mobile devices, meaning the models are not super complicated, and if you plan to do machine learning training for a long time, it'd be more cost-effective and satisfying to have your own GPU.
Follow these steps to install CUDA 8.0 and cuDNN 6.0 on Ubuntu 16.04 (you should be able to download and install CUDA 9.0 and cuDNN 7.0 in a similar way):
1. Find the NVIDIA CUDA 8.0 GA2 release on the [CUDA 8.0 GA2 download archive page](https://developer.nvidia.com/cuda-80-ga2-download-archive), and make the selections shown in the following screenshot:
![](img/0e74beec-5ac6-4755-8268-dcf83c27a700.png)Fig 1.1 Getting ready to download CUDA 8.0 on Ubuntu 16.04
If you're unfamiliar with CNN, check out the videos and notes of one of the best resources on it, the Stanford CS231n course *CNN for Visual Recognition* ([http://cs231n.stanford.edu](http://cs231n.stanford.edu)). Another good resource on CNN is Chapter 6 of Michael Nielsen's online book, *Neural Networks and Deep Learning*: [http://neuralnetworksanddeeplearning.com/chap6.html#introducing_convolutional_networks](http://neuralnetworksanddeeplearning.com/chap6.html#introducing_convolutional_networks).
@@ -39,7 +39,7 @@ Andrej Karpathy wrote a good introduction to RCNN, "Playing around with RCNN, St
If you’re really interested in deep learning research and want to know all the details of how each detector works so you can decide which one to use, you should definitely read the papers on each method and try to reproduce the training process on your own. It’ll be a long but rewarding road. But if you want to take Andrej Karpathy’s advice, “don’t be a hero” (search on YouTube for “deep learning for computer vision Andrej”), then you can “take whatever works best, download a pre-trained model, potentially add/delete some parts of it, and fine-tune it on your app,” which is also the approach we’ll use here.
@@ -51,7 +51,7 @@ If you’re really interested in deep learning research and want to know all the
The TensorFlow Object Detection API is well documented on its [official website](https://github.com/tensorflow/models/tree/master/research/object_detection), and you should definitely check out its "Quick Start: Jupyter notebook for off-the-shelf inference" guide for a quick introduction to using a good pre-trained model for detection in Python. But the documentation there is spread across many different pages and is sometimes hard to follow. In this section and the next, we'll streamline the official documentation by reorganizing the important details documented in many places and adding more examples and code explanations, and we'll provide two step-by-step tutorials on the following:
Although the results of the original neural style transfer algorithm were amazing, its performance was poor: training is part of the style-transferred image generation process, and it typically takes a few minutes on a GPU and about an hour on a CPU to generate a good result.
If you're interested in the details of the original algorithm, you can read the paper along with a well-documented Python implementation at [https://github.com/log0/neural-style-painting/blob/master/art.py](https://github.com/log0/neural-style-painting/blob/master/art.py). We won't discuss this original algorithm because it's not feasible to run on a mobile phone, but it's fun and instructive to try it to get a better understanding of how to use a pre-trained deep CNN model for different computer vision tasks.
RNN allows us to handle sequences of input and/or output, because the network, by design, has memory of previous items in an input sequence or can generate a sequence of output. This makes RNN more appropriate for speech recognition (where the input is a sequence of words uttered by users), image captioning (where the output is a natural language sentence consisting of a series of words), text generation, and time series prediction. If you're unfamiliar with RNN, you should definitely check out Andrej Karpathy's blog post, *The Unreasonable Effectiveness of Recurrent Neural Networks* ([http://karpathy.github.io/2015/05/21/rnn-effectiveness](http://karpathy.github.io/2015/05/21/rnn-effectiveness)). We'll also cover some detailed RNN models later in the book.
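To make the idea of "memory of previous items" concrete, here is a minimal sketch of a vanilla RNN cell reduced to scalars. It is not code from the book or from TensorFlow: the function names `rnn_step` and `run_rnn` and the weights `w_x`, `w_h`, and `b` are hypothetical, untrained values chosen only for illustration.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a scalar vanilla RNN: the new state mixes the input with the old state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the cell and return the hidden state after each step."""
    h = 0.0  # the hidden state starts empty
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# The final input is 0.0 in both sequences, yet the final states differ,
# because the hidden state still carries information about earlier inputs.
after_ones = run_rnn([1.0, 1.0, 0.0])[-1]
after_zeros = run_rnn([0.0, 0.0, 0.0])[-1]
print(after_ones, after_zeros)
```

Real RNN layers use weight matrices and vector states, and the weights are learned during training, but the recurrence has the same shape: each new state depends on the current input and the previous state.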
The speech commands dataset is collected from an Open Speech Recording site ([https://aiyprojects.withgoogle.com/open_speech_recording](https://aiyprojects.withgoogle.com/open_speech_recording)). You should give it a try and maybe contribute a few minutes of your own recordings to help it improve; doing so also gives you a sense of how you can collect your own speech commands dataset if needed. There's also a Kaggle competition ([https://www.kaggle.com/c/tensorflow-speech-recognition-challenge](https://www.kaggle.com/c/tensorflow-speech-recognition-challenge)) on using the dataset to build a model, where you can learn more about speech models and tips.
As a mobile developer, you probably don't need to understand DFT and FFT. But you'll better appreciate how all this model training works when used in mobile apps if you know that, behind the scenes of the TensorFlow simple speech commands model training we're about to cover, it's the use of FFT, one of the top 10 algorithms of the 20th century, that makes the CNN-based speech command recognition model training possible, among other things of course. For a fun and intuitive tutorial on DFT, you can read this article: [http://practicalcryptography.com/miscellaneous/machine-learning/intuitive-guide-discrete-fourier-transform](http://practicalcryptography.com/miscellaneous/machine-learning/intuitive-guide-discrete-fourier-transform).
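For a rough feel of what the DFT does to an audio signal, the following is a minimal sketch in plain Python. It is a naive O(n²) DFT, not the FFT and not the code TensorFlow uses for its speech features; the helper names `dft` and `dominant_frequency` and the 50 Hz test tone are assumptions made up for this illustration. An FFT computes the same spectrum, only much faster.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a list of real samples."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dominant_frequency(samples, sample_rate):
    """Return the frequency in Hz of the strongest bin below the Nyquist limit."""
    spectrum = dft(samples)
    half = len(samples) // 2
    magnitudes = [abs(c) for c in spectrum[:half]]
    peak_bin = max(range(half), key=lambda k: magnitudes[k])
    return peak_bin * sample_rate / len(samples)


# Hypothetical test signal: a 50 Hz sine wave sampled at 400 Hz, 64 samples long.
SAMPLE_RATE = 400
tone = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE) for t in range(64)]
print(dominant_frequency(tone, SAMPLE_RATE))  # prints 50.0
```

Applying a transform like this to short, overlapping windows of a recording gives a spectrogram, a time-by-frequency image of the sound, which is the kind of input a CNN can then treat much like a picture.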
@@ -772,7 +772,7 @@ func audioRecorderDidFinishRecording(_ recorder: AVAudioRecorder, successfully f
}
}
```
```
If you really want to port as much code as possible to Swift, you can replace the audio file conversion code in C with Swift (see [Extended Audio File Services](https://developer.apple.com/documentation/audiotoolbox/extended_audio_file_services) for details). There are also some unofficial open source projects that provide Swift wrappers for the official TensorFlow C++ API. But for simplicity and to strike the right balance, we'll keep the TensorFlow model inference, and in this example also the audio file reading and conversion, in C++ and Objective-C, working together with the Swift code that controls the UI and audio recording and initiates the call to do the audio processing and recognition.
That's all it takes to build a Swift iOS app that uses the speech commands recognition model. Now you can run it on an iOS simulator or an actual device and see exactly the same results as with the Objective-C version.