![](https://media.licdn.com/dms/image/C4D12AQG6uImV4n37jg/article-inline_image-shrink_1000_1488/0?e=1560384000&v=beta&t=YaOaA1VZ0RZelhM9k4TiJtzfevgcnqAhiIWy0zLPDUc)
Graphic taken from Data Science Central
The picture is accurate, but the more relevant question is “When would each technique be at an advantage?”  
The obvious difference, correctly depicted, is that the deep neural network estimates many more parameters, and even more combinations of parameters, than the logistic regression. The real question, then, is: in what situations is that a good idea?
You need a good ratio of data points to parameters to get reliable estimates, so the first criterion is lots of data with which to estimate lots of parameters. If that isn't the case, you end up estimating many parameters with little data per parameter and getting a raft of spurious results. Depending on the situation, then, the additional granularity of the deep neural network is either a treasure trove of additional detail and value or an error-prone, misleading representation of the situation.
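For a sense of scale, here is a rough, back-of-the-envelope parameter count in Python. The 20-feature input and the 64-unit hidden layers are illustrative assumptions, not figures taken from the graphic above.

```python
# Rough parameter counts for the two kinds of model (sizes are illustrative).
n_features = 20

# Logistic regression: one weight per feature plus an intercept.
logreg_params = n_features + 1

# A small deep network: 20 -> 64 -> 64 -> 1, fully connected with biases.
layer_sizes = [n_features, 64, 64, 1]
dnn_params = sum(
    in_size * out_size + out_size
    for in_size, out_size in zip(layer_sizes[:-1], layer_sizes[1:])
)

print(f"logistic regression: {logreg_params} parameters")  # 21
print(f"deep network:        {dnn_params} parameters")     # 5569
```

Even this toy network needs to estimate a couple of hundred times as many parameters as the logistic regression, which is why the amount of available data matters so much.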
The second key difference is the need to understand why the prediction works, or the need to restrict the equation from using certain data in specific ways. We've all heard the example of drownings and ice cream sales being correlated because more people both swim (and drown) in the summer and also eat more ice cream in the summer. Ice cream sales might help indicate when people will drown, but they won't indicate why people are drowning. Needing to know "why" means it's important to restrict how the data is used and to ensure logical inference. The more convoluted the formula, and the less involved the analyst, the less you'll be able to understand what caused what, why a prediction works, and when it might stop working. Understanding the "why" is better suited to more parsimonious techniques with the careful involvement of the analyst.
On the other hand, sometimes the "why" is not as important as simply "what is". A groundbreaking application of deep neural networks is machine vision: the correct classification of pictures and the translation of video into analyzable data. Pictures and video carry enormous amounts of information and detail, so much that it's very difficult to use it all effectively without heavy automation. That is a perfect fit for a deep neural network.
Both techniques, as well as their many cousins, offer tremendous opportunities to add value when applied to the problems they're best suited for; conversely, as with any technique, they can also cause problems when naively applied where they don't fit.
If you liked this discussion, I’d appreciate you sharing it or clicking the “like” button. Your vote of approval helps spread the word and is always appreciated and useful in prioritizing further content.
**David Young has worked in Marketing Analytics 20+ years and lives in Vienna, VA**
If you enjoyed this you might enjoy my book:
Book preview and always in stock at the Publisher: [https://store.bookbaby.com/bookshop/book/index.aspx?bookURL=A-Short-Guide-to-Marketing-Model-Alignment-and-Design](https://store.bookbaby.com/bookshop/book/index.aspx?bookURL=A-Short-Guide-to-Marketing-Model-Alignment-and-Design)
Also at Amazon: [https://www.amazon.com/Short-Guide-Marketing-Alignment-Design/dp/1543912591/ref=sr_1_1?ie=UTF8&qid=1510196791&sr=8-1&keywords=a+short+guide+to+marketing+model+alignment+%26+design](https://www.amazon.com/Short-Guide-Marketing-Alignment-Design/dp/1543912591/ref=sr_1_1?ie=UTF8&qid=1510196791&sr=8-1&keywords=a+short+guide+to+marketing+model+alignment+&+design)
Training a deep neural network is a tedious process. More practical approaches include re-using a trained network for another task and using the same network for a number of tasks. In this article we discuss two such approaches: transfer learning and multi-task learning.
![](https://media.licdn.com/dms/image/C4E12AQGvZRDDAEqp0A/article-inline_image-shrink_1000_1488/0?e=1560384000&v=beta&t=1PtMt1M731GEhMYqns2oTfPnH8MBr4dv9AmEv7cKJz0)
In transfer learning, we would like to leverage the knowledge learned by a **source** task to help learn another **target** task. For example, a well-trained, rich image classification network could be leveraged for another image-related target task. As another example, the knowledge learned by a network trained in a simulated environment can be transferred to a network for the real environment. There are two basic scenarios for transfer learning with neural networks: **Feature Extraction** and **Fine Tuning**. A well-known example of transfer learning is to load the already trained, large-scale [VGG](https://arxiv.org/abs/1409.1556) classification network, which can classify images into one of 1000 classes, and use it for another task such as the classification of specialized medical images.
**1) Feature Extraction:**
In feature extraction, a network pre-trained on a source task is used as a feature extractor for another target task by adding a simple classifier on top of the pre-trained network. Only the parameters of the added classifier are updated, while the pre-trained network parameters are frozen. This allows the new task to benefit from features learned on the source task, although those features remain specialized for the source task.
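As a concrete illustration, here is a minimal feature-extraction sketch in PyTorch using the torchvision VGG16 model mentioned above. The 5-class target task (e.g. a medical-image problem) and the choice of replacing only the final classifier layer are assumptions made for illustration.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Feature extraction: freeze the pre-trained network, train only a new head.
model = models.vgg16(pretrained=True)  # pre-trained on ImageNet (1000 classes)

# Freeze every pre-trained parameter.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classifier layer for the new target task
# (a hypothetical 5-class medical-image problem).
num_target_classes = 5
model.classifier[6] = nn.Linear(model.classifier[6].in_features, num_target_classes)

# Only the new layer's parameters are handed to the optimizer.
optimizer = optim.Adam(model.classifier[6].parameters(), lr=1e-3)
```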
**2) Fine Tuning:**
Fine tuning allows the pre-trained network parameters to be modified in order to learn the target task. Usually, a new randomly initialized layer is added on top of the pre-trained network. The pre-trained parameters are updated, but with a smaller learning rate to prevent major changes. It is common to freeze the parameters of the bottom layers (the more generic layers) and fine-tune only some of the top layers (the more specific layers). Moreover, freezing some layers reduces the number of trainable parameters, which helps mitigate overfitting, especially when the available data for the target task is not large. In practice, fine tuning usually outperforms feature extraction because it optimizes the pre-trained network for the new task.
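A minimal fine-tuning sketch, again assuming VGG16 and a hypothetical 5-class target task; which layers are frozen and the exact learning rates are arbitrary illustrative choices, not prescriptions.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Fine tuning: keep the generic bottom layers frozen, adapt the rest.
model = models.vgg16(pretrained=True)

# Freeze the earlier (more generic) convolutional blocks.
for param in model.features[:17].parameters():
    param.requires_grad = False

# New randomly initialized head for the target task (5 classes, assumed).
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 5)

# Fine-tune the remaining pre-trained layers with a small learning rate,
# while the classifier (including the fresh head) uses a larger one.
optimizer = optim.SGD(
    [
        {"params": model.features[17:].parameters(), "lr": 1e-4},
        {"params": model.classifier.parameters(), "lr": 1e-3},
    ],
    momentum=0.9,
)
```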
**Transfer Learning Basic Scenarios:**
Basically, there are four scenarios for transfer learning, depending on two main factors: 1) the size of the target-task dataset, and 2) the similarity between the source and target tasks:
* **Case 1**: The target dataset is small and the target task is similar to the source task: use feature extraction, because the target dataset is small and full training could cause the model to overfit.
* **Case 2**: The target dataset is small and the target task is different from the source task: fine-tune the bottom, generic layers and remove the higher, specific layers. In other words, use feature extraction from the early stages.
* **Case 3**: The target dataset is large and the target task is similar to the source task: with this much data we could train a network from scratch with randomly initialized parameters, but it is better to initialize from the pre-trained model and fine-tune a few layers.
* **Case 4**: The target dataset is large and the target task is different from the source task: fine-tune a large number of layers, or even the entire network.
The main goal of multi-task learning is to improve the performance of a number of tasks simultaneously by optimizing all network parameters using samples from those tasks. For example, we might want one network that can classify an input face image as male or female and, at the same time, predict the person's age. Here we have two related tasks: one is a binary classification task and the other is a regression task. The two tasks are clearly related, and learning one should enhance learning the other.
![](https://media.licdn.com/dms/image/C4E12AQE68ppEyvrIgw/article-inline_image-shrink_1500_2232/0?e=1560384000&v=beta&t=HrwUV3GEJOisyvjQ-EMALf90g2-b_iSeEgiZNdZjnWg)
An example of a simple network design has a part shared between the tasks and task-specific heads. The shared part learns intermediate representations common to the tasks, which helps them be learned jointly. The task-specific heads, in turn, learn how to use the shared representations for each individual task.
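A minimal PyTorch sketch of this design for the face example above; the layer sizes, the equal 1:1 weighting of the two losses, and the random dummy data are illustrative assumptions rather than a recommended architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceMultiTaskNet(nn.Module):
    """Shared trunk plus two task-specific heads: gender (classification) and age (regression)."""

    def __init__(self):
        super().__init__()
        # Shared part: learns representations common to both tasks.
        self.shared = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 64), nn.ReLU(),
        )
        # Task-specific heads.
        self.gender_head = nn.Linear(64, 1)  # binary classification (outputs a logit)
        self.age_head = nn.Linear(64, 1)     # regression (outputs an age estimate)

    def forward(self, x):
        h = self.shared(x)
        return self.gender_head(h), self.age_head(h)

model = FaceMultiTaskNet()

# Dummy batch of 8 RGB face crops (random data, purely for illustration).
images = torch.randn(8, 3, 64, 64)
gender_target = torch.randint(0, 2, (8, 1)).float()
age_target = torch.rand(8, 1) * 80

gender_logit, age_pred = model(images)

# Joint loss over both tasks; the equal weighting is an arbitrary choice.
loss = (F.binary_cross_entropy_with_logits(gender_logit, gender_target)
        + F.mse_loss(age_pred, age_target))
loss.backward()  # gradients from both tasks flow into the shared trunk
```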
 
> Transfer Learning and Multitask Learning are two vital approaches for Deep Learning.

Regards