Commit 6d5c0933, authored by wizardforcel

2021-01-16 21:36:45

Parent 95bea221
......@@ -6,7 +6,7 @@
### Solution
1. Import the required libraries, including pandas, for importing a CSV file:
1. Import the required libraries, including pandas, for importing a CSV file.
```py
import pandas as pd
......@@ -15,39 +15,39 @@
import matplotlib.pyplot as plt
```
2. Read the CSV file containing the dataset:
2. Read the CSV file containing the dataset.
```py
data = pd.read_csv("SomervilleHappinessSurvey2015.csv")
```
3. Separate the input features from the target. Note that the target is located in the first column of the CSV file. Convert the values into tensors, making sure the values are converted into floats:
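3. Separate the input features from the target. Note that the target is located in the first column of the CSV file. Convert the values into tensors, making sure the values are converted into floats.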
```py
x = torch.tensor(data.iloc[:,1:].values).float()
y = torch.tensor(data.iloc[:,:1].values).float()
```
4. Define the architecture of the model and store it in a variable named **model**. Remember to create a single-layer model:
4. Define the architecture of the model and store it in a variable named `model`. Remember to create a single-layer model.
```py
model = nn.Sequential(nn.Linear(6, 1),
                      nn.Sigmoid())
```
5. Define the loss function to be used. Use the MSE loss function:
5. Define the loss function to be used. Use the MSE loss function.
```py
loss_function = torch.nn.MSELoss()
```
6. Define the optimizer of your model. Use the Adam optimizer and a learning rate of `0.01`:
6. Define the optimizer of your model. Use the Adam optimizer and a learning rate of `0.01`.
```py
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
```
7. Run the optimization for 100 iterations. Every 10 iterations, print and save the loss value:
7. Run the optimization for 100 iterations. Every 10 iterations, print and save the loss value.
```py
losses = []
......@@ -64,7 +64,7 @@
The final loss should be approximately `0.24`.
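Because the diff elides the body of the training loop, the following is only a minimal sketch of what such a loop might look like. It assumes a synthetic stand-in for the survey data (random features and a made-up binary target), so it will not reproduce the `0.24` figure. It records the loss at every step, since the plot in step 8 expects 100 values, and prints it every 10 iterations.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for the survey data (assumption): 100 rows, 6 features,
# and a binary target derived from the feature sum.
x = torch.rand(100, 6)
y = (x.sum(dim=1, keepdim=True) > 3).float()

# Same single-layer architecture, loss, and optimizer as in steps 4 to 6.
model = nn.Sequential(nn.Linear(6, 1), nn.Sigmoid())
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

losses = []
for i in range(100):
    y_pred = model(x)                # forward pass
    loss = loss_function(y_pred, y)  # mean squared error against the target
    losses.append(loss.item())       # keep one value per iteration for plotting

    optimizer.zero_grad()            # clear gradients from the previous step
    loss.backward()                  # backpropagate
    optimizer.step()                 # update the weights

    if i % 10 == 0:
        print(i, loss.item())
```

Storing `loss.item()` rather than the tensor keeps `losses` a plain list of floats, which `plt.plot` can consume directly.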
8. Make a line plot to display the loss value for each iteration step:
8. Make a line plot to display the loss value for each iteration step.
```py
plt.plot(range(0,100), losses)
......@@ -91,13 +91,13 @@
### Solution
1. Import the required libraries:
1. Import the required libraries.
```py
import pandas as pd
```
2. Using pandas, load the **.csv** file:
2. Using pandas, load the `.csv` file.
```py
data = pd.read_csv("YearPredictionMSD.csv", nrows=50000)
......@@ -114,7 +114,7 @@
Figure 2.33: YearPredictionMSD.csv
3. Verify whether any qualitative data is present in the dataset:
3. Verify whether any qualitative data is present in the dataset.
```py
cols = data.columns
......@@ -124,7 +124,7 @@
The output should be an empty list, meaning that there are no qualitative features.
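The rest of this code block is elided in the diff, so here is one possible way to perform the check, offered as a sketch rather than the original code. It uses a small hypothetical DataFrame (the column names are invented for illustration) and lists every column whose dtype is not numeric.

```python
import pandas as pd

# Hypothetical toy frame (assumption): two numeric columns and one text column.
data = pd.DataFrame({
    "year": [2001, 1999],
    "timbre": [0.5, 1.2],
    "genre": ["rock", "jazz"],
})

cols = data.columns
# Columns pandas recognizes as numeric (integers and floats).
numeric_cols = data.select_dtypes(include="number").columns
# Whatever is left is qualitative (non-numeric) data.
qualitative_cols = [c for c in cols if c not in numeric_cols]
print(qualitative_cols)
```

On a purely numeric dataset the resulting list is empty.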
4. Check for missing values.
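4. Check for missing values.
If you add an additional `sum()` function to the line of code previously used for this purpose, you get the total number of missing values in the entire dataset, without a per-column breakdown: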
......@@ -134,7 +134,7 @@
The output should be `0`, meaning that none of the features contain missing values.
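As a small illustration of the difference between the per-column count and the dataset-wide total, the sketch below uses a hypothetical DataFrame with two deliberately missing entries. The data is an assumption made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame (assumption) with one missing value in each column.
data = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})

per_column = data.isnull().sum()   # missing values counted per column
total = data.isnull().sum().sum()  # second sum() collapses the counts into one number
print(per_column)
print(total)
```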
5. Check for outliers:
5. Check for outliers.
```py
outliers = {}
......@@ -154,14 +154,14 @@
The output dictionary should show that none of the features contain outliers that represent more than 5% of the data.
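The body of the outlier check is elided in the diff. The sketch below shows one common way to build such a dictionary, assuming that an outlier is any value more than three standard deviations from its column mean. Both that threshold and the synthetic data are assumptions, not taken from the source.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in (assumption): 1,000 rows of normally distributed features.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["f0", "f1", "f2"])

# Per-column thresholds at mean +/- 3 standard deviations (assumed rule).
lower = data.mean() - 3 * data.std()
upper = data.mean() + 3 * data.std()

# Comparing a DataFrame with a Series aligns the Series index with the columns.
is_outlier = (data < lower) | (data > upper)

# Fraction of outlying rows per feature, as a plain dictionary.
outliers = is_outlier.mean().round(3).to_dict()
print(outliers)
```

Any feature whose fraction exceeds `0.05` would then warrant a closer look.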
6. Separate the features from the target data:
6. Separate the features from the target data.
```py
X = data.iloc[:, 1:]
Y = data.iloc[:, 0]
```
7. Rescale the features data using the standardization methodology:
7. Rescale the features data using the standardization methodology.
```py
X = (X - X.mean())/X.std()
......@@ -174,7 +174,7 @@
Figure 2.34: Rescaled features data
8. Split the data into three sets: training, validation, and test. Use the approach of your preference:
8. Split the data into three sets: training, validation, and test. Use the approach of your preference.
```py
from sklearn.model_selection import train_test_split
......@@ -193,7 +193,7 @@
                                  random_state=0)
```
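Since the step leaves the splitting approach open, here is an alternative sketch that does not use scikit-learn: it shuffles the row indices with NumPy and cuts them at fixed positions. The 60:20:20 ratio, the toy data, and the `x_dev`/`y_dev` names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Toy features and target (assumption): 10 rows, 2 feature columns.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(10, 2)), columns=["f0", "f1"])
Y = pd.Series(np.arange(10), name="year")

# Shuffle the row positions once, then cut them into three consecutive chunks.
idx = rng.permutation(len(X))
n_train = len(X) * 6 // 10  # 60% for training (assumed ratio)
n_dev = len(X) * 2 // 10    # 20% for validation; the remainder is the test set
train_idx, dev_idx, test_idx = np.split(idx, [n_train, n_train + n_dev])

x_train, y_train = X.iloc[train_idx], Y.iloc[train_idx]
x_dev, y_dev = X.iloc[dev_idx], Y.iloc[dev_idx]
x_test, y_test = X.iloc[test_idx], Y.iloc[test_idx]
print(x_train.shape, x_dev.shape, x_test.shape)
```

Shuffling the indices rather than the data keeps the features and the target aligned without any extra bookkeeping.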
9. Print the resulting shapes as follows:
9. Print the resulting shapes as follows.
```py
print(x_train.shape, y_train.shape)
......@@ -219,13 +219,13 @@
### Solution
1. Import the required libraries:
1. Import the required libraries.
import torch
import torch.nn as nn
2. Split the features from the targets for all three sets of data that we created in the previous activity. Convert the DataFrames into tensors:
2. Split the features from the targets for all three sets of data that we created in the previous activity. Convert the DataFrames into tensors.
x_train = torch.tensor(x_train.values).float()
......@@ -239,7 +239,7 @@
y_test = torch.tensor(y_test.values).float()
3. Define the architecture of the network. Feel free to try different combinations for the number of layers and the number of units per layer:
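3. Define the architecture of the network. Feel free to try different combinations for the number of layers and the number of units per layer.
model = nn.Sequential(nn.Linear(x_train.shape[1], 10), \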
......@@ -255,13 +255,13 @@
nn.Linear(5,1))
4. Define the loss function and the optimizer algorithm:
4. Define the loss function and the optimizer algorithm.
loss_function = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
5. Use a **for** loop to train the network for 3,000 iteration steps:
5. Use a `for` loop to train the network for 3,000 iteration steps.
for i in range(3000):
......@@ -279,7 +279,7 @@
print(i, loss.item())
6. Test your model by performing a prediction on the first instance of the test set and comparing it with the ground truth:
6. Test your model by performing a prediction on the first instance of the test set and comparing it with the ground truth.
pred = model(x_test[0])
......@@ -303,7 +303,7 @@
Solution:
1. Import the following libraries:
1. 导入以下库:
import pandas as pd
......@@ -325,7 +325,7 @@
torch.manual_seed(0)
2. Read the previously prepared dataset, which should have been named **dccc_prepared.csv**:
2. Read the previously prepared dataset, which should have been named `dccc_prepared.csv`.
data = pd.read_csv("dccc_prepared.csv")
......@@ -337,13 +337,13 @@
Figure 3.14: dccc_prepared.csv
3. Separate the features from the target:
3. Separate the features from the target.
X = data.iloc[:, :-1]
y = data["default payment next month"]
4. Using scikit-learn's **train_test_split** function, split the dataset into training, validation, and testing sets. Use a 60:20:20 split ratio. Set **random_state** to 0:
4. Using scikit-learn's `train_test_split` function, split the dataset into training, validation, and testing sets. Use a 60:20:20 split ratio. Set `random_state` to 0.
X_new, X_test, \
......@@ -377,7 +377,7 @@
Testing sets: (9346, 22) (9346,)
5. Convert the validation and testing sets into tensors, bearing in mind that the features' matrices should be of the float type, while the target matrices should not. Leave the training sets unconverted for the moment as they will undergo further transformation:
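5. Convert the validation and testing sets into tensors, bearing in mind that the feature matrices should be of the `float` type, while the target matrices should not. Leave the training sets unconverted for the moment, as they will undergo further transformation.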
X_dev_torch = torch.tensor(X_dev.values).float()
......@@ -967,7 +967,7 @@
### Solution
1. Import the required libraries:
1. Import the required libraries.
import numpy as np
......@@ -1633,7 +1633,7 @@
### Solution
1. Import the required libraries:
1. Import the required libraries.
import numpy as np
......@@ -2595,7 +2595,7 @@
### Solution
1. Import the required libraries:
1. Import the required libraries.
import pandas as pd
......