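Preface
Using the CIFAR-10 dataset and Keras, this post practices building a simple CNN and doing transfer learning with VGG16 as the baseline.
(Also known as: the holidays are here, so I am tidying up a past project and reviewing the course.)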
I. The Chinese-language Keras documentation; original link:
https://keras-cn.readthedocs.io/en/latest/getting_started/sequential_model/
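II. Workflow for building a model in Keras
Create the model with Sequential() ----> stack layers with .add() ----> compile the model with .compile() ----> iterate over the training data with .fit() ----> evaluate with .evaluate() ----> predict on new data with .predict()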
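The workflow above can be sketched end to end on a tiny synthetic problem. This is only an illustration under assumptions: the random data, the layer sizes, the optimizer, and the loss are placeholders chosen here, not the settings used later in the experiment, and the import assumes the Keras that ships with TensorFlow.

```python
import numpy as np
from tensorflow import keras

# Synthetic stand-in data (assumption): 64 samples, 8 features, 3 classes.
rng = np.random.default_rng(0)
x_train = rng.random((64, 8)).astype("float32")
y_train = rng.integers(0, 3, size=64)

# 1) Create the model, 2) stack layers with .add()
model = keras.Sequential()
model.add(keras.Input(shape=(8,)))
model.add(keras.layers.Dense(16, activation="relu"))
model.add(keras.layers.Dense(3, activation="softmax"))

# 3) Compile: choose optimizer, loss, and metrics
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# 4) Iterate over the training data
model.fit(x_train, y_train, epochs=1, batch_size=16, verbose=0)

# 5) Evaluate (here on the training data, only to show the call)
loss, accuracy = model.evaluate(x_train, y_train, verbose=0)

# 6) Predict class probabilities for new samples
probs = model.predict(x_train[:5], verbose=0)
print(probs.shape)
```

In a real run, evaluation and prediction would use held-out data rather than the training set.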
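III. Common layers
The common layers live in the core module, which defines a set of frequently used network layers, including the fully connected layer and the activation layer.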
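Dense layer (fully connected layer):
It computes output = activation(dot(input, kernel) + bias), where activation is an element-wise activation function, kernel is the layer's weight matrix, and bias is a bias vector that is added only when use_bias=True.
If the input has more than 2 dimensions, the linked documentation says it is first flattened to match the kernel. Current Keras versions instead apply the kernel along the last axis of the input without flattening, so an input of shape (batch, d0, d1) with a kernel of shape (d1, units) produces an output of shape (batch, d0, units).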
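The Dense formula can be checked by hand with a few lines of NumPy. This is a sketch of the arithmetic only, not Keras's implementation; the helper names `dense_forward` and `relu` and the example numbers are made up for illustration.

```python
import numpy as np

def relu(z):
    """Element-wise ReLU activation."""
    return np.maximum(z, 0.0)

def dense_forward(inputs, kernel, bias=None, activation=None):
    """output = activation(dot(input, kernel) + bias), applied on the last axis."""
    outputs = inputs @ kernel
    if bias is not None:          # corresponds to use_bias=True
        outputs = outputs + bias
    if activation is not None:
        outputs = activation(outputs)
    return outputs

x = np.array([[1.0, -2.0, 3.0]])      # batch of 1 sample, 3 features
kernel = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])       # 3 inputs -> 2 units
bias = np.array([0.5, -2.0])

y = dense_forward(x, kernel, bias, relu)
print(y)  # [[4.5 0. ]]

# A rank-3 input is handled along its last axis: (2, 5, 3) -> (2, 5, 2)
y3 = dense_forward(np.ones((2, 5, 3)), kernel)
print(y3.shape)
```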
Dropout layer:
During training, Dropout randomly disconnects input neurons with a given probability (rate) at each parameter update. It is used to reduce overfitting.
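A minimal NumPy sketch of the idea follows. It assumes the common "inverted dropout" formulation, in which the surviving activations are scaled by 1 / (1 - rate) during training so that no rescaling is needed at inference time; the function name `dropout_forward` is hypothetical and this is not Keras's own code.

```python
import numpy as np

def dropout_forward(inputs, rate, training, rng):
    """Zero each element with probability `rate` during training only."""
    if not training or rate == 0.0:
        return inputs                      # inference: identity
    keep_mask = rng.random(inputs.shape) >= rate
    # Scale the kept units so the expected activation stays the same.
    return inputs * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones((4, 5))

train_out = dropout_forward(x, rate=0.5, training=True, rng=rng)
infer_out = dropout_forward(x, rate=0.5, training=False, rng=rng)

print(train_out)   # entries are either 0.0 (dropped) or 2.0 (kept and rescaled)
print(infer_out)   # unchanged
```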
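Flatten layer:
Flattens the input, i.e. turns each multi-dimensional sample into a one-dimensional vector. It is commonly used in the transition from convolutional layers to fully connected layers. Flatten does not affect the batch size.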
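The effect of flattening on shapes can be shown with a NumPy reshape that keeps the batch axis and collapses everything else. The feature-map size (4, 4, 3) is an arbitrary example, and the reshape is a stand-in for the layer rather than the layer itself.

```python
import numpy as np

# A batch of 2 feature maps, each 4 x 4 with 3 channels.
feature_maps = np.arange(2 * 4 * 4 * 3, dtype=np.float32).reshape(2, 4, 4, 3)

# Keep axis 0 (the batch) and collapse the remaining axes into one.
flat = feature_maps.reshape(feature_maps.shape[0], -1)

print(feature_maps.shape)  # (2, 4, 4, 3)
print(flat.shape)          # (2, 48)
```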
Reshape layer: reshapes the input to a given target shape.
Permute layer: permutes the dimensions of the input according to a given pattern. It may be needed when connecting an RNN to a CNN.
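As a rough illustration, the permutation can be imitated with `np.transpose`. The sketch assumes the Keras convention that the pattern is 1-indexed and excludes the batch axis, so a pattern of (2, 1) swaps the first and second non-batch axes; the helper `permute` is a hypothetical name, not a Keras function.

```python
import numpy as np

def permute(inputs, dims):
    """Reorder the non-batch axes; `dims` is 1-indexed and skips the batch axis."""
    axes = (0,) + tuple(dims)   # axis 0 (batch) always stays in place
    return np.transpose(inputs, axes)

# For example, 2 samples of a (timesteps=3, features=4) sequence.
x = np.arange(2 * 3 * 4).reshape(2, 3, 4)
out = permute(x, (2, 1))

print(x.shape)    # (2, 3, 4)
print(out.shape)  # (2, 4, 3)
```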
RepeatVector layer:
The RepeatVector layer repeats the input n times.
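In shape terms, repeating a 2D input of shape (batch, features) n times gives a 3D output of shape (batch, n, features). The NumPy sketch below mimics that behaviour; `repeat_vector` is a made-up helper name.

```python
import numpy as np

def repeat_vector(inputs, n):
    """Turn (batch, features) into (batch, n, features) by repeating each row."""
    return np.repeat(inputs[:, np.newaxis, :], n, axis=1)

x = np.array([[1.0, 2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0, 8.0]])   # shape (2, 4)
repeated = repeat_vector(x, 3)

print(repeated.shape)  # (2, 3, 4)
```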
IV. Experiment
Task:
(1) train and test a CNN model on GPU without transfer learning;
(2) train and test a CNN model on GPU with transfer learning.
Dataset:
CIFAR-10 dataset, which is available through the Keras datasets API that ships with TensorFlow (it is downloaded on first use).
https://keras.io/api/datasets/
The CIFAR-10 dataset consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
1. Helpful Functions for TensorFlow (Little Gems)
The following functions will be used with TensorFlow to help preprocess the data. They allow you to build the feature vector for a neural network.
Predictors/Inputs
Fill any missing inputs with the median for that column. Use missing_median.
Encode textual/categorical values with encode_text_dummy.
Encode numeric values with encode_numeric_zscore.
Output
Discard rows with missing outputs.
Encode textual/categorical values with encode_text_index.
Do not encode output numeric values.
Produce final feature vectors (x) and expected output (y) with to_xy.
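The recipe above can be walked through on a toy table. The sketch uses plain pandas calls as stand-ins for the helper functions named in the list, and the column names and values are invented for illustration. Two choices here are assumptions: rows with a missing output are dropped before the inputs are processed, and the labels are kept as integer indexes instead of being one-hot encoded.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],   # categorical input
    "size": [1.0, np.nan, 3.0, 5.0],              # numeric input with a gap
    "label": ["cat", "dog", None, "cat"],         # categorical output
})

# Output: discard rows with a missing output value.
df = df.dropna(subset=["label"]).reset_index(drop=True)

# Inputs: fill missing numeric values with the column median ...
df["size"] = df["size"].fillna(df["size"].median())
# ... then encode the numeric column as z-scores.
df["size"] = (df["size"] - df["size"].mean()) / df["size"].std()

# Inputs: encode the categorical column as dummy variables.
dummies = pd.get_dummies(df["color"], prefix="color", prefix_sep="-")
df = pd.concat([df.drop(columns="color"), dummies], axis=1)

# Output: encode the text labels as integer indexes.
codes, classes = pd.factorize(df["label"], sort=True)

# Final feature matrix x and expected output y.
x = df.drop(columns="label").astype(np.float32).to_numpy()
y = codes

print(x)
print(y, list(classes))
```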
from collections.abc import Sequence
from sklearn import preprocessing
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import shutil
import os
# Encode text values to dummy variables (i.e. [1,0,0],[0,1,0],[0,0,1] for red,green,blue)
def encode_text_dummy(df, name):
    dummies = pd.get_dummies(df[name])
    for x in dummies.columns:
        dummy_name = "{}-{}".format(name, x)
        df[dummy_name] = dummies[x]
    df.drop(name, axis=1, inplace=True)
# Encode text values to indexes (i.e. [0],[1],[2] for blue,green,red; LabelEncoder sorts the classes).
def encode_text_index(df, name):
    le = preprocessing.LabelEncoder()
    df[name] = le.fit_transform(df[name])
    return le.classes_
# Encode a numeric column as zscores
def encode_numeric_zscore(df, name, mean=None, sd=None):
    if mean is None:
        mean = df[name].mean()
    if sd is None:
        sd = df[name].std()
    df[name] = (df[name] - mean) / sd
# Convert all missing values in the specified column to the median
def missing_median(df, name):
    med = df[name].median()
    df[name] = df[name].fillna(med)
# Convert all missing values in the specified column to the default
def missing_default(df, name, default_value):
    df[name] = df[name].fillna(default_value)
# Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)
    # find out the type of the target column.
    target_type = df[target].dtypes
    target_type = target_type[0] if isinstance(target_type, Sequence) else target_type
    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    if target_type in (np.int64, np.int32):
        # Classification
        dummies = pd.get_dummies(df[target])
        return df[result].values.astype(np.float32), dummies.values.astype(np.float32)
    else:
        # Regression
        return df[result].values.astype(np.float32), df[target].values.astype(np.float32)
# Nicely formatted time string
def hms_string(sec_elapsed):
    h