6.2 Naive Bayes Examples

Example 1: Classifying the iris data with naive Bayes

# 1. Load the data
data("iris")

# 2. Create the training and test sets
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 3.2.3
set.seed(2005)
# Stratified 70/30 split on Species; list = FALSE returns the row indices as a matrix
index <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_iris <- iris[index, ]
test_iris <- iris[-index, ]
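
The idea behind a stratified split can be sketched in base R: sample the same fraction of rows from each class so that both subsets keep the original class proportions. This is only an illustration of the idea, not caret's actual implementation, and the names `rows_by_class`, `train_sketch` and `test_sketch` are hypothetical. The sampled rows will generally differ from those chosen by `createDataPartition`, even with the same seed.

```r
# Base-R sketch of a stratified 70/30 split (illustration only, not caret's code).
data("iris")
set.seed(2005)

# Group the row indices by species, then sample 70% of each group.
rows_by_class <- split(seq_len(nrow(iris)), iris$Species)
train_rows <- unlist(lapply(rows_by_class, function(rows) {
  sample(rows, size = round(0.7 * length(rows)))
}))

train_sketch <- iris[train_rows, ]
test_sketch  <- iris[-train_rows, ]

# Every species contributes the same share to each subset.
table(train_sketch$Species)
table(test_sketch$Species)
```

With 50 flowers per species, this yields 35 training rows and 15 test rows for each class.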

# 3. Build the model
library(e1071)
model_iris <- naiveBayes(Species~., data=train_iris)

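# 4. Evaluate the model
# summary() only lists the components of the fitted object;
# print(model_iris) shows the class priors and the per-class conditional distributions
summary(model_iris)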
##         Length Class  Mode     
## apriori 3      table  numeric  
## tables  4      -none- list     
## levels  3      -none- character
## call    4      -none- call
# Accuracy on the training set (usually optimistic, since the model has seen these rows)
pred <- predict(model_iris, train_iris, type="class")
mean(pred==train_iris[, 5])
## [1] 0.952381
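
For numeric predictors, `naiveBayes` models each feature within each class as a Gaussian. The following base-R sketch reproduces that calculation by hand to show what the model stores and how a prediction is scored. As a simplifying assumption it fits and scores on the full `iris` data rather than on `train_iris`, so its accuracy is not expected to match the figure above exactly. The helper name `log_posterior` is hypothetical.

```r
# Hand-rolled Gaussian naive Bayes on iris (sketch; uses all 150 rows).
data("iris")
features <- names(iris)[1:4]
classes  <- levels(iris$Species)

# Class priors, plus the per-class mean and standard deviation of every feature.
prior <- prop.table(table(iris$Species))
mu    <- sapply(features, function(f) tapply(iris[[f]], iris$Species, mean))
sigma <- sapply(features, function(f) tapply(iris[[f]], iris$Species, sd))

# Unnormalised log posterior of each class for one observation x:
# log prior + sum of the log Gaussian densities of the four features.
log_posterior <- function(x) {
  sapply(classes, function(k) {
    log(as.numeric(prior[k])) +
      sum(dnorm(as.numeric(x[features]), mean = mu[k, ], sd = sigma[k, ], log = TRUE))
  })
}

# Score every row and pick the class with the highest log posterior.
scores      <- t(apply(iris[, features], 1, log_posterior))
pred_manual <- factor(classes[apply(scores, 1, which.max)], levels = classes)
mean(pred_manual == iris$Species)
```

Working in log space avoids numerical underflow when many small densities are multiplied together.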
# 5. Predict on the test set
pred_iris <- predict(model_iris, test_iris, type="class")
mean(pred_iris==test_iris[, 5])
## [1] 1
# Confusion matrix: rows are the predicted species, columns the true species
table(pred_iris, test_iris[, 5])
##             
## pred_iris    setosa versicolor virginica
##   setosa         15          0         0
##   versicolor      0         15         0
##   virginica       0          0        15
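
Accuracy and the per-class metrics can be read directly off a confusion matrix. The sketch below re-enters the matrix printed above by hand (the names `cm`, `recall` and `precision` are introduced here for illustration) and computes them in base R.

```r
# Confusion matrix copied from the output above (rows: predicted, columns: actual).
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(15,  0,  0,
                0, 15,  0,
                0,  0, 15),
             nrow = 3, byrow = TRUE,
             dimnames = list(predicted = classes, actual = classes))

accuracy  <- sum(diag(cm)) / sum(cm)  # share of all predictions that are correct
recall    <- diag(cm) / colSums(cm)   # per class: correct / number of actual cases
precision <- diag(cm) / rowSums(cm)   # per class: correct / number of predicted cases

accuracy
recall
precision
```

All three metrics equal 1 here because every off-diagonal cell is 0. With only 45 test rows, a perfect score is a fairly noisy estimate of how the model would perform on new data.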

Example 2: Classifying and predicting with the play-tennis data

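# 1. Load the data
# stringsAsFactors = TRUE keeps the text columns as factors on R >= 4.0, matching the output below
data <- read.csv("F:/R/Rworkspace/NB/playingtennis.csv", stringsAsFactors = TRUE)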
str(data)
## 'data.frame':    14 obs. of  6 variables:
##  $ Day        : Factor w/ 14 levels "D1","D10","D11",..: 1 7 8 9 10 11 12 13 14 2 ...
##  $ Outlook    : Factor w/ 3 levels "Overcast","Rain",..: 3 3 1 2 2 2 1 3 3 2 ...
##  $ Temperature: Factor w/ 3 levels "Cool","Hot","Mild": 2 2 2 3 1 1 1 3 1 3 ...
##  $ Humidity   : Factor w/ 2 levels "High","Normal": 1 1 1 1 2 2 2 1 2 2 ...
##  $ Wind       : Factor w/ 2 levels "Strong","Weak": 2 1 2 2 2 1 1 2 2 2 ...
##  $ PlayTennis : Factor w/ 2 levels "No","Yes": 1 1 2 2 2 1 2 1 2 2 ...
summary(data)
##       Day        Outlook  Temperature   Humidity     Wind   PlayTennis
##  D1     :1   Overcast:4   Cool:4      High  :7   Strong:6   No :5     
##  D10    :1   Rain    :5   Hot :4      Normal:7   Weak  :8   Yes:9     
##  D11    :1   Sunny   :5   Mild:6                                      
##  D12    :1                                                            
##  D13    :1                                                            
##  D14    :1                                                            
##  (Other):8
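# As the summary shows, Day is only a row identifier and carries no information
# for classification or prediction, so it can be dropped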

# 2. Clean the data
dataset <- data[, 2:6]

# 3. Build the model
library(e1071)
model <- naiveBayes(dataset[, 1:4], dataset[, 5])

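# 4. Predict
# Caution: this data frame has no column names, so its columns do not match the
# training columns (Outlook, Temperature, Humidity, Wind). If predict() cannot
# match predictors by name, the result may simply reflect the class prior.
new_data <- data.frame("Rain","Hot","High","Strong")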
predict(model, new_data)
## [1] Yes
## Levels: No Yes
new_data <- data.frame("Sunny","Mild","Normal","Weak")
predict(model, new_data)
## [1] Yes
## Levels: No Yes
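
To check a prediction by hand, the posterior can be computed directly from frequency counts. The sketch below assumes that `playingtennis.csv` contains the classic 14-day play-tennis table. That table is consistent with the `str()` and `summary()` output above, but it is re-entered here by hand and not read from the file. The names `tennis` and `posterior` are hypothetical, and no Laplace smoothing is applied.

```r
# Assumed contents of playingtennis.csv (classic 14-day table, entered by hand).
tennis <- data.frame(
  Outlook     = c("Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
                  "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"),
  Temperature = c("Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool",
                  "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"),
  Humidity    = c("High", "High", "High", "High", "Normal", "Normal", "Normal",
                  "High", "Normal", "Normal", "Normal", "High", "Normal", "High"),
  Wind        = c("Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
                  "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"),
  PlayTennis  = c("No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                  "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"),
  stringsAsFactors = TRUE
)

# Posterior P(class | query) = prior * product of P(feature value | class), normalised.
posterior <- function(query) {
  prior <- prop.table(table(tennis$PlayTennis))
  score <- sapply(names(prior), function(k) {
    in_class   <- tennis$PlayTennis == k
    likelihood <- sapply(names(query), function(f) {
      mean(tennis[in_class, f] == query[[f]])  # relative frequency within class k
    })
    as.numeric(prior[k]) * prod(likelihood)
  })
  score / sum(score)
}

p1 <- posterior(list(Outlook = "Rain",  Temperature = "Hot",  Humidity = "High",   Wind = "Strong"))
p2 <- posterior(list(Outlook = "Sunny", Temperature = "Mild", Humidity = "Normal", Wind = "Weak"))
p1
p2
```

Under these assumptions the first query favours No (about 0.84) and the second favours Yes (about 0.80). The first result differs from the printed output above. One possible explanation is that the unnamed columns of `new_data` are not matched to the training columns. If so, building `new_data` with the same column names and factor levels as `dataset` should let `predict()` use all four features.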