ALTER TABLE
仅仅会修改表的元数据,但是不会对表数据本身作任何修改
有全量表如下:
create external table test.class_info(
class string,
student array<string>,
user_info map<string, int>,
position struct<province:string, city:string, district:string>
)
row format delimited fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
location '/big-data/test/user_info';
有分区表如下:
create external table test.class_info_partition(
class string,
student array<string>,
user_info map<string, int>,
position struct<province:string, city:string, district:string>
)
partitioned by (date_key string)
row format delimited fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
location '/big-data/test/user_info_partition';
一、表重命名
语法:
ALTER TABLE ods_table_name RENAME TO new_table_name;
实例:
hive> use test;
OK
Time taken: 0.835 seconds
hive> show tables;
OK
class_info
Time taken: 0.112 seconds, Fetched: 1 row(s)
hive> alter table class_info rename to school_class_info;
OK
Time taken: 0.179 seconds
hive> show tables;
OK
school_class_info
Time taken: 0.028 seconds, Fetched: 1 row(s)
二、增加、修改、删除表分区
命令仅可用于分区表,全量表使用报错如下:
hive> ALTER TABLE school_class_info ADD IF NOT EXISTS PARTITION (class = 'Grade-two-of-junior-High-school') LOCATION '/big-data/test/user_info/Grade-two-of-junior-High-school' ;
1、增加表分区
语法:
正常情况下需要分区不存在,如果为了防止分区存在报错可以加 IF NOT EXISTS
关键字
ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION (分区字段名 = 'XXX') LOCATION '$PATH'
实例:
> ALTER TABLE class_info_partition ADD IF NOT EXISTS PARTITION (date_key = '2021-01-06') LOCATION '/big-data/test/user_info/2021-01-06' ;
OK
Time taken: 0.74 seconds
结果:
hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
Time taken: 0.146 seconds, Fetched: 1 row(s)
2、删除表分区
基础数据准备
hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
Time taken: 0.146 seconds, Fetched: 1 row(s)
hive> ALTER TABLE class_info_partition ADD IF NOT EXISTS PARTITION (date_key = '2021-01-07') LOCATION '/big-data/test/user_info/2021-01-07' ;
OK
Time taken: 0.683 seconds
hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
date_key=2021-01-07
Time taken: 0.125 seconds, Fetched: 2 row(s)
语法:
ALTER TABLE partition_table_name DROP IF EXISTS PARTITION (分区字段名 = 'XXX');
实例:
ALTER TABLE test.class_info_partition DROP IF EXISTS PARTITION (date_key='2021-01-07');
结果:
hive> ALTER TABLE test.class_info_partition DROP [IF EXISTS] PARTITION (date_key='2021-01-07');
Dropped the partition date_key=2021-01-07
OK
Time taken: 1.106 seconds
hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
Time taken: 0.121 seconds, Fetched: 1 row(s)
修改表分区
语法:
ALTER TABLE class_info_partition PARTITION (分区字段名 = 'XXX') SET LOCATION '$PATH' ;
实例:
> show partitions test.class_info_partition;
OK
date_key=2021-01-06
Time taken: 0.115 seconds, Fetched: 1 row(s)
hive> ALTER TABLE class_info_partition PARTITION (date_key = '2021-01-06') SET LOCATION '/big-data/test/user_info/2021-01-07' ;
OK
Time taken: 0.433 seconds
结果:
我们查询date_key = '2021-01-06'
数据时发现没有数据,在date_key = '2021-01-07
加载数据后数据重新出现。这个命令不会将数据从旧的路径移走,也不会删除旧的数据。
对于内部表即使是使用
ALTER TABLE table_name ADD PARTITION
语句增加分区,分区内的数据和元数据信息也是会被同时删除,对于外部表分区内的数据不会被删除。
三、增加、修改、删除表字段,修改列名/类型/位置/注释
1、为表添加一个字段或者多个字段
语法:
alter table table_name add columns( 新增字段名 新增字段类型 comment '$新增字段注释');
实例:
# 添加1列
hive> desc school_class_info;
OK
class string
student array<string>
user_info map<string,int>
position struct<province:string,city:string,district:string>
Time taken: 0.079 seconds, Fetched: 4 row(s)
hive> alter table school_class_info add columns(
> user_id bigint comment '用户ID'
> );
OK
Time taken: 0.209 seconds
hive> desc school_class_info;
OK
class string
student array<string>
user_info map<string,int>
position struct<province:string,city:string,district:string>
user_id bigint ??ID
Time taken: 0.056 seconds, Fetched: 5 row(s)
# 添加多列
alter table school_class_info add columns(
name string comment '用户名称',
city string comment '城市',
sex string comment '用户性别',
age string comment '用户年龄',
phone string comment '用户手机',
email string comment '用户邮箱',
unqiue_id string comment '身份证ID'
);
hive> desc school_class_info;
OK
class string
student array<string>
user_info map<string,int>
position struct<province:string,city:string,district:string>
user_id bigint ??ID
name string ????
city string ??
sex string ????
age string ????
phone string ????
email string ????
unqiue_id string ???ID
Time taken: 0.06 seconds, Fetched: 12 row(s)
2、删除列/更新列
语法:
alter table table_name replace columns(
保留字段1 保留字段类型1 comment '保留字段类型1注释',
保留字段2 保留字段类型2 comment '保留字段类型2注释',
保留字段3 保留字段类型3 comment '保留字段类型3注释',
保留字段4 保留字段类型4 comment '保留字段类型4注释',
保留字段5 保留字段类型5 comment '保留字段类型5注释',
保留字段6 保留字段类型6 comment '保留字段类型6注释',
......
)
实例:
# 由于数据脱敏,我们需要删除unqiue_id列
hive> alter table school_class_info replace columns(
> class string comment '',
> student array<string> comment '',
> user_info map<string,int> comment '',
> position struct<province:string,city:string,district:string> comment '',
> user_id bigint comment '用户ID',
> name string comment '用户名称',
> city string comment '城市',
> sex string comment '用户性别',
> age int comment '用户年龄',
> phone string comment '用户手机',
> email string comment '用户邮箱'
> );
OK
Time taken: 0.24 seconds
hive> desc school_class_info;
OK
class string
student array<string>
user_info map<string,int>
position struct<province:string,city:string,district:string>
user_id bigint ??ID
name string ????
city string ??
sex string ????
age string ????
phone string ????
email string ????
Time taken: 0.057 seconds, Fetched: 11 row(s)
# 相当于使用replace重新将表的列给更新替换了
结果:我们发现unqiue_id
字段已经被删除
3、修改列名/类型/位置/注释
语法:
#修改字段名,类型,注释
alter table table_name change [column] 现字段1名 修改后字段1名 修改后段1类型 comment '$修改后注释';
#修改后字段位置
alter table table_name change [column] 现字段1名 修改后字段1名 修改后段1类型 comment '$修改后注释' after 改为在某字段后的某字段名;
实例:
# 将性别(sex)名称改成gender,类型改成int,注释改成“性别”
alter table school_class_info change column sex gender int comment '性别';
# 将age字段类型改为int,并将位置移动到name字段后面
alter table school_class_info change age age int comment '用户年龄' after name;
结果测试后报错如下:
hive> alter table school_class_info change column sex gender int comment '性别';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
gender
更改表中字段的顺序还是建议使用alter table table_name change column