Preface
By default, a freshly deployed Hive or Spark SQL installation does not support transactions; Hive transactions must be enabled through additional configuration.
Likewise, Hive's DELETE and UPDATE statements only work once ACID support has been turned on. (cuiyaonan2000@163.com)
References:
- LanguageManual DML - Apache Hive - Apache Software Foundation
- Hive Transactions - Apache Hive - Apache Software Foundation
As before, this only requires some extra configuration in hive-site.xml:
```xml
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```
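After restarting the Hive services, you can sanity-check that these properties took effect from a Beeline session; `SET <property-name>;` with no value prints the property's current setting:

```sql
-- Print the effective values of the transaction-related properties:
SET hive.support.concurrency;
SET hive.txn.manager;
SET hive.compactor.initiator.on;
```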
Database
If you are enabling ACID on a metastore that has already been initialized, go to the MySQL upgrade-script directory under the Hive installation, /soft/hadoop/apache-hive-3.1.2-bin/scripts/metastore/upgrade/mysql, find the appropriate upgrade scripts there, and execute them against the metastore database in MySQL.
If the metastore database has not been initialized yet, simply finish the hive-site.xml configuration first and then initialize the schema directly with the following command:

```shell
schematool -dbType mysql -initSchema
```
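For the already-initialized case, a sketch of the upgrade step follows. The script file name below is only an illustration; the actual scripts to run depend on your current schema version, so check the upgrade directory (alternatively, `schematool` can apply the upgrade scripts for you):

```shell
# Option 1: run the upgrade SQL scripts by hand (file name is hypothetical --
# pick the scripts matching your current metastore schema version):
cd /soft/hadoop/apache-hive-3.1.2-bin/scripts/metastore/upgrade/mysql
mysql -u hive -p metastore < upgrade-2.3.0-to-3.0.0.mysql.sql

# Option 2: let schematool detect the version and apply the scripts:
schematool -dbType mysql -upgradeSchema
```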
Table settings
From the official documentation (and community translations of it), a table must satisfy all of the following requirements to support transactions and DELETE/UPDATE:
- The table's storage format must be ORC (STORED AS ORC);
- The table must be bucketed (CLUSTERED BY (col_name, col_name, ...) INTO num_buckets BUCKETS);
- The table property transactional must be set to true (TBLPROPERTIES ('transactional'='true')).
Example from the official documentation:
```sql
CREATE TABLE table_name (
  id int,
  name string
)
CLUSTERED BY (id) INTO 2 BUCKETS STORED AS ORC
TBLPROPERTIES ("transactional"="true",
  "compactor.mapreduce.map.memory.mb"="2048",                   -- specify compaction map job properties
  "compactorthreshold.hive.compactor.delta.num.threshold"="4",  -- trigger minor compaction if there are
                                                                -- more than 4 delta directories
  "compactorthreshold.hive.compactor.delta.pct.threshold"="0.5" -- trigger major compaction if the ratio of size
                                                                -- of delta files to size of base files is
                                                                -- greater than 50%
);
```
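With ACID enabled and a table created as above, row-level DML works as expected. A short sketch (table and column names follow the example above):

```sql
INSERT INTO table_name VALUES (1, 'alice'), (2, 'bob');
UPDATE table_name SET name = 'carol' WHERE id = 2;  -- rewrite a row in place
DELETE FROM table_name WHERE id = 1;                -- remove a row
```

Under the hood, each UPDATE/DELETE writes delta files next to the table's base ORC files; the compactor threads configured earlier merge them in the background.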