Hive Optimization

Hive

Configuration Properties

More Hive Configuration Properties

dynamic partition
hive.exec.dynamic.partition
Default Value: false prior to Hive 0.9.0; true in Hive 0.9.0 and later (HIVE-2835)
Added In: Hive 0.6.0
Whether or not to allow dynamic partitions in DML/DDL.
set hive.exec.dynamic.partition = true;
hive.exec.dynamic.partition.mode
Default Value: strict
Added In: Hive 0.6.0
In strict mode, the user must specify at least one static partition, which guards against accidentally overwriting all partitions. In nonstrict mode, all partitions are allowed to be dynamic.
Set to nonstrict to support INSERT ... VALUES, UPDATE, and DELETE transactions (Hive 0.14.0 and later). For a complete list of parameters required for turning on Hive transactions, see hive.txn.manager.
set hive.exec.dynamic.partition.mode = nonstrict;
hive.exec.max.dynamic.partitions
Default Value: 1000
Added In: Hive 0.6.0
Maximum number of dynamic partitions allowed to be created in total.
set hive.exec.max.dynamic.partitions = 1000;
hive.exec.max.dynamic.partitions.pernode
Default Value: 100
Added In: Hive 0.6.0
Maximum number of dynamic partitions allowed to be created in each mapper/reducer node.
set hive.exec.max.dynamic.partitions.pernode = 100;
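
The four dynamic partition settings are typically used together with an INSERT whose partition values come from the query itself. The following is a minimal sketch, not a definitive recipe: the tables `staging_events` and `events` and their columns are hypothetical names chosen for illustration.

```sql
-- Hypothetical tables: staging_events (source) and events (target).
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.exec.max.dynamic.partitions = 1000;
SET hive.exec.max.dynamic.partitions.pernode = 100;

CREATE TABLE IF NOT EXISTS events (
  user_id BIGINT,
  action  STRING
)
PARTITIONED BY (dt STRING, country STRING);

-- Both partition columns are dynamic. Their values are taken from the
-- last two columns of the SELECT list, in PARTITION clause order.
INSERT OVERWRITE TABLE events PARTITION (dt, country)
SELECT user_id, action, dt, country
FROM staging_events;

-- Under strict mode, at least one leading partition must be static:
-- INSERT OVERWRITE TABLE events PARTITION (dt = '2024-01-01', country)
-- SELECT user_id, action, country
-- FROM staging_events
-- WHERE dt = '2024-01-01';
```

If the query would create more partitions than the configured limits allow, the job fails, so the two max.dynamic.partitions values may need to be raised for high-cardinality partition columns.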
parallel
hive.exec.parallel
Default Value: false
Added In: Hive 0.5.0
Whether to execute jobs in parallel.  Applies to MapReduce jobs that can run in parallel, for example jobs processing different source tables before a join.  As of Hive 0.14, also applies to move tasks that can run in parallel, for example moving files to insert targets during multi-insert.
set hive.exec.parallel = true;
hive.exec.parallel.thread.number
Default Value: 8
Added In: Hive 0.6.0
Maximum number of jobs that can be executed in parallel.
set hive.exec.parallel.thread.number = 12;
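
As an illustration of "jobs processing different source tables before a join", the sketch below aggregates two tables independently and then joins the results. The tables `orders` and `refunds` are hypothetical. Whether the two aggregation stages actually run at the same time depends on the generated plan and on available cluster resources.

```sql
-- Hypothetical tables: orders and refunds, each with customer_id and amount.
SET hive.exec.parallel = true;
SET hive.exec.parallel.thread.number = 8;

-- The two subqueries read different source tables and do not depend on
-- each other, so their jobs are candidates for parallel execution.
SELECT o.customer_id, o.order_total, r.refund_total
FROM (
  SELECT customer_id, SUM(amount) AS order_total
  FROM orders
  GROUP BY customer_id
) o
JOIN (
  SELECT customer_id, SUM(amount) AS refund_total
  FROM refunds
  GROUP BY customer_id
) r
ON o.customer_id = r.customer_id;
```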
merge
hive.merge.mapfiles
Default Value: true
Added In: Hive 0.4.0
Merge small files at the end of a map-only job.
set hive.merge.mapfiles = true;
hive.merge.mapredfiles
Default Value: false
Added In: Hive 0.4.0
Merge small files at the end of a map-reduce job.
set hive.merge.mapredfiles = true;
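
A possible session setup for writing a table with many small output files is sketched below. The two size properties are not covered in these notes; they are included as an assumption about what usually accompanies the merge switches, and the values shown are believed to be the documented defaults. The tables `raw_logs` and `daily_summary` are hypothetical.

```sql
-- Merge small output files for both map-only and map-reduce jobs.
SET hive.merge.mapfiles = true;
SET hive.merge.mapredfiles = true;

-- Assumption: these related properties control when and how merging happens.
-- Merge is triggered when the average output file size is below this value.
SET hive.merge.smallfiles.avgsize = 16000000;
-- Target size of the merged files.
SET hive.merge.size.per.task = 256000000;

-- Hypothetical tables: raw_logs (source) and daily_summary (target).
-- The GROUP BY adds a reduce stage, so hive.merge.mapredfiles applies.
INSERT OVERWRITE TABLE daily_summary
SELECT dt, COUNT(*) AS events
FROM raw_logs
GROUP BY dt;
```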
optimize
hive.optimize.groupby
Default Value: true
Added In: Hive 0.5.0
Whether to enable the bucketed group by from bucketed partitions/tables.
set hive.optimize.groupby = true;
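
The sketch below shows the kind of query this setting targets: a GROUP BY on the bucketing column of a bucketed table. The table `user_actions` and its bucket count are hypothetical, and whether the optimizer actually applies the bucketed group by depends on the Hive version and the plan; EXPLAIN can be used to inspect the result.

```sql
SET hive.optimize.groupby = true;

-- Hypothetical bucketed table: rows are distributed into buckets by user_id.
CREATE TABLE IF NOT EXISTS user_actions (
  user_id BIGINT,
  action  STRING
)
CLUSTERED BY (user_id) INTO 32 BUCKETS;

-- Grouping on the bucketing column; inspect the plan rather than
-- assuming the optimization was applied.
EXPLAIN
SELECT user_id, COUNT(*) AS action_count
FROM user_actions
GROUP BY user_id;
```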

Deprecated Properties

See Hadoop Deprecated Properties

Design

Program

pruning
join
count distinct
group by
(not) in/exists
multi insert, union all

Other

Hadoop

Core

HDFS

YARN

MapReduce

Spark

Server & OS