Partition columns must be declared when the table is defined.
a. Single-partition table: CREATE TABLE day_table (id INT, content STRING) PARTITIONED BY (dt STRING); This table is partitioned by day, and its structure has three columns: id, content, and dt.
Each dt value is stored as its own folder.
b. Double-partition table: CREATE TABLE day_hour_table (id INT, content STRING) PARTITIONED BY (dt STRING, hour STRING); This table is partitioned by day and by hour, so dt and hour are added to the table structure as two extra columns.
Data is first split into dt folders, and then into hour subfolders inside each dt folder.
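As a rough sketch of the two-level layout described above, the statements below create the double-partition table and show, in comments, the directory tree Hive would be expected to produce. The warehouse root /user/hive/warehouse is an assumption (it is the usual default of hive.metastore.warehouse.dir); your installation may use a different path.

```sql
-- Double-partition table: dt and hour are partition columns, not stored in the data files.
CREATE TABLE IF NOT EXISTS day_hour_table (id INT, content STRING)
PARTITIONED BY (dt STRING, hour STRING);

-- The partition columns are listed after the regular columns.
DESCRIBE day_hour_table;

-- Assumed layout once partitions exist (warehouse root is an assumption):
--   /user/hive/warehouse/day_hour_table/dt=2008-08-08/hour=08/
--   /user/hive/warehouse/day_hour_table/dt=2008-08-08/hour=09/
--   /user/hive/warehouse/day_hour_table/dt=2008-08-09/hour=09/
```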
Syntax for adding partitions (the table already exists and partitions are added to it):
ALTER TABLE table_name ADD
partition_spec [ LOCATION 'location1' ]
partition_spec [ LOCATION 'location2' ] …
ALTER TABLE day_hour_table ADD
PARTITION (dt='2008-08-08', hour='08')
LOCATION '/path/pv1.txt';
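Several partitions can be added in one statement by repeating the PARTITION clause. A minimal sketch, assuming the day_hour_table defined earlier; the LOCATION directories are hypothetical, and IF NOT EXISTS is added so that rerunning the statement does not fail on partitions that already exist.

```sql
-- Add two hourly partitions at once; both paths are hypothetical HDFS directories.
ALTER TABLE day_hour_table ADD IF NOT EXISTS
  PARTITION (dt='2008-08-08', hour='08') LOCATION '/path/to/dt=2008-08-08/hour=08'
  PARTITION (dt='2008-08-08', hour='09') LOCATION '/path/to/dt=2008-08-08/hour=09';
```

Omitting LOCATION lets Hive place the partition under the table's own directory.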
Syntax for dropping partitions:
ALTER TABLE table_name DROP
partition_spec, partition_spec, …
Users can delete partitions with ALTER TABLE ... DROP PARTITION. The partition's metadata and data are deleted together (for an external table, only the metadata is removed and the files stay in place). Example: ALTER TABLE day_hour_table DROP PARTITION (dt='2008-08-08', hour='09');
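A hedged variant of the drop statement, assuming the day_hour_table from above: IF EXISTS keeps the statement from failing when the partition is already gone, and SHOW PARTITIONS can be used afterwards to confirm the result.

```sql
-- Drop one hourly partition; no error is raised if it does not exist.
ALTER TABLE day_hour_table DROP IF EXISTS PARTITION (dt='2008-08-08', hour='09');

-- Check which partitions remain.
SHOW PARTITIONS day_hour_table;
```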
Syntax for loading data into a partitioned table:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 …)]
Example:
LOAD DATA INPATH '/User/pv.txt' INTO TABLE day_hour_table PARTITION (dt='2008-08-08', hour='08');
Loading does not transform the data. The LOAD operation only places the files in the corresponding location of the Hive table (files from a LOCAL path are copied, files already in HDFS are moved). The partition directory is created automatically under the table directory when the data is loaded.
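For comparison, a sketch of loading from the local file system and replacing whatever the target partition already holds. The file /tmp/pv.txt is a hypothetical local path; the table is the day_hour_table defined earlier.

```sql
-- LOCAL reads from the client machine's file system and copies the file into the partition.
-- OVERWRITE removes the partition's existing files first; without it the new file is added alongside them.
LOAD DATA LOCAL INPATH '/tmp/pv.txt'
OVERWRITE INTO TABLE day_hour_table
PARTITION (dt='2008-08-08', hour='08');
```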
Partition-based query:
SELECT day_table.* FROM day_table WHERE day_table.dt >= '2008-08-08';
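The same idea applies to the double-partition table, where filtering on both partition columns lets Hive read only the matching dt/hour folders. A minimal sketch, assuming the day_hour_table defined earlier:

```sql
-- Only the folders for dt=2008-08-08 with hour >= '08' need to be scanned.
-- hour is a STRING, so the comparison is lexicographic; zero-padded values ('08', '09', '10') keep it in time order.
SELECT id, content, hour
FROM day_hour_table
WHERE dt = '2008-08-08'
  AND hour >= '08';
```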
Statement for viewing partitions:
hive> show partitions day_hour_table;
OK
dt=2008-08-08/hour=08
dt=2008-08-08/hour=09
dt=2008-08-09/hour=09
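When a table has many partitions, the listing can be narrowed with a partial partition specification. A short sketch, assuming the same day_hour_table; given the three partitions listed above, it would be expected to return only the two hour partitions of 2008-08-08.

```sql
-- List only the partitions that belong to one day.
SHOW PARTITIONS day_hour_table PARTITION (dt='2008-08-08');
```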