Case Scenario: Partitioning
Single Partitioning
Partitioning in any Iceberg table is column-based. Flare currently supports only the following partition transforms: identity, year, month, day, and hour.
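To make the five transforms concrete, here is a minimal Python sketch of the value each one derives from a column value. It follows the transform definitions in the Apache Iceberg table specification (year, month, day, and hour counted from the Unix epoch). The function name `partition_value` is hypothetical and is not part of Flare or dataos-ctl. How Flare displays these values is not covered by this page.

```python
from datetime import datetime, timezone

# Reference point used by the Iceberg time-based transforms.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def partition_value(transform, value):
    """Illustrative helper (hypothetical name): derive a partition value.

    `value` is any column value for "identity", and a timezone-aware
    datetime for the time-based transforms.
    """
    if transform == "identity":
        # The column value itself is the partition value.
        return value

    ts = value.astimezone(timezone.utc)
    if transform == "year":
        # Whole years since 1970.
        return ts.year - 1970
    if transform == "month":
        # Whole months since 1970-01.
        return (ts.year - 1970) * 12 + (ts.month - 1)
    if transform == "day":
        # Whole days since 1970-01-01.
        return (ts - EPOCH).days
    if transform == "hour":
        # Whole hours since 1970-01-01 00:00 UTC.
        return int((ts - EPOCH).total_seconds() // 3600)
    raise ValueError(f"unsupported transform: {transform}")


if __name__ == "__main__":
    sample = datetime(2024, 3, 15, 10, 30, tzinfo=timezone.utc)
    for name in ("year", "month", "day", "hour"):
        print(name, partition_value(name, sample))
    print("identity", partition_value("identity", "CA"))
```

Rows whose column values map to the same partition value are stored in the same partition, which is why a coarser transform such as month produces fewer, larger partitions than hour.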
Multiple Partitioning
Partitioning can be done on multiple levels. For example, suppose a user wants to partition the city data on two levels: first by state_code, and then by month (derived from the ts_city column). The command is as follows:
dataos-ctl dataset -a dataos://icebase:retail/city \
-p "identity:state_code" \
-p "month:ts_city:month_partition"
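As a conceptual illustration of what two-level partitioning does to the data, the sketch below groups a few rows by the pair (state_code, month of ts_city). The sample rows and the names `SAMPLE_ROWS`, `partition_key`, and `group_by_partition` are invented for this example. It models only the grouping idea, not how Icebase physically lays out files.

```python
from collections import defaultdict
from datetime import datetime

# Made-up sample rows; only the column names come from the command above.
SAMPLE_ROWS = [
    {"city": "Montgomery", "state_code": "AL", "ts_city": datetime(2023, 1, 5)},
    {"city": "Birmingham", "state_code": "AL", "ts_city": datetime(2023, 1, 20)},
    {"city": "Mobile", "state_code": "AL", "ts_city": datetime(2023, 2, 2)},
    {"city": "Phoenix", "state_code": "AZ", "ts_city": datetime(2023, 1, 9)},
]


def partition_key(row):
    """First level: identity of state_code. Second level: month of ts_city."""
    ts = row["ts_city"]
    return (row["state_code"], f"{ts.year:04d}-{ts.month:02d}")


def group_by_partition(rows):
    """Collect city names under their two-level partition key."""
    groups = defaultdict(list)
    for row in rows:
        groups[partition_key(row)].append(row["city"])
    return dict(groups)


if __name__ == "__main__":
    for key, cities in sorted(group_by_partition(SAMPLE_ROWS).items()):
        print(key, cities)
```

Rows land in the same partition only when both levels match, so the two Alabama rows from January share a partition while the February row gets its own.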
Partition Updation
To update a partition, use the command below.
Command
dataos-ctl dataset -a ${{udl}} update-partition \
-p "${{partition_type}}:${{column_name}}:${{partition_name}}"
Example
Let's say we want to update the partition of the city data by month, using the timestamp in the ts_city column. The command is as follows:
dataos-ctl dataset -a dataos://icebase:retail/city update-partition \
-p "month:ts_city:month_partition"
Output