Partitioning in Apache Hive

Hive works well for querying large datasets, especially when the queries require full table scans. Quite often, though, users need to filter the data on specific column values, and that is where partitioning comes into play. A partition is simply a directory that holds one chunk of the table's data. When we partition a table on a column, Hive creates one partition (one directory) for each unique value of that column, so a query that filters on that column can read only the matching directories instead of scanning the whole table.

Let’s run a simple example to see how this works. The syntax to create a partitioned table is:
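
A minimal sketch in HiveQL, assuming a hypothetical `sales` table partitioned by a `country` column. The table name, column names, and values are illustrative assumptions, not taken from the original example:

```sql
-- Hypothetical table; names and types are illustrative assumptions.
-- The partition column is declared in PARTITIONED BY, not in the
-- regular column list.
CREATE TABLE sales (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (country STRING);

-- Static-partition insert: every row in this statement goes into the
-- country=US partition, which Hive stores as its own directory
-- under the table's location.
INSERT INTO TABLE sales PARTITION (country = 'US')
VALUES (1, 19.99), (2, 5.00);

-- List the partitions that exist for the table.
SHOW PARTITIONS sales;

-- Filtering on the partition column lets Hive read only the
-- matching partition directory rather than the whole table.
SELECT order_id, amount
FROM sales
WHERE country = 'US';
```

The `INSERT ... VALUES` form assumes a reasonably recent Hive release (0.14 or later); on older versions the data would typically be loaded with `LOAD DATA` or `INSERT ... SELECT` instead.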


May 16, 2017 at 10:30AM

