We can start with simple one – these are general stuffs to check while writing HQL.
General HIVE Query:
- Avoid providing inline queries in the where clause.
- Make all the joins explicitly in the “From” clause instead of “where” clause.
- FULL OUTER JOINS are very expensive in HIVE.
- If possible generate statistics for each table before running the query.
- Make the order of join from smaller to larger tables.
- Any partitioned tables must have the partition range filtered before making any joins.
- While joining smaller tables, of up to 1 GB data size, make a MAPJOIN hint for the smaller table.
- While doing any join if we want to have certain filter as part of the join then it has to be mention as part of the from clause join statement instead of having the same on where clause.
- While Joining Make use of the Key columns to make the Joins.
- If a query is having multiple layers of inner queries and used in many other queries to arrive at a result, please create a physical table which can be refreshed and queried as and when needed.