Hive General Interview Questions
  1. What kind of data warehouse application is suitable for Hive? What are the types of tables in Hive?
  3. Explain the SMB join in Hive.
  4. How is Hive different from an RDBMS?
  5. What types of databases does Hive support?
  5. In Hive, how can you enable buckets?
  6. Is Hive suitable to be used for OLTP systems? Why?
  7. What is the ObjectInspector functionality in Hive?
  8. What are the limitations of Hive?
  9. What are the different modes in Hive?
  10. What is Hive Bucketing?
  11. What is the difference between partition and bucketing?
  12. Where does the data of a Hive table get stored?
  13. How does data transfer happen from HDFS to Hive?
  14. What does the Hive query processor do?
  15. Explain SORT BY, ORDER BY, DISTRIBUTE BY and CLUSTER BY in Hive.
  16. What is the difference between local and remote metastore?
  17. Which classes are used in Hive to read and write HDFS files?
  18. Explain the functionality of ObjectInspector.
  19. What is ObjectInspector functionality in Hive?
  20. How does bucketing help in the faster execution of queries?
  21. Why will MapReduce not run if you run select * from table in Hive?
  22. What is Hive MetaStore?
  23. What are the three different modes in which Hive can be run?
  24. How can you prevent a large job from running for a long time?
  25. When do we use explode in Hive?
  26. What are the different components of a Hive architecture?
  27. How can you connect an application, if you run Hive as a server?
  28. Can we LOAD data into a view?
  29. Is it possible to add 100 nodes when we already have 100 nodes in Hive? If yes, how?
  30. Can Hive process any type of data format?
  31. How can you stop a partition from being queried?
  32. What is a Hive variable? What do we use it for?
  33. What is SerDe in Apache Hive?
  34. Whenever we run a Hive query, a new metastore_db is created. Why?
  35. Can we change the data type of a column in a Hive table?
  36. Why does Hive not store metadata information in HDFS?
  37. How does Hive deserialize and serialize the data?
  38. What is RegexSerDe?
  39. While loading data into a Hive table using the LOAD DATA clause, how do you specify that it is an HDFS file and not a local file?
  40. Explain the different types of partitioning in Hive.
  41. What is the significance of the ‘IF EXISTS’ clause while dropping a table?
  42. How can Hive avoid MapReduce?
  43. What is the relationship between MapReduce and Hive? Or, how are MapReduce jobs submitted on the cluster?
  44. What is ObjectInspector functionality?
  45. Suppose that I want to monitor all the open and aborted transactions in the system along with the transaction id and the transaction state. Can this be achieved using Apache Hive?
  46. Can a partition be archived? What are the advantages and disadvantages?
  47. Does the archiving of Hive tables save space in HDFS?
  48. Does Hive support record-level insert, delete or update?
  49. What are the default record and field delimiters used for Hive text files?
  50. What is the difference between static and dynamic partitioning of a table?
  51. Why do we perform partitioning in Hive?
  52. How does partitioning help in the faster execution of queries?
  53. Can you list a few commonly used Hive services?
  54. What is the default maximum number of dynamic partitions that can be created by a mapper/reducer? How can you change it?
  55. Why do we need buckets?
  56. Can we give a view the same name as a Hive table?
  57. What options are available when it comes to attaching applications to the Hive server?
  58. When should we use SORT BY instead of ORDER BY?
  59. What are the uses of Hive Explode?
  60. Can we run UNIX shell commands from Hive? Can Hive queries be executed from script files? If yes, how?
  61. How is ORC file format optimised for data storage and analysis?
  62. What is the difference between internal and external tables?
  63. Explain the different types of join in Hive.
  64. What is a metastore in Hive?
  65. What is the functionality of Query Processor in Apache Hive?
  66. What is the use of HCatalog?
  67. How will you optimize Hive performance?
  68. In case of embedded Hive, can the same metastore be used by multiple users?
  69. When should you use MapReduce mode?
  70. What is the importance of Thrift server & client, JDBC and ODBC driver in Hive?
Hive MCQ Quiz Interview Questions
  1. The property that is set to true to run Hive in local mode, so that it runs without creating a MapReduce job, is
  2. When a partition is archived in Hive it
  3. A user creates a UDF which accepts arguments of different data types, each time it is run. It is an example of
  4. While querying a Hive table for an Array type column, if the array index is nonexistent then
  5. A GenericUDF is a function that
  6. Which of the following scenarios are not prevented by enabling strict mode in Hive?
  7. In Hive, what happens when the schema does not match the file content?
  8. The DISTRIBUTE BY clause in Hive
  9. In ______ mode HiveServer2 only accepts valid Thrift calls.
  10. The disadvantage of compressing files in HDFS is
  11. The partitioning of a table in Hive creates more
  12. The property that decides the maximum number of files that can be sampled during the use of the LIMIT clause is
  13. To optimize a join of three tables, the largest table should be placed as
  14. The drawback of managed tables in Hive is
  15. Which of the following commands sets the value of a particular configuration variable (key)?
  16. The expression below in the WHERE clause, RLIKE '.*(Chicago|Ontario).*', gives results which match
  17. What is the disadvantage of using too many partitions in Hive tables?
  18. By default when a database is dropped in Hive:
  19. Explode in Hive is used to convert complex data types into desired table formats.
  20. Point out the correct statement.
  21. Point out the correct statement
  22. Point out the wrong statement:
  23. Hive converts queries to all except
  24. The Thrift service component in Hive is used for