Connecting to a Custom Hive Metastore

This section covers the following topics:

Creating a Custom Hive Metastore describes how to create a custom Hive metastore from scratch.

You can configure QDS access to the metastore either through a Bastion node, or by whitelisting a private IP address.

  • To configure a Bastion node for QDS access, follow these instructions; or
  • To whitelist a private IP address, contact Qubole Support and provide the address to be whitelisted.

Specifying Your Configuration in the QDS UI

  1. From the QDS main menu, choose Explore.
  2. On the resulting page, pull down the menu to the right of Qubole Hive and choose Connect Custom Metastore.
  3. Fill out the fields as follows:
    • Metastore Database Type: MySQL is the only supported database type.
    • Metastore Database Name: Enter the name of the MySQL database hosting the metastore.
    • Host Address:
      • If you are using a Bastion node, enter the Bastion node’s private IP address.
      • If you are whitelisting an address, enter the public IP address corresponding to the private IP address that you provided to Qubole Support.
    • Port: Accept the default (3306).
    • User Name: Enter the name of the superuser or administrator user on the host you identified in the Host Address field.
    • Password: Enter the password for the superuser or administrator user.
    • Enable Cluster Access: Check this box to allow direct access between the QDS cluster and the metastore. Qubole recommends that you use direct access.
  4. If you are not using a Bastion node, leave Bastion Node unchecked and click Save to save your changes; otherwise continue with step 5.
  5. Check the box next to Bastion Node to enable access via your Bastion node.
  6. Enter the public IP address or hostname of the Bastion node.
  7. Enter the user name of the superuser or administrator user on the Bastion node.
  8. Enter the private key corresponding to the Bastion node’s public key.
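
Before saving the form, it can help to confirm that the metastore host accepts TCP connections on the MySQL port from the machine that will connect to it (for example, the Bastion node). The following is a minimal sketch, assuming Python 3 is available on that machine. The helper name port_is_open and the address in the usage comment are hypothetical and are not part of QDS. The check only shows that the port is reachable; it does not validate the user name or password.

```python
import socket


def port_is_open(host: str, port: int = 3306, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the "with" block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False


# Example usage (hypothetical private address of the metastore host):
#   port_is_open("10.0.0.12")        # default MySQL port 3306
#   port_is_open("10.0.0.12", 3306)  # same check with an explicit port
```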

Configuring Thrift Metastore Server Interface for the Custom Metastore

HiveServer2 (HS2) and other processes communicate with the metastore through the Thrift interface of the Hive Metastore Service (HMS). Once HMS is running on a port, you can configure HS2, Presto, and Spark to use it to talk to the metastore. Specify the HMS address in the form thrift://<host>:<port>. In Qubole Hive, the Hive Metastore Service uses port 10000.

You can configure a Thrift metastore server interface to access Hive metadata from the custom metastore in the Hive, Presto, and Spark engines as follows. In each setting, replace <URI> with the address of the host running the Hive Metastore Service.

  • As a Hive bootstrap, set hive.metastore.uris=thrift://<URI>:10000;.
  • As a Presto cluster override, set hive.metastore.uri=thrift://<URI>:10000.
  • As a Spark cluster override, set spark.hadoop.hive.metastore.uris=thrift://<URI>:10000.
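
The three settings above differ only in the property name, so they can be generated from a single host value. The following is a minimal sketch in Python. The function names thrift_uri and metastore_settings and the host metastore.example.internal are hypothetical, chosen for illustration only. Port 10000 is the Qubole Hive default stated above.

```python
def thrift_uri(host: str, port: int = 10000) -> str:
    """Build a metastore address of the form thrift://<host>:<port>."""
    return f"thrift://{host}:{port}"


def metastore_settings(host: str, port: int = 10000) -> dict:
    """Return the per-engine settings that point at the same metastore."""
    uri = thrift_uri(host, port)
    return {
        # Hive bootstrap statement (ends with a semicolon).
        "hive_bootstrap": f"set hive.metastore.uris={uri};",
        # Presto cluster override (note the singular "uri").
        "presto_override": f"hive.metastore.uri={uri}",
        # Spark cluster override.
        "spark_override": f"spark.hadoop.hive.metastore.uris={uri}",
    }


# Print each setting for a hypothetical metastore host.
for engine, line in metastore_settings("metastore.example.internal").items():
    print(f"{engine}: {line}")
```

Generating all three lines from one value keeps the host and port consistent across engines. Note that the Presto property is hive.metastore.uri, while the Hive and Spark properties end in uris.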

Note

Qubole supports configuring the Thrift socket connection timeout to suit the number of tables in the schema. To change this timeout, create a Qubole Support ticket.