Managing Clusters

QDS pre-configures a set of clusters to run the jobs and queries you initiate from the Workbench page.

You can use the QDS UI to modify these clusters and to create new ones. This page provides detailed instructions.

QDS starts clusters when they are needed and resizes them according to the query workload.

Note

  • By default, QDS shuts a cluster down after two hours of inactivity (that is, when no command has run during the past two hours). This default is configurable; set Idle Cluster Timeout as explained below.
  • A cluster's lifespan depends on the sessions in a user account; the cluster runs as long as there is an active cluster session. By default, a session stays alive through two hours of inactivity, and remains active indefinitely while commands are running. Use the Sessions tab in the Control Panel to create new sessions, terminate sessions, or extend sessions. For more information, see Managing Sessions.

Other Useful Pages

To Get Started

  • To see what clusters are active, and for information about each cluster, navigate to the Clusters page.
  • To add a new cluster, click New on the Clusters page; to modify an existing cluster, click the Edit button next to that cluster.

Now follow the instructions under Modifying Cluster Settings for GCP, below.

Modifying Cluster Settings for GCP

Under the Configuration tab, you can add or modify:

  • Cluster Labels: A cluster can have one or more labels, separated by commas. You can make a cluster the default cluster by including the label default.

  • Hive, Spark, or Presto Version: For the engine type you have chosen, choose a supported version from the drop-down list.

  • Node Types: Choose the Google Cloud machine types for your QDS cluster nodes. These are virtual machines (VMs), also known as instances.

    • Coordinator Node Type: You can change the Coordinator node type from the default by selecting a different type from the drop-down list.
    • Worker Node Type: You can change the worker node type from the default by selecting a different type from the drop-down list.
  • Use Multiple Worker Node Types (Hadoop 2 and Spark clusters): See Configuring Heterogeneous Worker Nodes in the QDS UI.

  • Minimum Worker Nodes: Enter the minimum number of worker nodes if you want to change it (the default is 1). See Autoscaling in Qubole Clusters for information about how QDS uses the minimum and maximum node count to keep your cluster running at maximum efficiency.

  • Maximum Worker Nodes: Enter the maximum number of worker nodes if you want to change it (the default is 1).

  • Disk Storage Settings: The next five configuration settings determine whether your cluster uses local SSD storage disks, persistent disks (either standard storage or SSD), or a combination of the two.

    • Local Disk Count (375GB SSD): Choose up to 8 of these per GCP project; the cap exists because GCP limits the total local SSD storage per project. Storage limits may vary between accounts, so check your GCP account details to see how many local SSD disks you can have.
    • Persistent Disk Volume Count: Google Cloud persistent disks provide storage for HDFS and other data created and used by jobs in progress. The default for this field is zero; change it if you want to add storage in addition to the small boot disk Google Cloud provides by default. For more information, see Google Cloud persistent disks.
    • Persistent Disk Type: You can choose SSDs (solid state disks) or standard disks.
    • Persistent Disk Size: Enter the size in gigabytes (GB) of each persistent volume to be added.
    • Enable Persistent Disk Upscaling (Hive and Spark only): Check this box if you are adding persistent disks and want to allow QDS to increase disk storage dynamically if capacity is running low.
  • Node Bootstrap File: You can append the name of a node bootstrap script to the default path.

  • Disable Automatic Cluster Termination: Check this box if you do not want QDS to terminate idle clusters automatically. Qubole recommends that you leave this box unchecked.

  • Idle Cluster Timeout: Optionally specify how long (in hours) QDS should wait before terminating an idle cluster. The default is 2 hours; to change it, enter a number between 0 and 6. This overrides the timeout set at the account level.
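
If you prefer to script these settings, the same fields can be supplied through the QDS cluster REST API instead of the UI. The sketch below is illustrative only: the X-AUTH-TOKEN header and the v2 clusters endpoint are standard QDS conventions, but the payload field names are assumptions that simply mirror the UI fields above, so check the Cluster API reference for the exact schema.

    import requests

    # Use your QDS environment's endpoint (for example, https://us.qubole.com).
    API_URL = "https://us.qubole.com/api/v2/clusters"
    AUTH_TOKEN = "<your-account-API-token>"

    # Field names below mirror the UI fields described above; they are
    # illustrative assumptions, not the verified API schema.
    payload = {
        "cluster_info": {
            "label": ["spark-etl", "default"],      # Cluster Labels
            "min_nodes": 2,                         # Minimum Worker Nodes
            "max_nodes": 10,                        # Maximum Worker Nodes
            "node_bootstrap": "node_bootstrap.sh",  # Node Bootstrap File
            "idle_cluster_timeout": 2,              # hours; overrides account level
        },
        "engine_config": {"type": "spark"},         # Hive, Spark, or Presto
        "cloud_config": {
            "compute_config": {
                "master_instance_type": "n1-standard-4",  # Coordinator Node Type
                "worker_instance_type": "n1-standard-8",  # Worker Node Type
            },
            "storage_config": {
                "local_disk_count": 1,             # Local Disk Count (375GB SSD)
                "persistent_disk_count": 2,        # Persistent Disk Volume Count
                "persistent_disk_type": "pd-ssd",  # Persistent Disk Type
                "persistent_disk_size": 500,       # GB per persistent volume
            },
        },
    }

    resp = requests.post(
        API_URL,
        json=payload,
        headers={"X-AUTH-TOKEN": AUTH_TOKEN, "Accept": "application/json"},
    )
    resp.raise_for_status()
    print(resp.json())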

Note

See Aggressive Downscaling for information about an additional set of capabilities that currently require a Qubole Support ticket.

Under the Composition tab you can add or modify:

  • Coordinator and Minimum Worker Nodes: Choose preemptible or standard (non-preemptible) instances for the core nodes in your QDS cluster.
  • Autoscaling Worker Nodes: Choose preemptible or standard (non-preemptible) instances for the non-core nodes in your QDS cluster. These are nodes that QDS adds or removes depending on the workload.
  • Preemptible Nodes (%): The percentage of non-core nodes that are to be preemptible. The remainder will be standard GCP instances.
  • Fallback to On-demand Nodes: If you check this box, and QDS is unable to upscale your cluster with as many preemptible instances as you have specified, QDS will make up the shortfall with standard (non-preemptible) nodes. See Autoscaling in Qubole Clusters for a full discussion.
  • Use Qubole Placement Policy: If this box is checked, QDS will make a best effort to place one copy of each data block on a stable (non-preemptible) node. Qubole recommends that you leave this box checked.
  • Cool-down Period: Choose how long, in minutes, QDS should wait before terminating an idle node. See Cool-Down Period.
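
Expressed in the same illustrative style as the sketch above (the field names are assumptions that mirror the UI labels, not the verified API schema), a composition that runs half of the autoscaling nodes on preemptible VMs might look like this:

    # Illustrative composition fragment: 50% of the autoscaling nodes are
    # preemptible, with fallback to standard VMs when preemptible capacity
    # is unavailable.
    composition = {
        "master_preemptible": False,          # Coordinator and Minimum Worker Nodes
        "autoscaling_preemptible": True,      # Autoscaling Worker Nodes
        "preemptible_percentage": 50,         # Preemptible Nodes (%)
        "fallback_to_ondemand": True,         # Fallback to On-demand Nodes
        "use_qubole_placement_policy": True,  # one block copy on a stable node
        "cooldown_period": 15,                # Cool-down Period, in minutes
    }

With these settings, an upscaling request for ten nodes would target five preemptible and five standard VMs, and any preemptible shortfall would be made up with standard instances.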

Under the Advanced Configuration tab you can add or modify:

  • Region: Click on the drop-down list to choose the geographical location.
  • Zone: Click on the drop-down list to choose the zone within the geographical location.
  • Network: Choose a GCP network from the drop-down menu or accept the default.
  • Subnetwork: Choose a subnetwork (IPv4 range) from the drop-down menu or accept the default.
  • Coordinator Static IP: Optionally provide a static IP address for the cluster Coordinator node.
  • Bastion Node: Optionally provide the public IP address of a Bastion node to be used for access to private subnets.
  • Custom Tags: You can create tags to be applied to the GCP virtual machines.
  • Override Hadoop Configuration Variables: For a Hadoop (Hive) cluster, enter Hadoop variables here if you want to override the defaults (Recommended Configuration) that Qubole uses. See also Advanced Configuration: Modifying Hadoop Cluster Settings.
  • Fair Scheduler Configuration: For a Hadoop or Spark cluster, enter Hadoop Fair Scheduler values if you want to override the defaults that Qubole uses.
  • Default Fair Scheduler Queue: Specify the default Fair Scheduler queue (used if no queue is specified when the job is submitted).
  • Override Spark Configuration: For a Spark cluster, enter Spark settings here if you want to override the defaults (Recommended Configuration) that Qubole uses. See also Advanced Configuration: Modifying Hadoop Cluster Settings.
  • Python and R Version: For a Spark cluster, choose the versions from the drop-down menu.
  • HIVE SETTINGS: See Configuring a HiveServer2 Cluster.
  • MONITORING: See Advanced Configuration: Modifying Cluster Monitoring Settings.
  • Customer SSH Public Key: The public key from an SSH public-private key pair, used to log in to QDS cluster nodes.

The Qubole Public Key cannot be changed; QDS uses this key to gain access to the cluster and run commands on it.

Note

You can improve the security of a cluster by authorizing Qubole to generate a unique SSH key every time a cluster is started; create a ticket with Qubole Support to enable this capability. A unique SSH key is generated by default in the https://in.qubole.com, https://us.qubole.com, and https://wellness.qubole.com environments.

Once the SSH key is enabled, QDS uses the unique SSH key to interact with the cluster. If you want to use this capability to control QDS access to a Bastion node communicating with a cluster running on a private subnet, Qubole Support provides you with an account-level key that you must authorize on the Bastion node. To get the account-level key, use this API or navigate to the cluster’s Advanced Configuration; the account-level SSH key is displayed in EC2 Settings, as described in modify-ec2-settings.
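
The Advanced Configuration fields can be sketched in the same illustrative style (field names are assumptions that mirror the UI labels). The Hadoop override shown is an example of the key=value lines you would enter in the Override Hadoop Configuration Variables box:

    # Illustrative Advanced Configuration fragment (field names are assumptions).
    advanced = {
        "region": "us-central1",                    # Region
        "zone": "us-central1-a",                    # Zone
        "network": "default",                       # Network
        "subnetwork": "default",                    # Subnetwork
        "bastion_node_public_dns": "203.0.113.20",  # Bastion Node (optional)
        "custom_tags": {"team": "analytics"},       # applied to the GCP VMs
        "hadoop_settings": {
            # One key=value pair per line in Override Hadoop Configuration Variables:
            "custom_hadoop_config": "yarn.nodemanager.resource.memory-mb=12288",
            "default_fairscheduler_queue": "etl",   # Default Fair Scheduler Queue
        },
    }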

When you are satisfied with your changes, click Create or Update.

Advanced Configuration: Modifying Hadoop Cluster Settings

Under HADOOP CLUSTER SETTINGS, you can:

  • Specify the Default Fair Scheduler Queue, which is used if no queue is specified at job submission.
  • Override Hadoop Configuration Variables for the Worker Node Type specified in the Cluster Settings section. The settings shown in the Recommended Configuration field are used unless you override them.
  • Set Fair Scheduler Configuration values to override the default values.

Hadoop-specific Options describes these options in more detail.
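
For illustration, a job can also name a Fair Scheduler queue explicitly at submission time; the default queue applies only when it does not. Below is a minimal sketch using the qds-sdk Python client, assuming a hypothetical queue named etl and a cluster labeled hadoop-etl:

    # Submit a Hive query to a specific Fair Scheduler queue with the
    # qds-sdk Python client. The queue "etl", the cluster label "hadoop-etl",
    # and the table "web_logs" are hypothetical examples.
    from qds_sdk.qubole import Qubole
    from qds_sdk.commands import HiveCommand

    Qubole.configure(api_token="<your-account-API-token>")

    cmd = HiveCommand.run(
        query="SET mapreduce.job.queuename=etl; SELECT COUNT(*) FROM web_logs;",
        label="hadoop-etl",  # cluster label to run the query against
    )
    print(cmd.status)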

Note

Recommissioning can be enabled on Hadoop clusters as an Override Hadoop Configuration Variable. See Enable Recommissioning for more information.

See Enabling Container Packing in Hadoop 2 and Spark for information on how to downscale a Hadoop 2 cluster more effectively.

If the cluster type is Spark, Spark settings are configured under Recommended Spark Configuration, in addition to the Hadoop Cluster Settings, as described in Configuring a Spark Cluster.

Advanced Configuration: Modifying Cluster Monitoring Settings

Under MONITORING:

  • Enable Ganglia Monitoring: Ganglia monitoring is enabled automatically if you use Datadog; otherwise it is disabled by default. For more information on Ganglia monitoring, see Performance Monitoring with Ganglia.

Applying Your Changes

Under each tab on the Clusters page, there is a right pane (Summary) from which you can:

  • Review and edit your changes.
  • Click Create to create a new cluster.
  • Click Update to change an existing cluster’s configuration.

If you are not satisfied with your changes:

  • Click Previous to go back to the previous tab.
  • Click Cancel to leave settings unchanged.