Globally Rolled Out Features/Enhancements
Qubole enables certain features/enhancements as part of its gradual rollout program to different pods over a period of time. After Qubole rolls out such features/enhancements, it globally enables them on the Qubole platform.
The following table provides the list of features that are globally rolled out for you to use.
Feature/Enhancement | Feature Description | Qubole Component | Supported Cloud Provider | QDS Release Version |
---|---|---|---|---|
Auto-population of instances similar to primary worker node type | Enhancement to the heterogeneous cluster configuration UI: it suggests instances similar to the chosen worker node type but from different generations, instead of suggesting the double-weight instance of the same generation. | Cluster Management | AWS | R58 Quick Fix |
Optimized version of the Beeline script | It is an optimization that reduces the latency of HiveServer2 queries. | Hive | AWS | R58 |
Hive MapJoin Counters Computation | Adds counters that compute the number of Hive joins converted to MapJoin once a query completes. The results are visible in the Analyze/Workbench logs. | Hive | AWS | R58 |
Use of hive-exec Libraries in Tez from the local disk | It is a Tez optimization that allows using the hive-exec jar, which is locally available on cluster nodes. This reduces the localization overhead and increases efficiency by avoiding additional HDFS operations. | Hive Tez | AWS, Azure, GCP, OPC, and Oracle | R58 |
HiveServer2 cluster with private IP address | HiveServer2 clusters use private IP addresses for inter-process communication. | Hive | AWS | R58 |
Cleanup of the Partial Data upon a Hive Query Failure | If a Hive query fails, Qubole cleans up the partial data written by the mappers/reducers that had already completed. | Hive | AWS | R57 Quick Fix |
Hive Metastore Server with Java 8 | Use Java 8 along with G1GC (garbage collector) for the thrift Hive Metastore Server (HMS) JVM. To use this feature, remove any bootstrap code related to Java 8 for HMS. There is no need to restart the HMS JVM for Java 8 to take effect. | Hive | AWS | R57 |
Spot Node Loss and Spot Blocks using Graceful Decommissioning | Spark applications handle Spot node loss and Spot blocks using the YARN Graceful-Decommission status. This is supported on Spark 2.4.0 and later versions. | Spark | AWS, GCP | R57 |
Private IP usage | Private IP addresses are used for all nodes in Spark. As a result, the executor logs are accessible. | Spark | AWS | R56 Quick Fix |
Direct Writes for Dynamic partition overwrite in Datasource flow | Supports direct writes to improve performance for data source tables when the OSS flag spark.sql.sources.partitionOverwriteMode is set to dynamic. It is supported on Spark 2.4 and later versions. | Spark | AWS | R57 |
Distributed Writes for better performance | Users can run SQL commands with large result sizes using Spark. It is supported on Spark 2.4 and later versions. | Spark | AWS | R57 |
Improved Container Packing for efficient cluster utilization | Spark on Qubole improves container packing by restarting idle executors, which allows YARN to move the restarted executors to fewer nodes. | Spark | AWS | R56 |
Direct Writes for Insert Overwrite with dynamic partitions queries | Supports direct writes to improve performance for Insert Overwrite queries with dynamic partitions. It is supported on Spark 2.2 and later versions. | Spark | AWS | R56 |
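
The dynamic partition overwrite row above depends on the open-source Spark setting spark.sql.sources.partitionOverwriteMode being set to dynamic. As a minimal sketch of how a job might set that flag and issue an overwrite, the following assumes an already created SparkSession is passed in. The table names (sales, sales_staging), the partition column (dt), and the helper function names are hypothetical and not part of the Qubole documentation. It also assumes the source table's columns match the target's order, with the partition column last.

```python
# Sketch only: table names, column names, and helper names are hypothetical.

DYNAMIC_OVERWRITE_CONF = {
    # Open-source Spark setting named in the table above; Spark's default is "static".
    "spark.sql.sources.partitionOverwriteMode": "dynamic",
}


def build_insert_overwrite(target_table, source_table, partition_col):
    """Return an INSERT OVERWRITE statement that leaves the partition value dynamic."""
    return (
        f"INSERT OVERWRITE TABLE {target_table} "
        f"PARTITION ({partition_col}) "
        f"SELECT * FROM {source_table}"
    )


def overwrite_partitions(spark, target_table, source_table, partition_col):
    """Apply the setting on an existing SparkSession, then run the overwrite."""
    for key, value in DYNAMIC_OVERWRITE_CONF.items():
        spark.conf.set(key, value)
    return spark.sql(
        build_insert_overwrite(target_table, source_table, partition_col)
    )


# Building the statement needs no cluster; running it needs a SparkSession.
statement = build_insert_overwrite("sales", "sales_staging", "dt")
print(statement)
```

In dynamic mode, Spark replaces only the partitions that receive new rows, rather than clearing every partition of the target table first. Whether a given query actually takes the direct-write path depends on the Qubole Spark version and cluster configuration.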