Globally Rolled Out Features/Enhancements

Qubole introduces certain features/enhancements through its gradual rollout program, enabling them on different pods over a period of time. Once such features/enhancements have been rolled out, Qubole enables them globally on the Qubole platform.

The following table lists the features that have been globally rolled out and are available for you to use.

| Feature/Enhancement | Feature Description | Qubole Component | Supported Cloud Provider | QDS Release Version |
|---|---|---|---|---|
| Auto-population of instances similar to primary worker node type | Enhancement in the heterogeneous cluster configuration UI that suggests instances similar to the chosen worker node type but from different generations, instead of suggesting the double-weight instance of the same generation. | Cluster Management | AWS | R58 Quick Fix |
| Optimized version of the Beeline script | An optimization that reduces the latency of HiveServer2 queries. | Hive | AWS | R58 |
| Hive MapJoin Counters Computation | Support for counters that compute the number of Hive joins converted to MapJoin. The counters are computed after a query completes, and the results are visible in the Analyze/Workbench logs. | Hive | AWS | R58 |
| Use of hive-exec Libraries in Tez from the local disk | A Tez optimization that uses the hive-exec jar that is locally available on cluster nodes. This reduces the localization overhead and increases efficiency by avoiding additional HDFS operations. | Hive Tez | AWS, Azure, GCP, OPC, and Oracle | R58 |
| HiveServer2 cluster with private IP address | HiveServer2 clusters use private IP addresses for inter-process communication. | Hive | AWS | R58 |
| Cleanup of the Partial Data upon a Hive Query Failure | If a Hive query fails, Qubole cleans up the partial data written by completed mappers/reducers. | Hive | AWS | R57 Quick Fix |
| Hive Metastore Server with Java 8 | Uses Java 8 with G1GC (garbage collector) for the Thrift Hive Metastore Server (HMS) JVM. To use this feature, remove any bootstrap code related to Java 8 for HMS. There is no need to restart the HMS JVM for Java 8 to take effect. | Hive | AWS | R57 |
| Spot Node Loss and Spot Blocks using graceful Decommissioning | Spark applications handle Spot node loss and Spot Blocks using the YARN Graceful-Decommission status. This is supported on Spark 2.4.0 and later. | Spark | AWS, GCP | R57 |
| Private IP usage | Private IP addresses are used for all nodes in Spark. As a result, the executor logs are accessible. | Spark | AWS | R56 Quick Fix |
| Direct Writes for Dynamic partition overwrite in Datasource flow | Support for direct writes to improve performance for data source tables when the OSS flag spark.sql.sources.partitionOverwriteMode is set to dynamic. It is supported on Spark 2.4 and later. | Spark | AWS | R57 |
| Distributed Writes for better performance | Users can run SQL commands with large result sizes using Spark. It is supported on Spark 2.4 and later. | Spark | AWS | R57 |
| Improved Container Packing for efficient cluster utilization | Spark on Qubole improves container packing by restarting idle executors, which allows YARN to move the restarted executors to fewer nodes. | Spark | AWS | R56 |
| Direct Writes for Insert Overwrite with dynamic partitions queries | Support for direct writes to improve the performance of Insert Overwrite queries with dynamic partitions. It is supported on Spark 2.2 and later. | Spark | AWS | R56 |
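
Two of the Spark entries above depend on dynamic partition overwrite, which open-source Spark controls through spark.sql.sources.partitionOverwriteMode (the default value is static). The following is a minimal pure-Python sketch of the difference between the two modes, not Spark or Qubole code. The names `Table` and `insert_overwrite` are hypothetical, and the model assumes an overwrite whose partition spec contains no static partition values.

```python
# Illustrative model of partition overwrite semantics; this is not Spark code.
from typing import Dict, List

# Hypothetical representation: partition value -> rows stored in that partition.
Table = Dict[str, List[str]]


def insert_overwrite(existing: Table, incoming: Table, mode: str = "static") -> Table:
    """Return the table contents after an INSERT OVERWRITE of `incoming`.

    mode="static":  all existing partitions are dropped before the write
                    (assumes no static partition values in the spec).
    mode="dynamic": only partitions that receive new rows are replaced.
    """
    if mode not in ("static", "dynamic"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Static mode starts from an empty table; dynamic mode keeps untouched partitions.
    result: Table = {} if mode == "static" else {k: list(v) for k, v in existing.items()}
    for partition, rows in incoming.items():
        result[partition] = list(rows)
    return result


existing = {"2020-01-01": ["a", "b"], "2020-01-02": ["c"]}
incoming = {"2020-01-02": ["d"]}

static_result = insert_overwrite(existing, incoming, mode="static")
dynamic_result = insert_overwrite(existing, incoming, mode="dynamic")

print(static_result)
print(dynamic_result)
```

In this model, the static result keeps only the newly written partition, while the dynamic result also keeps the untouched 2020-01-01 partition. On a real cluster, the mode would be selected by setting spark.sql.sources.partitionOverwriteMode to dynamic in the Spark configuration.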