Release Highlights
With R56, Qubole includes support for Google Cloud Platform (GCP).
The following are highlights of Qubole’s R56 release:
Analytics experience
- Added an error logs API for Presto commands.
GET /api/v1.2/commands/<Command-ID>/error_logs
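As a minimal sketch of calling the error logs endpoint above, the following Python builds the request without sending it. The base URL (`https://gcp.qubole.com/api/v1.2`) and the `X-AUTH-TOKEN` header are assumptions that are not stated in these notes, and `build_error_logs_request` is a hypothetical helper name; check the values for your own account.

```python
import urllib.request

# Assumption: the API base URL for your Qubole environment.
BASE_URL = "https://gcp.qubole.com/api/v1.2"


def build_error_logs_request(command_id, auth_token, base_url=BASE_URL):
    """Build (but do not send) a GET request for a command's error logs."""
    # Path taken from the release note: /commands/<Command-ID>/error_logs
    url = f"{base_url}/commands/{command_id}/error_logs"
    # Assumption: the API token is passed in an X-AUTH-TOKEN header.
    headers = {"X-AUTH-TOKEN": auth_token}
    return urllib.request.Request(url, headers=headers, method="GET")
```

The returned object can be passed to `urllib.request.urlopen` to send the request. The response format is not described in these notes, so inspect the body before parsing it.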
Data engineering
Airflow
- The latest Airflow version, 1.10, is now fully supported with Python 3.5 package management.
Custom Metastore
- You can now connect to any remote metastore server using the Thrift URI.
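A metastore Thrift URI conventionally has the form `thrift://<host>:<port>`. The sketch below validates such a URI before it is used in a configuration. The helper name `parse_thrift_uri` is hypothetical, and falling back to port 9083 (the usual Hive metastore default) is an assumption rather than documented Qubole behavior.

```python
from urllib.parse import urlparse

# Assumption: 9083 is the conventional Hive metastore Thrift port.
DEFAULT_METASTORE_PORT = 9083


def parse_thrift_uri(uri):
    """Return (host, port) for a thrift:// URI, or raise ValueError."""
    parsed = urlparse(uri)
    if parsed.scheme != "thrift":
        raise ValueError(f"expected a thrift:// URI, got: {uri!r}")
    if not parsed.hostname:
        raise ValueError(f"missing host in URI: {uri!r}")
    # parsed.port is None when the URI does not specify a port.
    port = parsed.port if parsed.port is not None else DEFAULT_METASTORE_PORT
    return parsed.hostname, port
```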
Support for multiple notification channels in Scheduler
- You can now configure and send alerts to multiple endpoints, such as Slack, PagerDuty, email, and other webhooks.
Data science
- Improved notebook usability for paragraphs, including active indicators and a compact size that fits the query.
Administration and TCO
- Added the user email to the commands API response so that admins can check usage easily.
GET /api/v1.2/commands/
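To illustrate how the added email could support usage checks, the sketch below counts commands per user from an already decoded JSON response. The response shape is an assumption: the `commands` list key and the `user_email` field name are hypothetical and should be checked against an actual response.

```python
from collections import Counter


def commands_per_user(response, email_field="user_email"):
    """Count commands per user email in a decoded commands API response.

    Assumptions: `response` is a dict with a "commands" list, and each
    command carries the email under `email_field` (hypothetical name).
    """
    counts = Counter()
    for command in response.get("commands", []):
        email = command.get(email_field)
        if email:  # skip commands that lack the field
            counts[email] += 1
    return counts
```

For example, `commands_per_user(response).most_common(5)` would list the five most active users in the returned page of commands.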
Engines
Spark
- Enhanced Broadcast Joins: Introduces executor-based broadcast, in which the values to be broadcast are not collected on the driver. Driver memory is therefore no longer a bottleneck for broadcast, so users with less driver memory can still use it.
- Cost Based Optimizer (CBO) Support: CBO, which uses table statistics to optimize queries for performance, is now enabled by default in Spark version 2.4.0.
- Hint-based Skew Join Support: Users can now specify hints for skewed columns and values for a join. Spark automatically allocates more resources to the skewed value based on the hint.
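Because the CBO item above relies on table statistics, those statistics need to exist before the optimizer can use them. The sketch below generates the standard Spark SQL `ANALYZE TABLE` statements for a table. The helper name `analyze_statements` and the table and column names in the usage note are hypothetical. In open source Spark the optimizer is controlled by `spark.sql.cbo.enabled`; that Qubole uses the same setting is an assumption.

```python
def analyze_statements(table, columns=()):
    """Return Spark SQL statements that collect statistics for `table`.

    The first statement gathers table-level statistics. If `columns` is
    given, a second statement gathers column-level statistics for them.
    """
    statements = [f"ANALYZE TABLE {table} COMPUTE STATISTICS"]
    if columns:
        column_list = ", ".join(columns)
        statements.append(
            f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR COLUMNS {column_list}"
        )
    return statements
```

With an active `SparkSession` named `spark`, each statement could be run as `spark.sql(stmt)`, for example for `analyze_statements("sales", ["region", "order_date"])`.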
Spark Structured Streaming
- Streaming State Store: Spark Structured Streaming has new state storage management based on RocksDB. It is designed to improve scalability and latency in stateful processing, such as stream joins and deduplication.
Hive
- Added support for MySQL 8.x as the Hive Metastore.