Presto¶
The new features and key enhancements are:
- Presto 317 is Generally Available
- BigQuery Connector for Presto
- Dynamic Concurrency and Hybrid Autoscaling
- Enhancements in Presto for JDBC and ODBC Drivers
- Improvements in Dynamic Filtering
- Improvements in Reading Hive ACID Tables
- Changes in Datadog Alerts
- Enforcing Group Quotas in Resource Group-based Dynamic Cluster Sizing
Other enhancements and bug fixes are listed in the Enhancements and Bug Fixes sections.
Presto 317 is Generally Available¶
PRES-3429: Presto version 317 is generally available. Cluster Restart Required
BigQuery Connector for Presto¶
PRES-3153: The BigQuery connector is now available in Presto version 317.
Dynamic Concurrency and Hybrid Autoscaling¶
PRES-3373: Changes to workload-aware autoscaling include dynamic concurrency and queue-aware autoscaling in conjunction with CPU-based autoscaling. Automated workload management and related changes improve performance, reliability, and TCO. Gradual Rollout | Cluster Restart Required
Enhancements in Presto for JDBC and ODBC Drivers¶
Enhancements in Presto for the next-generation (v3) JDBC and ODBC drivers are designed to make these drivers as fast as the open-source drivers and to:
- Support cluster lifecycle management (auto start cluster when a query is submitted and auto terminate idle clusters)
- Provide query history available in Analyze and Workbench UI
- Provide enhanced security (HTTPS) and user authentication (through an API token)
Improvements in Dynamic Filtering¶
PRES-3288: Dynamic filtering (DF) improvements include the following:
- PRES-3002: A new configuration property, hive.max-execution-partitions-per-scan, limits the maximum number of partitions that a table scan is allowed to read during query execution. Disabled | Cluster Restart Required
- PRES-3148: Extends DF optimization to semi-joins to take advantage of a selective build side in queries with the IN clause.
- PRES-3149: Pushes dynamic filters down to ORC and Parquet readers to reduce data scanned on the probe side for partitioned as well as non-partitioned tables. Cluster Restart Required
- PRES-3404: Improves utilization of dynamic filters on worker nodes and reduces the load on the coordinator when dynamic filtering is enabled.
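The interaction between a dynamic filter built from an IN subquery and a per-scan partition cap can be sketched in a few lines of Python. This is only an illustration of the idea, not Presto's implementation: the function names, the `TooManyPartitionsError` class, the table data, and the assumption that a scan fails once the cap is exceeded are all hypothetical.

```python
from typing import Dict, Iterable, List, Set


class TooManyPartitionsError(Exception):
    """Hypothetical error for a scan that would read too many partitions."""


def build_dynamic_filter(build_rows: Iterable[dict], key: str) -> Set[str]:
    """Collect the distinct key values produced by the (selective) build side."""
    return {row[key] for row in build_rows}


def scan_with_dynamic_filter(
    partitions: Dict[str, List[dict]],
    dynamic_filter: Set[str],
    max_partitions_per_scan: int,
) -> List[dict]:
    """Read only partitions whose key passes the dynamic filter.

    Assumption: the scan is rejected when the surviving partition count
    exceeds the cap, mimicking a limit such as
    hive.max-execution-partitions-per-scan.
    """
    selected = {name: rows for name, rows in partitions.items()
                if name in dynamic_filter}
    if len(selected) > max_partitions_per_scan:
        raise TooManyPartitionsError(
            f"scan would read {len(selected)} partitions, "
            f"limit is {max_partitions_per_scan}"
        )
    return [row for rows in selected.values() for row in rows]


# Models: SELECT * FROM orders WHERE ds IN (SELECT ds FROM recent_days)
recent_days = [{"ds": "2020-01-02"}, {"ds": "2020-01-03"}]
orders = {
    "2020-01-01": [{"id": 1}],
    "2020-01-02": [{"id": 2}, {"id": 3}],
    "2020-01-03": [{"id": 4}],
}

dynamic_filter = build_dynamic_filter(recent_days, "ds")
rows = scan_with_dynamic_filter(orders, dynamic_filter, max_partitions_per_scan=2)
```

Here the partition for 2020-01-01 is never read because its key is absent from the build side, which is the saving that dynamic filtering aims for on the probe side.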
Improvements in Reading Hive ACID Tables¶
- PRES-2840: Because Hive 2.0-versioned ACID transactional tables are not supported in Presto 317, QDS has added checks to fail queries using such tables.
- PRES-3320: QDS has added checks to fail Presto queries on Hive ACID tables when the Hive metastore server’s version is older than 3.0.
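A version gate of the kind described in PRES-3320 might look like the following Python sketch. The function names and the exception class are hypothetical, and the parsing assumes a dotted version string whose first two components are integers.

```python
from typing import Tuple

MIN_METASTORE_VERSION = (3, 0)  # ACID reads require Hive metastore 3.0 or later


class UnsupportedAcidTableError(Exception):
    """Hypothetical error raised when an ACID table cannot be read safely."""


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Parse 'major.minor[.patch...]' into a comparable (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def check_acid_read_allowed(metastore_version: str) -> None:
    """Fail fast when the metastore is older than the minimum version."""
    if parse_major_minor(metastore_version) < MIN_METASTORE_VERSION:
        raise UnsupportedAcidTableError(
            f"Hive metastore {metastore_version} is older than 3.0"
        )


check_acid_read_allowed("3.1.2")  # passes silently
```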
Changes in Datadog Alerts¶
Qubole has made these changes to Datadog alerts:
- PRES-3360: Adds a Datadog alert to detect runaway splits occupying execution slots for more than 10 minutes, removes the presto.jmx.qubole.request_failures metric from the default Datadog dashboard, and removes the Datadog alert for CPU utilization over 80%.
- PRES-3468: Adds a Datadog alert to detect if the Coordinator Average Heap Memory Usage is more than 90%.
- PRES-3508: Adds a Datadog alert to detect if the coordinator’s Presto server open file descriptor has exceeded its limit.
Enforcing Group Quotas in Resource Group-based Dynamic Cluster Sizing¶
PRES-3194: In resource group-based dynamic cluster sizing, QDS now enforces individual resource group quotas for CPU resources even when the cluster autoscales to the union of two resource group quotas.
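As a rough illustration, the Python sketch below treats the "union" of two quotas as their sum and caps each group at its own quota even though the cluster as a whole has spare capacity. The group names, the CPU units, and the `allocate_cpu` helper are hypothetical and are not part of QDS.

```python
from typing import Dict


def allocate_cpu(quotas: Dict[str, int], demand: Dict[str, int]) -> Dict[str, int]:
    """Grant each group its demand, capped at that group's own quota."""
    return {group: min(demand.get(group, 0), quota)
            for group, quota in quotas.items()}


quotas = {"etl": 40, "adhoc": 20}        # per-group CPU quotas (hypothetical units)
cluster_capacity = sum(quotas.values())  # cluster sized to the combined quotas

demand = {"etl": 70, "adhoc": 5}         # etl asks for more than its quota
granted = allocate_cpu(quotas, demand)
```

In this example the etl group stays at 40 units although 15 units of the adhoc quota are idle, so one group cannot consume capacity that was provisioned for another.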
Enhancements¶
- PRES-3257: Presto now supports removing unhealthy nodes on the basis of disk usage. The coordinator node periodically monitors disk usage on worker nodes and gracefully shuts down nodes that have exceeded a threshold that defaults to 0.9. You can change the threshold value by means of the ascm.bad-node-removal.disk-usage-max-threshold parameter; the supported range is 0.0 - 1.0. Beta | Cluster Restart Required
- PRES-3273: Improvements in Presto Ranger integration:
  - Support for column masking HASH for Ranger.
  - Support for the Solr audit store. You can enable auditing in the ranger.<catalog>.audit-config-xml as described in ranger-plugin-config. Disabled
- PRES-3353: QueryHistID is now returned as part of the error message for queries executed through cloud-agnostic drivers if show_on_ui is set to true for these drivers. QueryHistID is useful in debugging. Qubole plans to provide cloud-agnostic drivers shortly.
- PRES-3469: Backports open-source fixes to improve the performance of inequality JOINs that involve BETWEEN and GROUP BY queries.
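The disk-usage check described in PRES-3257 can be approximated with the Python sketch below. It assumes disk usage is reported as a fraction between 0.0 and 1.0 and that a node is selected only when its usage is strictly above the threshold. The function names and worker names are hypothetical; only the 0.9 default and the 0.0 - 1.0 range come from the note above.

```python
from typing import Dict, List

DEFAULT_DISK_USAGE_MAX_THRESHOLD = 0.9  # default stated in PRES-3257


def validate_threshold(value: float) -> float:
    """Reject thresholds outside the supported 0.0 - 1.0 range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"threshold {value} is outside the range 0.0 - 1.0")
    return value


def nodes_to_remove(
    disk_usage: Dict[str, float],
    threshold: float = DEFAULT_DISK_USAGE_MAX_THRESHOLD,
) -> List[str]:
    """Return the workers whose disk usage exceeds the threshold."""
    threshold = validate_threshold(threshold)
    return sorted(node for node, usage in disk_usage.items()
                  if usage > threshold)


usage_by_worker = {"worker-1": 0.95, "worker-2": 0.50, "worker-3": 0.91}
unhealthy = nodes_to_remove(usage_by_worker)
```

A coordinator-side loop would call a check like this periodically and gracefully shut down the returned nodes; that scheduling is left out of the sketch.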
Bug Fixes¶
- PRES-1799: Presto now returns the number of files written during an INSERT OVERWRITE DIRECTORY (IOD) query in QueryInfo. The Presto client in the QDS Control Plane waits for this information to display the returned number of files at the IOD location. This fixes eventual consistency issues in reading query results through the QDS UI.
- PRES-3411: Qubole has fixed the UnsupportedOperationException that occurred in certain multi-join queries with dynamic filtering enabled.
- PRES-3544: Fixes a problem that caused dynamic filtering not to work on SSL-enabled clusters.