Presto as a Service
Qubole provides Presto as a service for fast, inexpensive, and scalable data processing.
Note: For the latest information on QDS support for Presto, see QDS Components: Supported Versions and Cloud Platforms.
Supported Data Formats
Presto supports the following data formats:
- Hive tables in the Cloud and HDFS.
- Delimited, CSV, RCFile, JSON, SequenceFile, ORC, Avro, and Parquet files. You can add support for other file formats by adding the relevant JARs to Presto through the Presto Server Bootstrap.
- Data compressed using GZIP.
Advantages of QDS Presto Clusters
- You can optimize your clusters by choosing the instance type most suitable to your workload.
- You can launch clusters in any region or location.
- QDS provides Cloud-specific optimizations.
- By default, QDS automatically terminates idle clusters to save cost.
- QDS starts a cluster only when necessary, that is, when a query is run and no Presto cluster is running. Otherwise, QDS reuses a cluster that is already running.
- Autoscaling continuously adjusts the cluster size to the Presto workload.
- You can configure the amount of cluster memory allocated for Presto.
A Better User Experience
- Multiple QDS users can submit queries to the same Presto cluster.
- Query logs and results are always available from the History tab on the Workbench page of the QDS UI.
- QDS provides detailed execution metrics for each Presto query.
- Users can create workflows that combine Hadoop jobs, Hive queries, and Presto queries.
Security
QDS can provide table-level security for Hive tables accessed via Presto. To enable it, set hive.security to sql-standard in catalog/hive.properties. See Understanding Qubole Hive Authorization for more information.
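
As a rough illustration of the setting described above, the following Python sketch renders the property as text. The property name, its value, and the file name come from this page. The helper render_override is hypothetical, and the layout it produces (a file name followed by a colon, then key=value lines) is an assumption about how per-file overrides are entered, so confirm it against your cluster's configuration settings before relying on it.

```python
def render_override(section: str, properties: dict) -> str:
    """Render one override section as text.

    Hypothetical helper: emits the target file name followed by a colon,
    then one key=value line per property. The layout is an assumption,
    not a documented QDS format.
    """
    lines = [f"{section}:"]
    lines.extend(f"{key}={value}" for key, value in properties.items())
    return "\n".join(lines)


# Property name, value, and file name are taken from the Security section.
hive_security_override = render_override(
    "catalog/hive.properties",
    {"hive.security": "sql-standard"},
)

print(hive_security_override)
```

Generating the text this way can be convenient when the same override has to be applied to several clusters, but pasting the two resulting lines by hand is equivalent.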