Introduction

RubiX is a lightweight data caching framework that can be used by a Big Data system that uses the Hadoop filesystem interface. It is designed to work with cloud storage systems such as AWS S3 and Azure Blob Storage, and it can be extended through plugins to support any engine that accesses data in cloud object storage through the Hadoop filesystem interface.

Note

RubiX is supported only on Presto clusters that use GCP.

For more information, see these blogs:

Open-Source RubiX

Qubole has open-sourced RubiX. The documentation is here:

Using RubiX in the QDS UI

For instructions on configuring and using RubiX via the QDS UI, see: