Canonical Hive Commands Report¶
-
GET
/api/v1.2/reports/canonical/_hive/_commands/
¶
This API provides you the canonical Hive commands report in JSON format. Currently, this report is not generated and Qubole intends to provide this report very soon.
Note
The following points are related to a report API:
- If the difference between start date and end date is more than 60 days, then the system defaults to 1 month window from the current day’s date.
- If either start date or end date is not provided, then the system defaults to 1 month window from the current day’s date.
- If you want to get data for a window more than 2 months, then write an email to help@quoble.com.
Required Role¶
The following users can make this API call:
- Users who belong to the system-admin or system-user group.
- Users who belong to a group associated with a role that allows viewing canonical Hive commands reports. See Managing Groups and Managing Roles for more information.
Parameters¶
Parameter | Description |
---|---|
start_date | The date from which you want the report (inclusive). This parameter supports the timestamp in the UTC timezone (YYYY-MM-DDTHH:MM:SSZ) format. The date cannot be earlier than 90 days. |
end_date | The date until which you want the report. The report contains data from this date also. The API default is today or now. This parameter also supports timestamp in the UTC timezone (YYYY-MM-DDTHH:MM:SSZ) format. |
offset | The starting point of the results. The API default is 0. |
limit | The number of results to fetch. The API default is 10. |
sort_column | The column used to sort the report. Since this report returns the top canonical_hive_commands, the sort order is always descending. Valid choices are ‘frequency’, ‘cpu’, ‘fs_bytes_read’, and ‘fs_bytes_written’. The API default is, frequency. |
show_ast | Also return the serialized AST corresponding to the canonical query. (Not returned by default.) |
Response Parameters¶
Parameter | Description |
---|---|
start_date | The actual starting date of the report. |
end_date | The actual ending date of the report. |
sort_column | The sort column used. |
An array of:
canonical_query_id | The ID of the canonical query. |
canonical_query | The AST dump of the canonical query. (This is returned only when the show_ast parameter is passed.) |
recent_example | The most recent example of this type of queries. |
frequency | The number of queries of this type. |
cpu | The total cumulative CPU, (in ms), consumed by these queries. |
fs_bytes_read | The total bytes read by these queries. |
fs_bytes_written | The total bytes written by these queries. |
Examples¶
Without any parameters¶
curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" \
"https://gcp.qubole.com/api/v1.2/reports/canonical_hive_commands"
Sample Response
{
"sort_column": "frequency",
"canonical_queries": [
{
"canonical_query_id": "af09cd5799e52f450a87e236f453b864833afac97603409a17f3df4d010b1814",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 8800,
"cpu": 0,
"recent_example": "alter table demo_memetracker recover partitions"
},
{
"canonical_query_id": "9548ac7ec7defe3c3251da2544ec545c9bd578cb308c6c3c1936e48df0bdfdb4",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 4547,
"cpu": 0,
"recent_example": "alter table daily_tick_data recover partitions"
},
{
"canonical_query_id": "69dae07fc876927b9daba6279c962bc343131c08d8f9f98adfa0c05ef90b40a4",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 768,
"cpu": 0,
"recent_example": "show tables"
},
{
"canonical_query_id": "89720fe23d2a85ac217a3b230e992c45dd523b65e6d45863cc410f4b5e4795ea",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 53,
"cpu": 0,
"recent_example": "select * from `default`.`memetracker` limit 400"
},
{
"canonical_query_id": "04bccd848172842c8fadd687aef72ac2161f72895dfd3c1d3c31a96411d34095",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 49,
"cpu": 0,
"recent_example": "select * from `default`.`30days_test` limit 1 "
},
{
"canonical_query_id": "9996d665cf077f60b1ee87d3b5b80cd65ce078f77935096441433628909b9ddb",
"fs_bytes_written": 9,
"fs_bytes_read": 28482500000,
"frequency": 48,
"cpu": 0,
"recent_example": "select count(*) from memetracker"
},
{
"canonical_query_id": "492ea35ff3d58d0d07e70bcc68ea33eadb7bb572f6fdf14f7220931cc94b1abc",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 37,
"cpu": 0,
"recent_example": "select * from `default`.`default_qubole_airline_origin_destination` limit 1000"
},
{
"canonical_query_id": "5e2bb3326c4cf2c5b669d60729c5697fb6fdef4f1e09be41e37f424eb96b0c74",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 30,
"cpu": 0,
"recent_example": "select * from default_qubole_memetracker limit 10;"
},
{
"canonical_query_id": "aa1c7a294f6e18feec68175a643814f06e180f4ac5e62eb6b556d9bf72830bc2",
"fs_bytes_written": 25326,
"fs_bytes_read": 85944,
"frequency": 22,
"cpu": 0,
"recent_example": "select * from test_csv limit 5;"
},
{
"canonical_query_id": "2145af0ee70e1cd93c9901cd41dee8285faacaefe86e4a4f22880316cc4e63c3",
"fs_bytes_written": 21050200,
"fs_bytes_read": 321138000,
"frequency": 21,
"cpu": 0,
"recent_example": "select * from demo_memetracker limit 100"
}
]
}
With a different sort column and limit and show_ast=true¶
curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" \ "https://gcp.qubole.com/api/v1.2/reports/canonical_hive_commands?sort_column=cpu&limit=2&show_ast=true"
Sample Response
{
"sort_column": "cpu",
"canonical_queries": [
{
"canonical_query_id": "d9635e3ad5501c9ad47bb728c35b63e1b41f8c2ba0fb4f7533b9ab701ce503c4",
"canonical_query": "(null(TOK_QUERY(TOK_FROM(TOK_TABREF(TOK_TABNAME(lineitem))))(TOK_INSERT(TOK_DESTINATION(TOK_DIR(TOK_TMP_FILE)))(TOK_SELECT(TOK_SELEXPR(TOK_FUNCTIONSTAR(count))))))())",
"fs_bytes_written": 18,
"fs_bytes_read": 7726360000,
"frequency": 2,
"cpu": 423130,
"recent_example": "set fs.s3.inputpathprocessor=false;\nselect count(*) from lineitem;"
},
{
"canonical_query_id": "ea1a37cf6b4694293d15cadfaf4bbae2459f12475cd86ea90e6c4f8e31945bda",
"canonical_query": "(null(TOK_QUERY(TOK_FROM(TOK_TABREF(TOK_TABNAME(default_qubole_memetracker))))(TOK_INSERT(TOK_DESTINATION(TOK_DIR(TOK_TMP_FILE)))(TOK_SELECT(TOK_SELEXPR(TOK_FUNCTIONSTAR(count))))(TOK_WHERE(=(TOK_TABLE_OR_COL(month))(LITERAL)))))())",
"fs_bytes_written": 108,
"fs_bytes_read": 54673600000,
"frequency": 20,
"cpu": 416200,
"recent_example": "SELECT count(*) FROM default_qubole_memetracker where month = '2008-08';"
}
]
}
For a specific time period¶
curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" \ "https://gcp.qubole.com/api/v1.2/reports/canonical_hive_commands?start_date=2014-04-01&end_date=2014-04-21&sort_column=fs_bytes_read&limit=2"
Sample Response
{
"canonical_queries": [
{
"canonical_query_id": "55ebb0cc47e0dc74c70245b026126bba191969dee1dc380a6f98698e6b194085",
"cpu": 75720,
"frequency": 1,
"recent_example": "select dt, count(*) from junk_temp \ngroup by dt order by dt\n\n\n",
"fs_bytes_read": 308582016,
"fs_bytes_written": 1514
},
{
"canonical_query_id": "1c421d2e65407650650cbc2ee80f9a59863875f52d1a9ddd5a051118678a3a6c",
"cpu": 34980,
"frequency": 1,
"recent_example": "select created_at from junk_temp \nwhere dt=2014-01-03 limit 10\n\n",
"fs_bytes_read": 308582016,
"fs_bytes_written": 0
}
],
"start_date": "2014-03-31T10:00:00Z",
"end_date": "2014-04-21T20:00:00Z",
"sort_column": "fs_bytes_read"
}
To learn more about canonicalization of hive queries, see the Blog post.
Caution
Qubole started collecting the CPU metrics only from the last week of December, 2013. So, if you have some queries before that, the CPU metrics is considered as 0.