Schedule a Jupyter Notebook
POST /api/v1.2/scheduler
Use this API to schedule a Jupyter notebook. You can view a command’s status or result, or cancel a command, using the same Command APIs that are used for other command types.
Note
This API is not available by default. Create a ticket with Qubole Support to enable this API on your QDS account.
Required Role
The following users can make this API call:
- Users who belong to the system-user or system-admin group.
- Users who belong to a group associated with a role that allows update on Jupyter Notebook and directory. See Managing Groups and Managing Roles for more information.
- Users who belong to a group associated with a role that allows create on the Jupyter Notebook command.
Parameters
Note
Parameters marked in bold below are mandatory. Others are optional and have default values.
Parameter | Description |
---|---|
name | Name for the schedule. If name is not specified, then a system-generated Schedule ID is set as the name. |
label | Label of the cluster on which the Jupyter notebook should be scheduled. |
command_type | Type of command to be executed. For a Jupyter notebook, the command type is JupyterNotebookCommand. |
command | JSON object that contains: path (path of the Jupyter notebook to run, including the notebook name and its extension); retry (optional), the number of retries for a job, with valid values 1, 2, and 3; retry_delay (optional), the time interval (in minutes) between retries when a job fails; and arguments (optional), valid JSON to be sent to the notebook. Define the parameters in the notebook and pass their values in JSON format, where each key is a parameter’s name and each value is that parameter’s value. Supported parameter types are string, integer, float, and boolean. |
start_time | Start datetime for the schedule. When a Cron expression is used, start_time is not honored: the scheduler calculates the Next Materialized Time (NMT), which serves as the start time, from the current time and the Cron expression passed. |
end_time | End datetime for the schedule. |
frequency | Set this option or cron_expression, but do not set both. Specify how often the schedule should run. Input is an integer. For example, a frequency of one hour/day/month is represented as {"frequency":"1"}. |
time_unit | Denotes the time unit for the frequency. Its default value is days. Accepted values are minutes, hours, days, weeks, and months. |
For more information about the schedule parameters, see scheduler-api.
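The constraints in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name build_schedule_payload and its keyword arguments are hypothetical, and it assumes that exactly one of frequency or cron_expression is supplied and that times are passed as preformatted strings.

```python
ALLOWED_TIME_UNITS = {"minutes", "hours", "days", "weeks", "months"}
ALLOWED_RETRIES = {1, 2, 3}


def build_schedule_payload(path, label, start_time, end_time,
                           frequency=None, cron_expression=None,
                           time_unit="days", retry=None, retry_delay=None,
                           arguments=None, name=None):
    """Assemble a scheduler request body (hypothetical helper)."""
    # frequency and cron_expression must not both be set.
    if (frequency is None) == (cron_expression is None):
        raise ValueError("set exactly one of frequency or cron_expression")
    if retry is not None and retry not in ALLOWED_RETRIES:
        raise ValueError("retry must be 1, 2, or 3")

    command = {"path": path}
    if retry is not None:
        command["retry"] = retry
    if retry_delay is not None:
        command["retry_delay"] = retry_delay  # minutes between retries
    if arguments is not None:
        for key, value in arguments.items():
            # Supported argument types: string, integer, float, boolean.
            if not isinstance(value, (str, int, float, bool)):
                raise TypeError(f"unsupported type for argument {key!r}")
        command["arguments"] = arguments

    payload = {
        "command_type": "JupyterNotebookCommand",
        "command": command,
        "start_time": start_time,
        "end_time": end_time,
        "label": label,
    }
    if name is not None:
        payload["name"] = name
    if frequency is not None:
        if time_unit not in ALLOWED_TIME_UNITS:
            raise ValueError("unsupported time_unit: %r" % (time_unit,))
        payload["frequency"] = int(frequency)
        payload["time_unit"] = time_unit
    else:
        payload["cron_expression"] = cron_expression
    return payload
```

Calling the helper with frequency=1 and the default time_unit yields a body with the same top-level keys as the curl examples on this page; passing both frequency and cron_expression raises ValueError.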
Request API Syntax
Here is the Request API syntax for scheduling a Jupyter notebook.
curl -i -X POST -H "X-AUTH-TOKEN: <token>" -H "Accept: application/json" -H "Content-type: application/json" -d \
'{"command_type":"JupyterNotebookCommand", "command": {"path":"<Path>/<Name>", "retry": 2, "retry_delay": 4, "arguments": {"key1": "value1", …, "keyN": "valueN"}}, "start_time": "2019-12-26T02:00Z","end_time": "2020-07-01T02:00Z","frequency": 1,"time_unit": "days", "label": "<ClusterLabel>"}' \
"https://gcp.qubole.com/api/v1.2/scheduler"
Sample API Request
curl -i -X POST -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" -H "Content-type: application/json" \
-d '{"command_type":"JupyterNotebookCommand", "command": {"path":"Users/[email protected]/note1.ipynb", "retry": 2, "retry_delay": 4, "arguments": {"name": "abc", "age": "20"}}, "start_time": "2019-12-26T02:00Z","end_time": "2020-07-01T02:00Z","frequency": 1,"time_unit": "days", "label": "spark-cluster-1"}' \
"https://gcp.qubole.com/api/v1.2/scheduler"
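For callers who prefer a script to curl, the same request can be assembled with the Python standard library. This is an illustrative sketch only: build_scheduler_request is a hypothetical helper name, the notebook path and token are left as placeholders, and the request is built but not sent.

```python
import json
import urllib.request

SCHEDULER_URL = "https://gcp.qubole.com/api/v1.2/scheduler"


def build_scheduler_request(auth_token, payload, url=SCHEDULER_URL):
    """Build (but do not send) the POST request for the scheduler API."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-AUTH-TOKEN": auth_token,
            "Accept": "application/json",
            "Content-type": "application/json",
        },
    )


# Same fields as the curl sample; path and label are placeholders.
payload = {
    "command_type": "JupyterNotebookCommand",
    "command": {
        "path": "<Path>/<Name>",
        "retry": 2,
        "retry_delay": 4,
        "arguments": {"name": "abc", "age": "20"},
    },
    "start_time": "2019-12-26T02:00Z",
    "end_time": "2020-07-01T02:00Z",
    "frequency": 1,
    "time_unit": "days",
    "label": "<ClusterLabel>",
}
request = build_scheduler_request("<token>", payload)

# Sending requires a valid token and network access:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Keeping the request construction separate from the send step makes it possible to inspect the URL, headers, and body before anything reaches the service.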
Known Limitation
If there is a warning in one of the cells when a scheduled notebook runs, the notebook stops executing at that cell.
As a workaround, to skip the warning and continue execution, add raises-exception in that cell’s metadata field by performing the following steps:
- Select the cell that shows the warning.
- Click on the Tools icon on the left side bar.
- Click Advanced Tools.
- Add raises-exception in the Cell Metadata tags field.
- Re-run the API.
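The steps above use the notebook UI. If the notebook file is edited directly instead, the tag can be added to the cell's metadata in the .ipynb JSON. The sketch below assumes that the Cell Metadata tags field corresponds to the standard metadata.tags list of the Jupyter notebook format (this page does not confirm that mapping); add_cell_tag is a hypothetical helper, and the notebook shown is a made-up one-cell example.

```python
import json


def add_cell_tag(notebook, cell_index, tag="raises-exception"):
    """Add a tag to one cell's metadata in a notebook dict (nbformat 4 layout)."""
    cell = notebook["cells"][cell_index]
    # Create metadata and tags if they are missing, then add the tag once.
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)
    return notebook


# Made-up one-cell notebook used only to demonstrate the helper.
notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {}, "source": "import warnings"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
add_cell_tag(notebook, 0)
print(json.dumps(notebook["cells"][0]["metadata"]))
# prints {"tags": ["raises-exception"]}
```

For a real notebook, load the file with json.load, apply the helper to the cell that shows the warning, write it back with json.dump, and then re-run the API.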