Notebooks and Dashboards

Jupyter Notebooks (Beta)

New Features

  • JUPY-199: You can now schedule Jupyter notebooks, set custom parameters for the schedule, and view schedules and their execution history from the JupyterLab interface. Via Support.
  • JUPY-197, JUPY-356, and JUPY-289: Jupyter Notebooks are now integrated with GitHub, GitLab, and BitBucket (Cloud only). You can use these version-control systems to manage notebook versions, synchronize notebooks with public and private repositories, view and compare notebook versions, and create pull requests. Via Support.
  • JUPY-195 and JUPY-334: You can now configure access control for Jupyter notebooks at both the account and object levels. Users with the system-admin role, or other roles with appropriate permissions, can configure the Jupyter Notebook resource at the account level. Notebook users can override these permissions for the objects that they own. Via Support.
  • JUPY-251: Users with appropriate permissions can now gain access to Jupyter notebooks via shareable links. Get a shareable link as follows:
    1. In the UI, navigate to the File Browser sidebar panel.
    2. Select the required notebook and right-click.
    3. From the resulting menu, select Copy Shareable Link.
  • JUPY-193: You can now create and manage Jupyter notebooks using REST APIs.
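As an illustration of the REST workflow, the sketch below assembles a notebook-creation request in Python. The endpoint path, payload field names, and X-AUTH-TOKEN header are assumptions modeled on common QDS API conventions, not a verified contract; consult the REST API documentation for the exact parameters.

```python
import json

API_BASE = "https://api.qubole.com/api/v1.2"  # assumed base URL


def build_create_notebook_request(name, location, cluster_id, auth_token):
    """Assemble (url, headers, body) for a notebook-creation call.

    The endpoint path and field names are illustrative assumptions.
    """
    url = f"{API_BASE}/notebooks"
    headers = {
        "X-AUTH-TOKEN": auth_token,       # account-level API token
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = json.dumps({
        "name": name,                     # notebook name
        "location": location,             # folder shown in the File Browser
        "note_type": "jupyter",           # hypothetical type discriminator
        "cluster_id": cluster_id,         # Spark cluster to attach
    })
    return url, headers, body


url, headers, body = build_create_notebook_request(
    "etl-exploration", "Users/alice", 1234, "MY_TOKEN")
# The request itself would then be sent with, e.g.:
#   requests.post(url, headers=headers, data=body)
```

The builder is separated from the network call so the payload can be inspected or logged before anything is sent.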

Enhancements

  • JUPY-319: You must now specify a notebook or folder name when creating a notebook, instead of accepting the default name (Untitled*).
  • JUPY-272: You can now use the QDS Object Storage Explorer on the left side bar of the JupyterLab UI to explore Cloud storage, and to perform actions such as uploading or downloading a file.
  • JUPY-271: You can now use the Table Explorer from the left sidebar to explore the Hive metastore, schema, tables, and columns.
  • JUPY-234: Using its context menu (right-click), you can now copy a sample Jupyter notebook and paste it into the File Browser sidebar panel.
  • JUPY-413: To prevent Livy session timeout for long-running notebooks, you can now configure the kernel and Livy session idle timeout using spark.qubole.idle.timeout. Set this in the Override Spark Configuration field under the Advanced Configuration tab of the Clusters page for the attached Spark cluster. You can set it to an integer value (in minutes), or -1 for no timeout.
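For example, either of the following lines could go in the Override Spark Configuration field; the values shown are illustrative (the value is in minutes):

```
# keep the kernel and Livy session alive through 12 idle hours
spark.qubole.idle.timeout 720

# or disable the idle timeout entirely
spark.qubole.idle.timeout -1
```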

Bug Fixes

  • JUPY-332: "Module not found" errors occurred when importing code from bootstrapped custom zip files in notebooks. This issue is fixed.
  • JUPY-308: Spark application startup used to fail when third-party JARs were added to the Spark configuration. This issue is fixed.
  • JUPY-229: The scope of Jupyter magic commands for the listing and clearing of sessions has been limited to only the sessions of the user executing the command, so that the sessions of other users remain unchanged.
  • JUPY-214: Spark applications stuck in the Accepted state because the session could not be started are now terminated when the timeout is reached.

Zeppelin Notebooks

New Features

  • ZEP-493: Bitbucket is now integrated with Notebooks. You can use Bitbucket to manage versions of your notebooks.
  • ZEP-3792: New package management architecture (Python 2.7 and 3.7 with R 3.5) is now available on GCP.

Enhancements

  • ZEP-3915: Zeppelin 0.8.0 includes the following enhancements:

    • ZEP-2749: PySpark and IPySpark interpreters are now supported, with IPython as the default shell. To make the plain Python shell the default for the PySpark interpreter, set zeppelin.pyspark.useIPython to false in the Interpreter settings. Via Support.
    • ZEP-4077: Notebooks now support z.run(noteId, paragraphId) and z.runNote(noteId) functions to run paragraphs or notebooks from within the notebook.
    • ZEP-3317: You can now run Markdown (%md) paragraphs in edit mode even when the cluster is down.
    • ZEP-1908: The geolocation graph type is now available in the UI by default.
  • ZEP-4129: To optimize memory usage for homogeneous Spark clusters (clusters in which all worker nodes are of the same instance type) running version 2.3.2 and later, Spark driver memory is now allocated based on the instance type of the cluster worker nodes. Because of this, when configuring a cluster to attach to a notebook, do not specify the spark.driver.memory property in the Spark cluster overrides (Override Spark Configuration under the Advanced tab).

  • ZEP-4169 and ZEP-134: The Zeppelin application and all interpreters started by Zeppelin (including Spark and shell interpreters) can now be run under a YARN user. Before this release, Zeppelin applications and interpreters ran as root, a security concern for many enterprises. Via Support.
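The paragraph-execution functions from ZEP-4077 above can be sketched as follows. In a real notebook, Zeppelin injects the `z` context object into each paragraph; the stub below only illustrates the call shapes, and the note and paragraph IDs are made up.

```python
class ZeppelinContextStub:
    """Stand-in for Zeppelin's injected `z` object (illustration only)."""

    def __init__(self):
        self.executed = []  # record of what was asked to run

    def run(self, note_id, paragraph_id):
        # z.run(noteId, paragraphId): run one paragraph of a note.
        self.executed.append((note_id, paragraph_id))

    def runNote(self, note_id):
        # z.runNote(noteId): run every paragraph of a note.
        self.executed.append((note_id, "<all paragraphs>"))


# Inside a notebook paragraph you would call the injected `z` directly:
z = ZeppelinContextStub()
z.run("2EYUV26VR", "20190101-000000_123")  # IDs are illustrative
z.runNote("2EYUV26VR")
```

This lets one "driver" paragraph orchestrate others, e.g. re-running a setup paragraph before a report paragraph.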

Bug Fixes

  • ZEP-3298: Scheduled notebooks with the retry option set in the Scheduler properties were not re-run after a failure. This issue is fixed.
  • ZEP-4193: Autocomplete now works for PySpark notebooks in Zeppelin 0.8.0.
  • ZEP-4194: Notebook results were not displayed when clusters running Zeppelin 0.6.0 were upgraded to Zeppelin 0.8.0 or when clusters running Zeppelin 0.8.0 were downgraded to Zeppelin 0.6.0. This issue is fixed.
  • ZEP-3122: The stacked option for graphs and charts in Zeppelin notebooks did not persist after a refresh. This issue is fixed.
  • ZEP-4198: The Notebooks home page was displayed when the cluster was started. This issue is fixed.
  • ZEP-3129: External web links referenced in a Markdown paragraph now open in a separate tab.
  • ZEP-4195 and ZEP-4199: Notebook content was not rendered correctly when a notebook was switched to a different cluster. This issue is fixed.
  • ZEP-4181: The Published at field in the Dashboard information on the Notebooks page displayed an incorrect timestamp. This issue is fixed.
  • ZEP-4004 and ZEP-3562: With a large cardinality in multibar charts, notebooks became unresponsive. This fix sets a limit of 50 on the cardinality. To increase the limit, contact Qubole Support.