This repository is no longer maintained.

elyra-ai, updated 2022-03-10 22:24:34

March 2022: This repository is no longer maintained

The code for this project was merged into the https://github.com/elyra-ai/elyra repository.


airflow-notebook implements an Apache Airflow operator, NotebookOp, that runs notebooks and Python scripts in DAGs. To use the operator, either configure Airflow to use the Elyra-enabled container image or install this package on every host where the Apache Airflow webserver, scheduler, and workers run.

Using the Elyra-enabled Airflow container image

Follow the instructions in this document.

Installing the airflow-notebook package

You can install the airflow-notebook package from PyPI or source code.

Installing from PyPI

To install airflow-notebook from PyPI:

```bash
pip install airflow-notebook
```
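After installing, it can be useful to confirm that the package is visible to the Python interpreter Airflow runs under. The following is a minimal sketch, assuming the importable module is named `airflow_notebook` (as in the usage example in this README); the helper `is_importable` is a hypothetical name, not part of the package.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the package installs a top-level module called `airflow_notebook`.
print("airflow_notebook importable:", is_importable("airflow_notebook"))
```

Run this with the same interpreter (or inside the same container) as the Airflow webserver, scheduler, and workers, since each of them needs the package.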

Installing from source code

Building airflow-notebook from source requires Python 3.6 or later.

```bash
git clone https://github.com/elyra-ai/airflow-notebook.git
cd airflow-notebook
make clean install
```
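Before building, the interpreter version can be checked against the Python 3.6 minimum stated above. This is a small sketch only; the helper `meets_minimum` is a hypothetical name introduced for illustration.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if not meets_minimum():
    print("Python 3.6 or later is required to build airflow-notebook")
```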

Test coverage

The operator was tested with Apache Airflow v1.10.12.

Usage

The example below shows how to use the operator. This particular DAG was generated from a Jinja template by Elyra's pipeline editor.

```python
from airflow import DAG
from airflow_notebook.pipeline import NotebookOp
from airflow.utils.dates import days_ago

# Setup default args with older date to automatically trigger when uploaded
args = {
    'project_id': 'untitled-0105163134',
}

dag = DAG(
    'untitled-0105163134',
    default_args=args,
    schedule_interval=None,
    start_date=days_ago(1),
    description='Created with Elyra 2.0.0.dev0 pipeline editor using untitled.pipeline.',
    is_paused_upon_creation=False,
)

notebook_op_6055fdfb_908c_43c1_a536_637205009c79 = NotebookOp(
    name='notebookA',
    namespace='default',
    task_id='notebookA',
    notebook='notebookA.ipynb',
    cos_endpoint='http://endpoint.com:31671',
    cos_bucket='test',
    cos_directory='untitled-0105163134',
    cos_dependencies_archive='notebookA-6055fdfb-908c-43c1-a536-637205009c79.tar.gz',
    pipeline_outputs=['subdir/A.txt'],
    pipeline_inputs=[],
    image='tensorflow/tensorflow:2.3.0',
    in_cluster=True,
    env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
              'AWS_SECRET_ACCESS_KEY': 'a_secret_key',
              'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
    config_file="None",
    dag=dag,
)

notebook_op_074355ce_2119_4190_8cde_892a4bc57bab = NotebookOp(
    name='notebookB',
    namespace='default',
    task_id='notebookB',
    notebook='notebookB.ipynb',
    cos_endpoint='http://endpoint.com:31671',
    cos_bucket='test',
    cos_directory='untitled-0105163134',
    cos_dependencies_archive='notebookB-074355ce-2119-4190-8cde-892a4bc57bab.tar.gz',
    pipeline_outputs=['B.txt'],
    pipeline_inputs=['subdir/A.txt'],
    image='elyra/tensorflow:1.15.2-py3',
    in_cluster=True,
    env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
              'AWS_SECRET_ACCESS_KEY': 'a_secret_key',
              'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
    config_file="None",
    dag=dag,
)

# notebookA is upstream of notebookB, so notebookB runs after it
notebook_op_074355ce_2119_4190_8cde_892a4bc57bab << notebook_op_6055fdfb_908c_43c1_a536_637205009c79

notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 = NotebookOp(
    name='notebookC',
    namespace='default',
    task_id='notebookC',
    notebook='notebookC.ipynb',
    cos_endpoint='http://endpoint.com:31671',
    cos_bucket='test',
    cos_directory='untitled-0105163134',
    cos_dependencies_archive='notebookC-68120415-86c9-4dd9-8bd6-b2f33443fcc7.tar.gz',
    pipeline_outputs=['C.txt', 'C2.txt'],
    pipeline_inputs=['subdir/A.txt'],
    image='elyra/tensorflow:1.15.2-py3',
    in_cluster=True,
    env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
              'AWS_SECRET_ACCESS_KEY': 'a_secret_key',
              'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
    config_file="None",
    dag=dag,
)

# notebookA is upstream of notebookC, so notebookC runs after it
notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 << notebook_op_6055fdfb_908c_43c1_a536_637205009c79
```
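In the DAG above, notebookB and notebookC both list `subdir/A.txt` in `pipeline_inputs`, and notebookA lists it in `pipeline_outputs`, which is consistent with the `<<` ordering. The following is a rough sketch of a sanity check for that relationship, written in plain Python so it runs without Airflow. The dictionary layout and the helpers `upstream_of` and `missing_inputs` are hypothetical and are not part of airflow-notebook or Elyra; only the task names, file names, and edges are taken from the example.

```python
# Hypothetical summary of the three tasks in the example DAG.
tasks = {
    "notebookA": {"inputs": [], "outputs": ["subdir/A.txt"]},
    "notebookB": {"inputs": ["subdir/A.txt"], "outputs": ["B.txt"]},
    "notebookC": {"inputs": ["subdir/A.txt"], "outputs": ["C.txt", "C2.txt"]},
}

# (downstream, upstream) pairs, mirroring `downstream << upstream` in the DAG.
edges = [("notebookB", "notebookA"), ("notebookC", "notebookA")]


def upstream_of(task, edges):
    """Return every task that runs before `task`, directly or transitively."""
    direct = {up for down, up in edges if down == task}
    result = set(direct)
    for up in direct:
        result |= upstream_of(up, edges)
    return result


def missing_inputs(tasks, edges):
    """Map each task to the inputs that no upstream task lists as an output."""
    problems = {}
    for name, spec in tasks.items():
        available = set()
        for up in upstream_of(name, edges):
            available.update(tasks[up]["outputs"])
        missing = [f for f in spec["inputs"] if f not in available]
        if missing:
            problems[name] = missing
    return problems


print(missing_inputs(tasks, edges))  # prints {} for the example DAG
```

If the edge from notebookC to notebookA were dropped, the check would report `subdir/A.txt` as a missing input for notebookC.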

Generated Airflow DAG

Airflow DAG Example

Elyra AI Toolkit

Elyra is an open source set of AI-centric extensions to JupyterLab Notebooks. The project is hosted in incubation in the LF AI & Data Foundation.
