The code for this project was merged into the https://github.com/elyra-ai/elyra repository.
airflow-notebook
implements an Apache Airflow operator NotebookOp
that supports running of notebooks and Python scripts in DAGs.
To use the operator, configure Airflow to use the Elyra-enabled container image or install this package on the host(s) where the Apache Airflow webserver, scheduler, and workers are running.
Follow the instructions in this document.
You can install the airflow-notebook
package from PyPI or source code.
To install airflow-notebook
from PyPI:
bash
pip install airflow-notebook
To build airflow-notebook
from source, Python 3.6 (or later) must be installed.
bash
git clone https://github.com/elyra-ai/airflow-notebook.git
cd airflow-notebook
make clean install
The operator was tested with Apache Airflow v1.10.12.
Example below on how to use the airflow operator. This particular DAG was generated with a jinja template in Elyra's pipeline editor.
```python from airflow import DAG from airflow_notebook.pipeline import NotebookOp from airflow.utils.dates import days_ago
args = { 'project_id': 'untitled-0105163134', }
dag = DAG( 'untitled-0105163134', default_args=args, schedule_interval=None, start_date=days_ago(1), description='Created with Elyra 2.0.0.dev0 pipeline editor using untitled.pipeline.', is_paused_upon_creation=False, )
notebook_op_6055fdfb_908c_43c1_a536_637205009c79 = NotebookOp(name='notebookA', namespace='default', task_id='notebookA', notebook='notebookA.ipynb', cos_endpoint='http://endpoint.com:31671', cos_bucket='test', cos_directory='untitled-0105163134', cos_dependencies_archive='notebookA-6055fdfb-908c-43c1-a536-637205009c79.tar.gz', pipeline_outputs=[ 'subdir/A.txt'], pipeline_inputs=[], image='tensorflow/tensorflow:2.3.0', in_cluster=True, env_vars={'AWS_ACCESS_KEY_ID': 'a_key', 'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'}, config_file="None", dag=dag, )
notebook_op_074355ce_2119_4190_8cde_892a4bc57bab = NotebookOp(name='notebookB', namespace='default', task_id='notebookB', notebook='notebookB.ipynb', cos_endpoint='http://endpoint.com:31671', cos_bucket='test', cos_directory='untitled-0105163134', cos_dependencies_archive='notebookB-074355ce-2119-4190-8cde-892a4bc57bab.tar.gz', pipeline_outputs=[ 'B.txt'], pipeline_inputs=[ 'subdir/A.txt'], image='elyra/tensorflow:1.15.2-py3', in_cluster=True, env_vars={'AWS_ACCESS_KEY_ID': 'a_key', 'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'}, config_file="None", dag=dag, )
notebook_op_074355ce_2119_4190_8cde_892a4bc57bab << notebook_op_6055fdfb_908c_43c1_a536_637205009c79
notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 = NotebookOp(name='notebookC', namespace='default', task_id='notebookC', notebook='notebookC.ipynb', cos_endpoint='http://endpoint.com:31671', cos_bucket='test', cos_directory='untitled-0105163134', cos_dependencies_archive='notebookC-68120415-86c9-4dd9-8bd6-b2f33443fcc7.tar.gz', pipeline_outputs=[ 'C.txt', 'C2.txt'], pipeline_inputs=[ 'subdir/A.txt'], image='elyra/tensorflow:1.15.2-py3', in_cluster=True, env_vars={'AWS_ACCESS_KEY_ID': 'a_key', 'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'}, config_file="None", dag=dag, )
notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 << notebook_op_6055fdfb_908c_43c1_a536_637205009c79 ```
Elyra is an open source set of AI-centric extensions to JupyterLab Notebooks. The project is hosted in incubation in the LF AI & Data Foundation.
GitHub Repository Homepageairflow airflow-dag jupyter-notebook