This package provides a JupyterLab extension to manage Dask clusters, as well as embed Dask's dashboard plots directly into JupyterLab panes.
- JupyterLab >= 1.0
- distributed >= 1.24.1
To install the Dask JupyterLab extension you will need to have JupyterLab installed. For JupyterLab < 3.0, you will also need Node.js version >= 12. These are available through a variety of sources. One source common to Python users is the conda package manager.
```bash
conda install jupyterlab
conda install -c conda-forge nodejs
```
You should be able to install this extension with pip or conda and start using it immediately, e.g.:
```bash
pip install dask-labextension
```
This extension includes both client-side and server-side components. Prior to JupyterLab 3.0 these needed to be installed separately, with node available on the machine.
The server-side component can be installed via pip or conda-forge:
```bash
pip install dask_labextension
```
```bash
conda install -c conda-forge dask-labextension
```
You then build the client-side extension into JupyterLab with:
```bash
jupyter labextension install dask-labextension
```
If you are running Notebook 5.2 or earlier, enable the server extension by running
```bash
jupyter serverextension enable --py --sys-prefix dask_labextension
```
This extension can launch and manage several kinds of Dask clusters, including local clusters and Kubernetes clusters. Options for how to launch these clusters are set via the Dask configuration system, typically a `.yml` file on disk. By default the extension launches a `LocalCluster`, for which the configuration is:
```yaml
labextension:
  factory:
    module: 'dask.distributed'
    class: 'LocalCluster'
    args: []
    kwargs: {}
  default:
    workers: null
    adapt:
      null
      # minimum: 0
      # maximum: 10
  initial:
    []
    # - name: "My Big Cluster"
    #   workers: 100
    # - name: "Adaptive Cluster"
    #   adapt:
    #     minimum: 0
    #     maximum: 50
```
In this configuration, `factory` gives the module, class name, and arguments needed to create the cluster. The `default` key describes the initial number of workers for the cluster, as well as whether it is adaptive. The `initial` key gives a list of initial clusters to start upon launch of the notebook server.
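To illustrate how a `factory` entry is consumed, here is a minimal sketch (not the extension's actual code) of resolving a module/class pair into an object. `collections.Counter` stands in for a Dask cluster class such as `dask.distributed.LocalCluster` so the example stays dependency-free:

```python
import importlib

# A hypothetical factory entry; in the real config this would name a
# Dask cluster class such as dask.distributed.LocalCluster.
factory = {
    "module": "collections",
    "class": "Counter",
    "args": [],
    "kwargs": {},
}

# Resolve the module name, look up the class, and instantiate it
# with the configured args and kwargs.
module = importlib.import_module(factory["module"])
cluster_class = getattr(module, factory["class"])
obj = cluster_class(*factory["args"], **factory["kwargs"])
print(type(obj).__name__)  # Counter
```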
In addition to `LocalCluster`, this extension has been used to launch several other Dask cluster objects, a few examples of which are:
```yaml
labextension:
  factory:
    module: 'dask_jobqueue'
    class: 'SLURMCluster'
    args: []
    kwargs: {}
```
```yaml
labextension:
  factory:
    module: 'dask_jobqueue'
    class: 'PBSCluster'
    args: []
    kwargs: {}
```
```yaml
labextension:
  factory:
    module: dask_kubernetes
    class: KubeCluster
    args: []
    kwargs: {}
```
This extension can store a default layout for the Dask dashboard panes, which is useful if you find yourself reaching for the same dashboard charts over and over. You can launch the default layout via the command palette, or by going to the File menu and choosing "Launch Dask Dashboard Layout".
Default layouts can be configured via the JupyterLab config system (either using the JSON editor or the user interface). Specify a layout by writing a JSON object keyed by the individual charts you would like to open. Each chart is opened with a `mode` and a `ref`. `mode` refers to how the chart is to be added to the workspace. For example, if you want to split a panel and add the new one to the right, choose `split-right`. Other options are `split-top`, `split-bottom`, `split-left`, `tab-after`, and `tab-before`. `ref` refers to the panel to which `mode` is applied, and might be the name of another dashboard panel. If `ref` is `null`, the panel in question is added at the top of the layout hierarchy.
A concrete example of a default layout is:

```json
{
  "individual-task-stream": {
    "mode": "split-right",
    "ref": null
  },
  "individual-workers-memory": {
    "mode": "split-bottom",
    "ref": "individual-task-stream"
  },
  "individual-progress": {
    "mode": "split-right",
    "ref": "individual-workers-memory"
  }
}
```
which adds the task stream to the right of the workspace, then adds the worker memory chart below the task stream, then adds the progress chart to the right of the worker memory chart.
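As a sanity check for hand-written layouts, a small hypothetical validator (the mode names come from the documentation above; the function itself is not part of the extension) might look like:

```python
# Valid modes, per the layout documentation above.
VALID_MODES = {"split-right", "split-top", "split-bottom", "split-left",
               "tab-after", "tab-before"}

# The example layout from the documentation above.
layout = {
    "individual-task-stream": {"mode": "split-right", "ref": None},
    "individual-workers-memory": {"mode": "split-bottom",
                                  "ref": "individual-task-stream"},
    "individual-progress": {"mode": "split-right",
                            "ref": "individual-workers-memory"},
}

def validate(layout):
    """Check that every chart uses a known mode and refers to a known panel."""
    for chart, spec in layout.items():
        if spec["mode"] not in VALID_MODES:
            raise ValueError(f"unknown mode for {chart}: {spec['mode']}")
        if spec["ref"] is not None and spec["ref"] not in layout:
            raise ValueError(f"unknown ref for {chart}: {spec['ref']}")
    return True

print(validate(layout))  # True
```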
As described in the JupyterLab documentation, for a development install of the labextension you can run the following in this directory:
```bash
jlpm                                        # Install npm package dependencies
jlpm build                                  # Compile the TypeScript sources to JavaScript
jupyter labextension develop . --overwrite  # Install the current directory as an extension
```
To rebuild the extension:
```bash
jlpm build
```
You should then be able to refresh the JupyterLab page and it will pick up the changes to the extension.
To run an editable install of the server extension, run
```bash
pip install -e .
jupyter serverextension enable --sys-prefix dask_labextension
```
This application is distributed as two subpackages.
The JupyterLab frontend part is published to npm, and the server-side part to PyPI.
Releases for both packages are done with the `jlpm` tool, `git`, and Travis CI.
Note: Package versions are not prefixed with the letter `v`. You will need to disable this:
```console
$ jlpm config set version-tag-prefix ""
```
Making a release
```console
$ jlpm version [--major|--minor|--patch]  # updates package.json and creates git commit and tag
$ git push upstream main && git push upstream main --tags  # pushes tags to GitHub, which triggers Travis CI to build and deploy
```
Bumps webpack from 5.73.0 to 5.76.1.
Sourced from webpack's releases.
v5.76.1

Fixed
- Added `assert/strict` built-in to `NodeTargetPlugin`

Revert
- Improve performance of `hashRegExp` lookup by @ryanwilsonperkin in webpack/webpack#16759

v5.76.0

Bugfixes
- Avoid cross-realm object access by @Jack-Works in webpack/webpack#16500
- Improve hash performance via conditional initialization by @lvivski in webpack/webpack#16491
- Serialize `generatedCode` info to fix bug in asset module cache restoration by @ryanwilsonperkin in webpack/webpack#16703
- Improve performance of `hashRegExp` lookup by @ryanwilsonperkin in webpack/webpack#16759

Features
- add `target` to `LoaderContext` type by @askoufis in webpack/webpack#16781

Security
- CVE-2022-37603 fixed by @akhilgkrishnan in webpack/webpack#16446

Repo Changes
- Fix HTML5 logo in README by @jakebailey in webpack/webpack#16614
- Replace TypeScript logo in README by @jakebailey in webpack/webpack#16613
- Update actions/cache dependencies by @piwysocki in webpack/webpack#16493

New Contributors
- @Jack-Works made their first contribution in webpack/webpack#16500
- @lvivski made their first contribution in webpack/webpack#16491
- @jakebailey made their first contribution in webpack/webpack#16614
- @akhilgkrishnan made their first contribution in webpack/webpack#16446
- @ryanwilsonperkin made their first contribution in webpack/webpack#16703
- @piwysocki made their first contribution in webpack/webpack#16493
- @askoufis made their first contribution in webpack/webpack#16781

Full Changelog: https://github.com/webpack/webpack/compare/v5.75.0...v5.76.0

v5.75.0

Bugfixes
- `experiments.*` normalize to `false` when opt-out
- avoid `NaN%`
- show the correct error when using a conflicting chunk name in code
- HMR code tests existence of `window` before trying to access it
- fix `eval-nosources-*` actually exclude sources
- fix race condition where no module is returned from processing module
- fix position of standalone semicolon in runtime code

Features
- add support for `@import` to external CSS when using experimental CSS in node

... (truncated)

Commits
- 21be52b Merge pull request #16804 from webpack/chore-patch-release
- 1cce945 chore(release): 5.76.1
- e76ad9e Merge pull request #16803 from ryanwilsonperkin/revert-16759-real-content-has...
- 52b1b0e Revert "Improve performance of hashRegExp lookup"
- c989143 Merge pull request #16766 from piranna/patch-1
- 710eaf4 Merge pull request #16789 from dmichon-msft/contenthash-hashsalt
- 5d64468 Merge pull request #16792 from webpack/update-version
- 67af5ec chore(release): 5.76.0
- 97b1718 Merge pull request #16781 from askoufis/loader-context-target-type
- b84efe6 Merge pull request #16759 from ryanwilsonperkin/real-content-hash-regex-perf

This version was pushed to npm by evilebottnawi, a new releaser for webpack since your current version.
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Describe the issue: After pressing the +NEW button to start a cluster:
As usual, this "WORKED BEFORE" (TM) on this machine, but I hadn't used it for several weeks. :(
Minimal Complete Verifiable Example:
`jupyter lab .` in a folder with work projects (subfolders with notebooks in them)

Anything else we need to know?:
I saw that a new py311 env automatically gets bokeh > 3, so I mamba-installed `bokeh<3` to make the dashboards work.
Environment:
Describe the issue:
I am able to create clusters, connect using dask clients and perform Dask operations without issues using KubeCluster Operator on a Notebook. I am also able to connect to the status dashboard using port-forwarding to the scheduler.
However, I am not able to connect to these clusters when using the lab extension. When I move to an active notebook and click search in the Dask lab extension, it does pick up a remote cluster address. The dashboard URLs picked up by the extension code look like:
http://internal-scheduler.namespace:8787/
But, I think the extension is not able to connect to it. I do not see any logs pertaining to this action.
Do these dashboards need to be external (meaning are these connections made from browser or backend service)? Since I was not sure about this, I tried setting up AWS NLB. I tried connecting to the NLB address using the Client as seen in the second snippet below.
Minimal Complete Verifiable Example:
All of the following code snippets work fine from the notebook.
```python
from dask_kubernetes.operator import make_cluster_spec, make_worker_spec
from dask_kubernetes.operator import KubeCluster
from dask.distributed import Client
import dask.dataframe as dd
import os

profile_name = namespace_name

custom_spec = make_cluster_spec(
    name=profile_name,
    image='ghcr.io/dask/dask:latest',
    resources={"requests": {"memory": "512Mi"},
               "limits": {"cpu": "4", "memory": "8Gi"}},
)
custom_spec['spec']['scheduler']['spec']['serviceAccount'] = 'default-editor'
custom_spec['spec']['worker']['spec']['serviceAccount'] = 'default-editor'

custom_worker_spec = make_worker_spec(
    image='ghcr.io/dask/dask:latest',
    n_workers=6,
    resources={"requests": {"memory": "512Mi"},
               "limits": {"memory": "12Gi"}},
)
custom_worker_spec['spec']['serviceAccount'] = 'default-editor'
custom_worker_spec

cluster = KubeCluster(custom_cluster_spec=custom_spec, n_workers=0)
cluster.add_worker_group(name='highmem', custom_spec=custom_worker_spec)
```
As mentioned, let's assume that I have an AWS NLB type LoadBalancer/Ingress Service. Then the Dask Client is able to successfully interact with ports 8787 and 8786 on the scheduler in order to manage the workers and jobs externally.
```python
import dask
from dask.distributed import Client

dask.config.set({'scheduler-address': 'tcp://nlb-address.region.elb.amazonaws.com:8786'})
client = Client()
```
Anything else we need to know?:
Another thing I noticed was that the dask-extension relies on the `testDaskDashboard` function to pick up the URL info (defined in https://github.com/dask/dask-labextension/blob/main/src/dashboard.tsx#L588).
In the console, I can see:

Found the dashboard link at 'http://internal-scheduler.namespace:8787/status'

However, the subsequent dashboard-check request to the backend drops a `/` from the protocol. See the GET request below:

GET https://website/notebook/internal/test-dask-1/dask/dashboard-check/http%3A%2Finternal-scheduler.namespace%3A8787%2F?1673363416491

To be a bit more verbose, `http%3A%2Finternal-scheduler.namespace%3A8787%2F?1673363416491` translates to `http:/internal-scheduler.namespace:8787/?1673363416491`.
I am not sure if this is expected or a bug.
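The encoding mismatch can be reproduced with Python's standard `urllib.parse`; this is just an illustration of the percent-encoding involved, not the extension's code:

```python
from urllib.parse import quote, unquote

# Correct percent-encoding of the dashboard URL keeps both slashes
# of the scheme ("//" -> "%2F%2F").
url = "http://internal-scheduler.namespace:8787/"
encoded = quote(url, safe="")
print(encoded)  # http%3A%2F%2Finternal-scheduler.namespace%3A8787%2F

# The request observed above has only one %2F after the scheme, so it
# decodes to a malformed "http:/..." URL.
observed = "http%3A%2Finternal-scheduler.namespace%3A8787%2F"
print(unquote(observed))  # http:/internal-scheduler.namespace:8787/
```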
Environment:
Recently we added a new button for launching a default set of dashboard plots from the `Client` HTML repr :tada: (xref https://github.com/dask/dask-labextension/pull/248). When using the button today, I noticed I needed to press the button twice to have the dashboard plots launched.
cc @ian-r-rose
I started a new cluster hosted on http://127.0.0.1:33459 using the +NEW button, and it appropriately shows the many options for managing/monitoring the cluster. If I try to search for http://127.0.0.1:33459 using the search bar at the top, nothing comes up. It instead connects to `dask/dashboard/511c94dc-3b49-4f8b-95be-f30f959a41aa`. Is there some network proxy I should be aware of?
Also, as a side-note question, can `dask_cuda.LocalCUDACluster` be used to integrate GPUs in the cluster when the module is added to the `.yml` file for cluster customization?
What happened:
We are running a JupyterHub for HPC users. I noticed when opening JupyterLab that the Dask Dashboard is activated and shows a running cluster on port 8787, even though I didn't start a cluster. After checking who is using that port, it turns out to be a different user.
How can that be possible? That should not be possible, right (from a security perspective)?
What you expected to happen:
Anything else we need to know?:
Environment:
This is a minor release which contains a fix for dashboard URL construction logic in the presence of query parameters. Now query parameters reported by a `Cluster` implementation in the dashboard URL are correctly propagated to individual dashboard panes. This is relevant for cases where things like authentication tokens are included in a URL. Thanks @ntabris for the contribution in #258.
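The idea behind the fix can be sketched with the standard library. This is a hypothetical helper, not the extension's actual implementation; the pane name `individual-task-stream` is borrowed from the layout documentation above:

```python
from urllib.parse import urlsplit, urlunsplit

def pane_url(dashboard_url, pane):
    """Build an individual-pane URL while preserving any query parameters
    (e.g. authentication tokens) from the cluster's dashboard URL."""
    scheme, netloc, path, query, fragment = urlsplit(dashboard_url)
    path = path.rstrip("/") + "/" + pane
    return urlunsplit((scheme, netloc, path, query, fragment))

print(pane_url("http://localhost:8787/?token=abc", "individual-task-stream"))
# http://localhost:8787/individual-task-stream?token=abc
```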
This is a major release that includes:
This minor release for `dask-labextension` includes a few new features and bugfixes, among them a new option, `browserDashboardCheck`, which can force the extension to check for a Dask dashboard using the frontend browser session (rather than a request made on the server side). This can be useful in cases where a browser cookie is needed to authenticate to the dashboard. Thanks to new contributor @viniciusdc for implementing the feature!

This release of `dask-labextension` contains a few bugfixes and minor enhancements:
Contains a fix for an issue in which continued polling of the dask dashboard could result in unbounded memory increases on the scheduler.
This is the first release of dask-labextension that supports JupyterLab 3.0. There are no changes in functionality, but installation should be significantly easier. No more nodejs, no more rebuilding the application. You can install with just pip:
```bash
pip install jupyterlab
pip install dask-labextension
```