Someone once told me that the night is dark and full of terrors. And tonight I am no knight. Tonight I am Davos the smuggler again. Would that you were an onion.
The `davos` library provides Python with an additional keyword: `smuggle`.

The `smuggle` statement works just like the built-in `import` statement, with two major differences:
1. You can `smuggle` a package without installing it first
2. You can `smuggle` a specific version of a package
`import`?

In many cases, `smuggle` and `import` do the same thing, as long as you're running code in the same environment you
developed it in. But what if you want to share a Jupyter notebook containing your code with
someone else? If the user (i.e., the "someone else" in this example) doesn't have all of the packages your notebook
imports, Python will raise an exception and the code won't run. It's not a huge deal, of course, but it's inconvenient
(e.g., the user might need to `pip`-install the missing packages, restart their kernel, re-run the code up to the point
it crashed, etc., possibly going through this cycle multiple times until the thing finally runs).
A second (and more subtle) issue arises when the developer (i.e., the person who wrote the code) used or assumed
different versions of the imported packages than what the user has installed in their environment. So maybe the
original author was developing and testing their code using pandas
1.3.5, but the user hasn't upgraded their pandas
installation since 0.25.0. Python will happily "import pandas
" in both cases, but any changes across those versions
might change what the developer's code actually does in the user's (different) environment—or cause it to fail
altogether.
The problem davos
tries to solve is similar to the idea motivating virtual environments, containers, and virtual
machines: we want a way of replicating the original developer's environment on the user's machine, to a sufficiently
good approximation that we can be "reasonably confident" that the code will continue to behave as expected.
When you smuggle
packages instead of importing them, it guarantees (for whatever environment the code is running in)
that the packages are importable, even if they hadn't been installed previously. Under the hood, davos
figures out
whether the package is available, and if not, it uses pip
to download and install anything that's missing (including
missing dependencies). From that point, after having automatically handled those sorts of dependency issues, smuggle
behaves just like import
.
The second powerful feature of davos
comes from another construct, called "onion comments."
These are like standard Python comments, but they appear on the same line(s) as smuggle
statements, and they are
formatted in a particular way. Onion comments provide a way of precisely controlling how, when, and where packages are
installed, how (or if) the system checks for existing installations, and so on. A key feature is the ability to specify
exactly which version(s) of each package are imported into the current workspace. When used in this way, davos
enables authors to guarantee that the same versions of the packages they developed their code with will also be imported
into the user's workspace at the appropriate times.
You can! In fact, davos
works great when used inside of virtual environments, containers, and virtual machines.
There are a few specific advantages to davos
, however:
- `davos` is very lightweight: importing `davos` into a notebook-based environment unlocks all of its
  functionality without needing to install, set up, and learn how to use additional stuff. There is none of the
  typical overhead of setting up a new virtual environment (or container, virtual machine, etc.), installing
  third-party tools, writing and sharing configuration files, and so on. All of your code and its dependencies may
  be contained in a single notebook file.
- Using onion comments, `davos` can enable multiple versions of the same package to be used or specified in different
  parts of the same notebook. Want to use some deprecated or removed function in `scikit-learn` in one cell, but then
  use one of the latest features in another? You can! Just add onion comments specifying which versions of the
  package you want to `smuggle` in which cells of your notebook.
To turn a standard Jupyter (IPython) notebook, including a Google Colaboratory notebook, into a `davos`-enhanced notebook, just add two lines to the first cell:
```python
%pip install davos
import davos
```
This will enable the `smuggle` keyword in your notebook environment. Then you can do things like:
```python
smuggle numpy as np    # pip: numpy==1.20.2

arr = np.arange(15).reshape(3, 5)
assert np.__version__ == '1.20.2'
```
Interested? Curious? Intrigued? Check out the table of contents for more details! You may also want to check out our paper for more formal descriptions and explanations.
smuggle
Statement
davos
Config
davos
Parser

```sh
pip install git+https://github.com/ContextLab/davos.git
```
To use `davos` in Google Colab, add a cell at the top of your notebook with a
percent sign (`%`) followed by one of the commands above (e.g., `%pip install davos`). Run the cell to install
`davos` on the runtime virtual machine.
Note: restarting the Colab runtime does not affect installed packages. However, if the runtime is "factory reset"
or disconnected due to reaching its idle timeout limit, you'll need to rerun the cell to reinstall davos
on the fresh
VM instance.
The primary way to use `davos` is via the `smuggle` statement, which is made available
simply by running `import davos`. Like the built-in `import` statement, the `smuggle` statement is used to
load packages, modules, and other objects into the current namespace. The main difference between the two is in how
they handle missing packages and specific package versions.

`import` requires that packages be installed before the start of the interpreter session. Trying to `import` a package
that can't be found locally will throw a `ModuleNotFoundError`, and you'll have to
install the package from the command line, restart the Python interpreter to make the new package importable, and rerun
your code in full in order to use it.

The `smuggle` statement, however, can handle missing packages on the fly. If you `smuggle` a package that isn't
installed locally, `davos` will install it for you, make its contents available to Python's
import machinery, and load it into the namespace for immediate use.
You can control how `davos` installs missing packages by adding a special type of inline comment called an
"onion" comment next to a `smuggle` statement.
One simple but powerful use for onion comments is making smuggle
statements version-sensitive.
Python doesn't provide a native, viable way to ensure a third-party package imported at runtime matches a specific
version or satisfies a particular version constraint.
Many packages expose their version info via a top-level __version__
attribute (see
PEP 396), and certain tools (such as the standard library's
importlib.metadata
and
setuptools
's
pkg_resources
) attempt to parse version info from
installed distributions. However, using these to constrain imported packages would require writing extra code to compare
version strings and still manually installing the desired version and restarting the interpreter any time an
invalid version is caught.
Additionally, for packages installed through a version control system (e.g., git), this would be insensitive to differences between revisions (e.g., commits) within the same semantic version.
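To make the manual workflow described above concrete, here is a minimal sketch that uses only the standard library's `importlib.metadata`. The helper names `installed_version` and `check_pinned` are hypothetical and are not part of `davos`. Note that the sketch can only report a mismatch: installing the right version and restarting the interpreter would still be up to the user.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def check_pinned(dist_name, wanted):
    """Describe what a user would have to do by hand for one pinned package."""
    found = installed_version(dist_name)
    if found is None:
        return f"{dist_name} is missing: install {dist_name}=={wanted} and restart"
    if found != wanted:
        return f"{dist_name} {found} found: install {dist_name}=={wanted} and restart"
    return f"{dist_name} {found} is already in place"
```

A plain string comparison like this also cannot tell apart two VCS revisions that report the same version string, which is the limitation noted above.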
`davos` solves these issues by allowing you to specify a specific version or set of acceptable versions for each
smuggled package. To do this, simply provide a version specifier in an onion comment next to the `smuggle` statement:
```python
smuggle numpy as np              # pip: numpy==1.20.2
from pandas smuggle DataFrame    # pip: pandas>=0.23,<1.0
```
In this example, the first line will load `numpy` into the local namespace under the alias "`np`",
just as "`import numpy as np`" would. First, `davos` will check whether `numpy` is installed locally, and if so, whether
the installed version exactly matches `1.20.2`. If `numpy` is not installed, or the installed version is anything
other than `1.20.2`, `davos` will use the specified installer program, `pip`, to
install `numpy==1.20.2` before loading the package.

Similarly, the second line will load the "`DataFrame`" object from the `pandas` library,
analogously to "`from pandas import DataFrame`". A local `pandas` version of `0.24.1` would be used, but a local version
of `1.0.2` would cause `davos` to replace it with a valid `pandas` version, as if you had manually run
`pip install pandas>=0.23,<1.0`.

In both cases, the imported versions will fit the constraints specified in their onion comments,
and the next time `numpy` or `pandas` is smuggled with the same constraints, valid local installations will be found.
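The decision described above (keep `pandas` 0.24.1, replace 1.0.2) comes down to testing an installed version against each clause of the specifier. The following is a simplified sketch of that check, not the logic `davos` or `pip` actually use: it assumes purely numeric versions such as `0.24.1` and supports only the `==`, `!=`, `>=`, `<=`, `>` and `<` operators. The names `satisfies` and `_release` are made up for illustration.

```python
import re

_OPS = {
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}


def _release(version_str, width=4):
    """Turn '0.24.1' into a zero-padded tuple such as (0, 24, 1, 0)."""
    parts = [int(p) for p in version_str.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(installed, spec):
    """Check a plain numeric version against a comma-separated specifier."""
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(==|!=|>=|<=|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, wanted = match.groups()
        if not _OPS[op](_release(installed), _release(wanted)):
            return False
    return True
```

With this sketch, `satisfies("0.24.1", ">=0.23,<1.0")` is true and `satisfies("1.0.2", ">=0.23,<1.0")` is false, matching the two cases in the text. Pre-releases, wildcards and the `~=` operator are deliberately out of scope.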
You can also force the state of a smuggled package to match a specific VCS ref (branch, revision, tag, release, etc.).
For example:
```python
smuggle hypertools as hyp    # pip: git+https://github.com/ContextLab/hypertools.git@98a3d80
```
will load `hypertools` (aliased as "`hyp`"), as the package existed on GitHub, at commit 98a3d80. The general format for VCS references in
onion comments follows that of the `pip-install` command. See the
notes on smuggling from VCS below for additional info.
And with a few exceptions, smuggling a specific package version will work even if the package has already been imported!
Note: davos
v0.1 supports IPython environments (e.g.,
Jupyter and Colaboratory notebooks) only. v0.2 will add
support for "regular" (i.e., non-interactive) Python scripts.
Different versions of the same package can often behave quite differently—bugs are introduced and fixed, features are implemented and removed, support for Python versions is added and dropped, etc. Because of this, Python code that is meant to be reproducible (e.g., tutorials, demos, data analyses) is commonly shared alongside a set of fixed versions for each package used. And since there is no Python-native way to specify package versions at runtime (see above), this typically takes the form of a pre-configured development environment the end user must build themselves (e.g., a Docker container or conda environment), which can be cumbersome, slow to set up, resource-intensive, and confusing for newer users, as well as require shipping both additional specification files and setup instructions along with your code. And even then, a well-intentioned user may alter the environment in a way that affects your carefully curated set of pinned packages (such as installing additional packages that trigger dependency updates).
Instead, davos
allows you to share code with one simple instruction: just pip install davos
! Replace your import
statements with smuggle
statements, pin package versions in onion comments, and let davos
take care of the rest.
Beyond its simplicity, this approach ensures your predetermined package versions are in place every time your code is
run.
If you want to make sure you're always using the most recent release of a certain package, `davos` makes doing so easy:
```python
smuggle mypkg    # pip: mypkg --upgrade
```
Or if you have an automation designed to test your most recent commit on GitHub:
```python
smuggle mypkg    # pip: git+https://username/reponame.git
```
The ability to smuggle
a specific package version even after a different version has been imported makes davos
a
useful tool for comparing behavior across multiple versions of the same package, within the same interpreter session:
```python
def test_my_func_unchanged():
    """Regression test for `mypkg.my_func()`"""
    data = list(range(10))

    smuggle mypkg                    # pip: mypkg==0.1
    result1 = mypkg.my_func(data)

    smuggle mypkg                    # pip: mypkg==0.2
    result2 = mypkg.my_func(data)

    smuggle mypkg                    # pip: git+https://github.com/MyOrg/mypkg.git
    result3 = mypkg.my_func(data)

    assert result1 == result2 == result3
```
`smuggle` Statement

The `smuggle` statement is meant to be used in place of the built-in `import` statement and shares
its full syntactic definition:
```ebnf
smuggle_stmt    ::=  "smuggle" module ["as" identifier] ("," module ["as" identifier])*
                     | "from" relative_module "smuggle" identifier ["as" identifier]
                       ("," identifier ["as" identifier])*
                     | "from" relative_module "smuggle" "(" identifier ["as" identifier]
                       ("," identifier ["as" identifier])* [","] ")"
                     | "from" module "smuggle" "*"
module          ::=  (identifier ".")* identifier
relative_module ::=  "."* module | "."+
```
NB: uses the modified BNF grammar notation described in
The Python Language Reference,
here; see
here for the lexical definition
of identifier
In simpler terms, any valid syntax for import
is also valid for smuggle
.
- Like `import` statements, `smuggle` statements are whitespace-insensitive, unless a lack of whitespace between two
  tokens would cause them to be interpreted as a different token:
```python
from os.path smuggle dirname, join as opj                # valid
from os . path smuggle dirname ,join as opj              # also valid
from os.path smuggle dirname, join asopj                 # invalid ("asopj" != "as opj")
```
- Any context that would cause an `import` statement not to be executed will have the same effect on a `smuggle`
  statement:
```python
# smuggle matplotlib.pyplot as plt           # not executed
print('smuggle matplotlib.pyplot as plt')    # not executed
foo = """
smuggle matplotlib.pyplot as plt"""          # not executed
```
- Because the `davos` parser is less complex than the full Python parser, there are two fairly non-disruptive edge
  cases where an `import` statement would be syntactically valid but a `smuggle` statement would not:
  1. Code passed to `exec()`:
```python
exec('from pathlib import Path')         # executed
exec('from pathlib smuggle Path')        # raises SyntaxError
```
  2. A one-line compound statement clause:
```python
if True: import random                   # executed
if True: smuggle random                  # raises SyntaxError

while True: import math; break           # executed
while True: smuggle math; break          # raises SyntaxError

for _ in range(1): import json           # executed
for _ in range(1): smuggle json          # raises SyntaxError

# etc...
```
- In [IPython](https://ipython.readthedocs.io/en/stable/) environments (e.g., [Jupyter](https://jupyter.org/) &
  [Colaboratory](https://colab.research.google.com/) notebooks) `smuggle` statements always load names into the global
  namespace:
```python
# example.ipynb
import davos

def import_example():
    import datetime

def smuggle_example():
    smuggle datetime

import_example()
type(datetime)       # raises NameError

smuggle_example()
type(datetime)       # returns <class 'module'>
```
An onion comment is a special type of inline comment placed on a line containing a smuggle
statement. Onion comments
can be used to control how davos
:
1. determines whether the smuggled package should be installed
2. installs the smuggled package, if necessary
Onion comments are also useful when smuggling a package whose distribution name (i.e., the name
used when installing it) is different from its top-level module name (i.e., the name used when importing it). Take for
example:
```python
from sklearn.decomposition smuggle PCA    # pip: scikit-learn
```
The onion comment here (`# pip: scikit-learn`) tells `davos` that if "`sklearn`" does not exist
locally, the "`scikit-learn`" package should be installed.
Onion comments follow a simple but specific syntax, inspired in part by the
type comment syntax introduced in
PEP 484. The following is a loose (pseudo-)syntactic definition for an onion
comment:
```ebnf
onion_comment ::=  "#" installer ":" install_opt* pkg_spec install_opt*
installer     ::=  ("pip" | "conda")
pkg_spec      ::=  identifier [version_spec]
```
NB: uses the modified BNF grammar notation described in
The Python Language Reference,
here; see
here for the lexical definition
of identifier
where installer
is the program used to install the package; install_opt
is any option accepted by the installer's
"install
" command; and version_spec
may be a
version specifier defined by
PEP 440 followed by a
version string, or an alternative syntax valid
for the given installer
program. For example, pip
uses specific syntaxes for
local,
editable, and
VCS-based installation.
Less formally, an onion comment simply consists of two parts, separated by a colon:
1. the name of the installer program (e.g., pip
)
2. arguments passed to the program's "install" command
Thus, you can essentially think of writing an onion comment as taking the full shell command you would run to install
the package, and replacing "install" with ":". For instance, the command:
```sh
pip install -I --no-cache-dir numpy==1.20.2 -vvv --timeout 30
```
is easily translated into an onion comment as:
```python
smuggle numpy    # pip: -I --no-cache-dir numpy==1.20.2 -vvv --timeout 30
```
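The "replace `install` with `:`" rule can also be read in reverse. The sketch below turns an onion comment back into the argument list of the equivalent shell command. It is an illustration only: `onion_to_command` is a hypothetical helper, not a `davos` function, and it ignores edge cases such as trailing unrelated comments.

```python
import shlex


def onion_to_command(comment):
    """Translate an onion comment into the equivalent installer argv list."""
    text = comment.strip()
    if not text.startswith("#"):
        raise ValueError("an onion comment must start with '#'")
    # Split on the first colon only, so colons inside URLs are left alone.
    installer, sep, args = text[1:].partition(":")
    if not sep:
        raise ValueError("expected '<installer>: <arguments>'")
    return [installer.strip(), "install", *shlex.split(args)]
```

For the example above, `onion_to_command("# pip: -I --no-cache-dir numpy==1.20.2 -vvv --timeout 30")` gives back the tokens of the original `pip install ...` command.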
In practice, onion comments are identified as matches for the regular expression: ```regex
```
Note: support for installing smuggled packages via the conda
package manager
will be added in v0.2. For v0.1, onion comments should always specify "pip
" as the installer
program.
- An onion comment must be placed on the same line as a `smuggle` statement; otherwise, it is not parsed:
```python
# assuming the dateutil package is not installed...

# pip: python-dateutil                        # <-- has no effect
smuggle dateutil                              # raises InstallerError (no "dateutil" package exists)

smuggle dateutil                              # raises InstallerError (no "dateutil" package exists)
# pip: python-dateutil                        # <-- has no effect

smuggle dateutil    # pip: python-dateutil    # installs "python-dateutil" package, if necessary
```
- An onion comment may be followed by unrelated inline comments as long as they are separated by at least one space:
```python
smuggle tqdm    # pip: tqdm>=4.46,<4.60 # this comment is ignored
smuggle tqdm    # pip: tqdm>=4.46,<4.60            # so is this one
smuggle tqdm    # pip: tqdm>=4.46,<4.60# but this comment raises OnionArgumentError
```
- An onion comment must be the first inline comment immediately following a `smuggle` statement; otherwise, it is not
  parsed:
```python
smuggle numpy    # pip: numpy!=1.19.1          # <-- guarantees smuggled version is not v1.19.1
smuggle numpy    # has no effect -->           # pip: numpy==1.19.1
```
  This also allows you to easily "comment out" onion comments:
```python
smuggle numpy    ## pip: numpy!=1.19.1         # <-- has no effect
```
- Onion comments are generally whitespace-insensitive, but installer arguments must be separated by at least one space:
```python
from umap smuggle UMAP    # pip: umap-learn --user -v --no-clean    # valid
from umap smuggle UMAP#pip:umap-learn --user -v --no-clean          # also valid
from umap smuggle UMAP    # pip: umap-learn --user-v--no-clean      # raises OnionArgumentError
```
- Onion comments have no effect on standard library modules:
```python
smuggle threading    # pip: threading==9999    # <-- has no effect
```
- When smuggling multiple packages with a _single_ `smuggle` statement, an onion comment may be used to refer to the
  **first** package listed:
```python
smuggle nilearn, nibabel, nltools    # pip: nilearn==0.7.1
```
- If multiple _separate_ `smuggle` statements are placed on a single line, an onion comment may be used to refer to the
  **last** statement:
```python
smuggle gensim; smuggle spacy; smuggle nltk    # pip: nltk~=3.5 --pre
```
- For multiline `smuggle` statements, an onion comment may be placed on the first line:
```python
from scipy.interpolate smuggle (    # pip: scipy==1.6.3
    interp1d,
    interpn as interp_ndgrid,
    LinearNDInterpolator,
    NearestNDInterpolator,
)
```
  ... or on the last line:
```python
from scipy.interpolate smuggle (interp1d,                  # this comment has no effect
                                interpn as interp_ndgrid,
                                LinearNDInterpolator,
                                NearestNDInterpolator)     # pip: scipy==1.6.3
```
  ... though the first line takes priority:
```python
from scipy.interpolate smuggle (    # pip: scipy==1.6.3    # <-- this version is installed
    interp1d,
    interpn as interp_ndgrid,
    LinearNDInterpolator,
    NearestNDInterpolator,
)    # pip: scipy==1.6.2                                   # <-- this comment is ignored
```
  ... and all comments _not_ on the first or last line are ignored:
```python
from scipy.interpolate smuggle (
    interp1d,                # pip: scipy==1.6.3           # <-- ignored
    interpn as interp_ndgrid,
    LinearNDInterpolator,    # unrelated comment           # <-- ignored
    NearestNDInterpolator
)    # pip: scipy==1.6.2                                   # <-- parsed
```
- The onion comment is intended to describe how a specific smuggled package should be installed if it is not found
  locally, in order to make it available for immediate use. Therefore, installer options that either (A) install
  packages other than the smuggled package and its dependencies (e.g., from a specification file), or (B) cause the
  smuggled package not to be installed, are disallowed. The options listed below will raise an
  `OnionArgumentError`:
  - `-h`, `--help`
  - `-r`, `--requirement`
  - `-V`, `--version`
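As a rough illustration of the last rule, the sketch below rejects the disallowed options before anything is installed. It is a simplification under stated assumptions: the `OnionArgumentError` class here is a local stand-in defined for the example (it is not imported from `davos`), `check_installer_args` is a hypothetical name, and combined short flags such as `-rfile.txt` are not handled.

```python
import shlex

DISALLOWED = {"-h", "--help", "-r", "--requirement", "-V", "--version"}


class OnionArgumentError(ValueError):
    """Local stand-in for the error named above; not imported from davos."""


def check_installer_args(args_str):
    """Raise if any disallowed option appears among the installer arguments."""
    tokens = shlex.split(args_str)
    for token in tokens:
        option = token.split("=", 1)[0]  # also catches --requirement=file.txt
        if option in DISALLOWED:
            raise OnionArgumentError(f"option not allowed in onion comment: {option}")
    return tokens
```

Checking at parse time means a bad onion comment fails before any of the surrounding code has run.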
`davos` Config

The `davos` config object stores options and data that affect how `davos` behaves. After importing `davos`, the config
instance (a singleton) for the current session is available as `davos.config`, and its various fields are accessible as
attributes. The config object exposes a mixture of writable and read-only fields. Most `davos.config` attributes can be
assigned values to control aspects of `davos` behavior, while others are available for inspection but are set and used
internally. Additionally, certain config fields may be writable in some situations but not others (e.g. only if the
importing environment supports a particular feature). Once set, `davos` config options last for the lifetime of the
interpreter (unless updated); however, they do not persist across interpreter sessions. A full list of `davos` config
fields is available below:
| Field | Description | Type | Default | Writable? |
| :---: | --- | :---: | :---: | :---: |
| `active` | Whether or not the `davos` parser should be run on subsequent input (cells, in Jupyter/Colab notebooks). Setting to `True` activates the `davos` parser, enables the `smuggle` keyword, and injects the `smuggle()` function into the user namespace. Setting to `False` deactivates the `davos` parser, disables the `smuggle` keyword, and removes "`smuggle`" from the user namespace (if it holds a reference to the `smuggle()` function). See How it Works for more info. | `bool` | `True` | ✅ |
| `auto_rerun` | If `True`, when smuggling a previously-imported package that cannot be reloaded (see Smuggling packages with C-extensions), `davos` will automatically restart the interpreter and rerun all code up to (and including) the current `smuggle` statement. Otherwise, issues a warning and prompts the user with buttons to either restart/rerun or continue running. | `bool` | `False` | ✅ (Jupyter notebooks only) |
| `confirm_install` | Whether or not `davos` should require user confirmation (`[y/n]` input) before installing a smuggled package | `bool` | `False` | ✅ |
| `environment` | A label describing the environment into which `davos` was imported. Checked internally to determine which interchangeable implementation functions are used, whether certain config fields are writable, and various other behaviors | `Literal['Python', 'IPython<7.0', 'IPython>=7.0', 'Colaboratory']` | N/A | ❌ |
| `ipython_shell` | The global IPython interactive shell instance | `IPython.core.interactiveshell.InteractiveShell` | N/A | ❌ |
| `noninteractive` | Set to `True` to run `davos` in non-interactive mode (all user input and confirmation will be disabled). NB: (1) setting to `True` disables `confirm_install` if previously enabled; (2) if `auto_rerun` is `False` in non-interactive mode, `davos` will throw an error if a smuggled package cannot be reloaded | `bool` | `False` | ✅ (Jupyter notebooks only) |
| `pip_executable` | The path to the `pip` executable used to install smuggled packages. Must be a path (`str` or `pathlib.Path`) to a real file. Default is programmatically determined from Python environment; falls back to `sys.executable -m pip` if executable can't be found | `str` | `pip` exe path or `sys.executable -m pip` | ✅ |
| `smuggled` | A cache of packages smuggled during the current interpreter session. Formatted as a `dict` whose keys are package names and values are the (`.split()` and `';'.join()`ed) onion comments. Implemented this way so that any non-whitespace change to installer arguments triggers re-installation | `dict[str, str]` | `{}` | ❌ |
| `suppress_stdout` | If `True`, suppress all unnecessary output issued by both `davos` and the installer program. Useful when smuggling packages that need to install many dependencies and therefore generate extensive output. If the installer program throws an error while output is suppressed, both stdout & stderr will be shown with the traceback | `bool` | `False` | ✅ |
`davos` also provides a few convenience functions for reading/setting config values:
- `davos.activate()`
  Activate the `davos` parser, enable the `smuggle` keyword, and inject the `smuggle()` function into the namespace.
  Equivalent to setting `davos.config.active = True`. See How it Works for more info.
- `davos.deactivate()`
  Deactivate the `davos` parser, disable the `smuggle` keyword, and remove the name `smuggle` from the namespace if (and
  only if) it refers to the `smuggle()` function. If `smuggle` has been overwritten with a different value, the variable
  will not be deleted. Equivalent to setting `davos.config.active = False`. See How it Works for more info.
- `davos.is_active()`
  Return the current value of `davos.config.active`.
- `davos.configure(**kwargs)`
  Set multiple `davos.config` fields at once by passing values as keyword arguments, e.g.:
```python
import davos
davos.configure(active=False, noninteractive=True, pip_executable='/usr/bin/pip3')
```
is equivalent to:
```python
import davos
davos.config.active = False
davos.config.noninteractive = True
davos.config.pip_executable = '/usr/bin/pip3'
```
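The mix of writable and read-only fields, together with a `configure()`-style helper, can be pictured with a toy object like the one below. This is only a sketch of the general pattern and makes no claim about how `davos.config` is implemented. The class `Config`, the reduced field set, and the choice of `AttributeError` are all assumptions made for the example.

```python
class Config:
    """Toy config object with a few writable fields and one read-only field."""

    _writable = {"active", "confirm_install", "suppress_stdout"}
    _read_only = {"environment"}

    def __init__(self):
        # Bypass __setattr__ so the read-only field can receive its initial value.
        object.__setattr__(self, "active", True)
        object.__setattr__(self, "confirm_install", False)
        object.__setattr__(self, "suppress_stdout", False)
        object.__setattr__(self, "environment", "Python")

    def __setattr__(self, name, value):
        if name in self._read_only:
            raise AttributeError(f"'{name}' is read-only")
        if name not in self._writable:
            raise AttributeError(f"unknown config field: '{name}'")
        object.__setattr__(self, name, value)


def configure(config, **kwargs):
    """Set several fields at once; equivalent to assigning them one by one."""
    for name, value in kwargs.items():
        setattr(config, name, value)
```

Routing every assignment through `__setattr__` means the keyword form and the attribute form cannot drift apart, since both go through the same validation.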
`davos` Parser

Functionally, importing `davos` appears to enable a new Python keyword, "`smuggle`". However, `davos` doesn't actually
modify the rules or reserved keywords used by
Python's parser and lexical analyzer in order to do so. In fact, modifying the Python grammar is not possible at
runtime and would require rebuilding the interpreter. Instead, in IPython environments like Jupyter and
Colaboratory notebooks, `davos` implements the `smuggle` keyword via a combination of namespace injections and its own
(far simpler) custom parser.
The smuggle
keyword can be enabled and disabled at will by "activating" and "deactivating" davos
(see the
davos
Config Reference and Top-level Functions, above). When davos
is
imported, it is automatically activated by default. Activating davos
triggers two things:
1. The smuggle()
function is injected into the IPython
user namespace
2. The davos
parser is registered as a
custom input transformer
IPython preprocesses all executed code as plain text before it is sent to the Python parser in order to handle
special constructs like %magic
and
!shell
commands. davos
hooks into this process to transform smuggle
statements into syntactically valid Python code. The davos
parser uses this regular expression to match each
line of code containing a smuggle
statement (and, optionally, an onion comment), extracts information from its text,
and replaces it with an analogous call to the smuggle()
function. Thus, even though the code visible to the user may
contain smuggle
statements, e.g.:
```python
smuggle numpy as np    # pip: numpy>1.16,<=1.20 -vv
```
the code that is actually executed by the Python interpreter will not:
```python
smuggle(name="numpy", as_="np", installer="pip", args_str="""numpy>1.16,<=1.20 -vv""", installer_kwargs={'editable': False, 'spec': 'numpy>1.16,<=1.20', 'verbosity': 2})
```
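The text-to-text rewrite described above can be approximated with a much smaller pattern. The sketch below is a deliberately simplified stand-in, not the regular expression or transformer that `davos` ships: it only understands `smuggle <module> [as <alias>]` with an optional onion comment, and it omits the parsed `installer_kwargs`. The names `SMUGGLE_LINE` and `transform_line` are hypothetical.

```python
import re

# Simplified pattern: one module, an optional alias, an optional onion comment.
SMUGGLE_LINE = re.compile(
    r"^(?P<indent>\s*)smuggle\s+(?P<name>[\w.]+)"
    r"(?:\s+as\s+(?P<alias>\w+))?"
    r"\s*(?:#\s*(?P<installer>pip|conda)\s*:\s*(?P<args>.*?))?\s*$"
)


def transform_line(line):
    """Rewrite a matching smuggle statement as a smuggle() call; else pass through."""
    match = SMUGGLE_LINE.match(line)
    if match is None:
        return line
    parts = [f"name={match['name']!r}"]
    if match["alias"]:
        parts.append(f"as_={match['alias']!r}")
    if match["installer"]:
        parts.append(f"installer={match['installer']!r}")
        parts.append(f"args_str={match['args']!r}")
    return f"{match['indent']}smuggle({', '.join(parts)})"
```

Lines that do not look like a `smuggle` statement are returned unchanged, which is the behavior an input transformer needs so that ordinary code passes through untouched.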
The davos
parser can be deactivated at any time, and doing so triggers the opposite actions of activating it:
1. The name "smuggle
" is deleted from the IPython
user namespace, unless it has been overwritten and no longer
refers to the smuggle()
function
2. The davos
parser input transformer is deregistered.
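The namespace half of activation and deactivation can be sketched with a plain dictionary standing in for the IPython user namespace. This is an illustration under that assumption: the `smuggle` function below is a placeholder, and `activate` and `deactivate` here are toy versions that take the namespace as an argument, unlike the real top-level functions.

```python
def smuggle(name, **kwargs):
    """Placeholder standing in for the real smuggle() function."""
    raise NotImplementedError


def activate(namespace):
    """Inject the function under the name 'smuggle'."""
    namespace["smuggle"] = smuggle


def deactivate(namespace):
    """Delete the name only if it still refers to our function."""
    if namespace.get("smuggle") is smuggle:
        del namespace["smuggle"]
```

The identity check (`is`) is what protects a user variable that happens to be called `smuggle` from being deleted.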
Note: in Jupyter and Colaboratory notebooks, IPython parses and transforms all text in a cell before sending it
to the kernel for execution. This means that importing or activating davos
will not make the smuggle
statement
available until the next cell, because all lines in the current cell were transformed before the davos
parser was
registered. However, deactivating davos
disables the smuggle
statement immediately—although the davos
parser will have already replaced all smuggle
statements with smuggle()
function calls, removing the function from
the namespace causes them to throw NameError
.
The davos
parser extracts info from onion comments by passing them to a (slightly modified) reimplementation of
their specified installer program's CLI parser. This is somewhat redundant, since the arguments will eventually be
re-parsed by the actual installer program if the package needs to be installed. However, it affords a number of
advantages, such as:
- detecting errors early during the parser phase, before spending any time running code above the line containing the
smuggle
statement
- preventing shell injections in onion comments—e.g., #pip: --upgrade numpy && rm -rf /
fails due to the
OnionParser
, but would otherwise execute successfully.
- allowing certain installer arguments to temporarily influence davos
behavior while smuggling the current package
(see Installer options that affect davos
behavior below for specific info)
Passing certain options to the installer program via an onion comment will also affect the
corresponding smuggle
statement in a predictable way:
`--force-reinstall` | `-I`, `--ignore-installed` | `-U`, `--upgrade`

The package will be installed, even if it exists locally.

Disables input prompts, analogous to temporarily setting `davos.config.noninteractive` to `True`. Overrides value
of `davos.config.confirm_install`.

`--src <dir>` | `-t`, `--target <dir>`

Prepends `<dir>` to `sys.path` if not already present so the package can be imported.
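The `sys.path` adjustment for `--src` and `--target` amounts to a guarded prepend, which might look like the sketch below. The helper name `prepend_to_sys_path` and its optional `path` parameter (added so the behavior can be tried on an ordinary list) are assumptions for illustration.

```python
import sys


def prepend_to_sys_path(directory, path=None):
    """Put directory at the front of the search path unless it is already listed."""
    path = sys.path if path is None else path
    if directory not in path:
        path.insert(0, directory)
    return path
```

Prepending rather than appending gives the freshly installed copy priority over any other copy of the package found later on the path.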
Some Python packages that rely heavily on custom data types implemented via
C-extensions (e.g., numpy
, pandas
) dynamically generate
modules defining various C functions and data structures, and link them to the Python interpreter when they are first
imported. Depending on how these objects are initialized, they may not be subject to normal garbage collection, and
persist despite their reference count dropping to zero. This can lead to unexpected errors when reloading the Python
module that creates them, particularly if their dynamically generated source code has been changed (e.g., because the
reloaded package is a newer version).
This can occasionally affect davos
's ability to smuggle
a new version of a package (or dependency) that was
previously imported. To handle this, davos
first checks each package it installs against
sys.modules
. If a different version has already been
loaded by the interpreter, davos
will attempt to replace it with the requested version. If this fails, davos
will
restore the old package version in memory, while replacing it with the new package version on disk. This allows
subsequent code that uses the non-reloadable module to still execute in most cases, while dependency checks for other
packages run against the updated version. Then, depending on the value of `davos.config.auto_rerun`, `davos` will
either automatically restart the interpreter to load the updated package, prompt you to do so, or raise an
exception.
The Python docs for importlib.reload()
include
the following caveat:
If a module imports objects from another module using `from` … `import` …, calling `reload()` for the other module does not redefine the objects imported from it — one way around this is to re-execute the `from` statement, another is to use `import` and qualified names (module.name) instead.
The same applies to smuggling packages or modules from which objects have already been loaded. If object `name` from
module `module` was loaded using either `from module import name` or `from module smuggle name`, subsequently
running `smuggle module    # pip: module --upgrade` will in fact install and load an upgraded version of `module`, but
the `name` object will still be that of the old version! To fix this, you can simply run `from module smuggle name`
either in lieu of or after `smuggle module`.
The first time during an interpreter session that a given package is installed from a VCS URL, it is assumed not to be
present locally, and is therefore freshly installed. pip
clones non-editable VCS repositories into a temporary
directory, runs setup.py install
, and then immediately deletes them. Since no information is retained about the
state of the repository at installation, it is impossible to determine whether an existing package satisfies the state
(i.e., branch, tag, commit hash, etc.) requested for the smuggled package.
I was originally thinking that implementing smuggling via conda would work similarly to how it does for pip
, where we could simply replace the conda
executable that'd normally be looked up in $PATH
with the full path to the given environment's exe. But looking at different conda versions and distributions, there usually isn't a separate conda
executable, which means we'll need to implement this more like IPython does, by calling the base environment's conda
executable and passing the particular environment's --prefix
when appropriate.
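As a rough sketch of that approach (not existing davos code), the command could be assembled like this. The function name `conda_install_command` is hypothetical, and using sys.prefix as the target environment assumes the running interpreter lives inside the conda environment we want to install into.

```python
import sys


def conda_install_command(conda_exe, package_spec, env_prefix=sys.prefix):
    """Build (but don't run) a conda command that targets one environment.

    `conda_exe` is the path to the base environment's conda executable;
    `env_prefix` is the directory of the environment to install into.
    """
    return [conda_exe, 'install', '--yes', '--prefix', env_prefix, package_spec]
```

The resulting list could then be handed to something like subprocess.run(), with the base environment's executable doing the work on behalf of the environment given by `--prefix`.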
Looking at what we'll want to lazily evaluate, store, make editable vs. read-only, etc., it quickly became clear that the most sensible implementation of this would require a good bit of refactoring across a number of modules. So for the initial release I'm leaving this as-is and just patching davos.implementations.ipython_common
to fix #53, since conda isn't supported by v0.1.0 anyway. When we eventually want to support conda, I'll circle back to this more serious refactor.
When the same smuggle
command is run multiple times in the same interpreter session, davos
avoids re-checking the file system for a local, satisfactory package version, unless something about the command has been changed since it was last run (e.g., the requested version, installer program, VCS URL, etc.). This is done by storing a dict item for each smuggled package in davos.config.smuggled
, where the key is the package name and the value is a string made by ';'.join()
ing the name of the installer program and all arguments supplied in the onion comment (no onion comment results in just <installer_name>;
).
This is quick and simple, but ideally it would treat variations in argument order & short/long form as matches.
i.e., running:
```python
smuggle foo # pip: -v foo==0.0.1
```
stores:
```python
davos.config.smuggled['foo'] = 'pip;-v;foo==0.0.1'
```
However, neither of the following would be recognized as previously run, despite being equivalent to the existing cached value:
```python
smuggle foo # pip: foo==0.0.1 -v # new value would be 'pip;foo==0.0.1;-v'
smuggle foo # pip: --verbose foo==0.0.1 # new value would be 'pip;--verbose;foo==0.0.1'
```
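One possible way to get there is to normalize the arguments before joining them. The sketch below is only an assumption about how that might look: `cache_value` mirrors the current behavior as described above, while `normalized_cache_value` and the `ALIASES` table are hypothetical and cover just the `-v`/`--verbose` pair from the example.

```python
# short-to-long option aliases; only the pair from the example above is
# listed, and a real table would need to cover every supported option
ALIASES = {'-v': '--verbose'}


def cache_value(installer, onion_args):
    """Join the installer name and the raw arguments with ';'."""
    return installer + ';' + ';'.join(onion_args.split())


def normalized_cache_value(installer, onion_args):
    """Hypothetical variant that ignores argument order and short/long form.

    Assumes every argument is a standalone token; options that take a
    separate value (e.g. `--index-url URL`) would need to stay paired.
    """
    args = sorted(ALIASES.get(arg, arg) for arg in onion_args.split())
    return installer + ';' + ';'.join(args)
```

With this, all three commands above would map to the same value, `'pip;--verbose;foo==0.0.1'`, so the second and third would be recognized as already run.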
It would be nice to also make davos
conda install
-able via the conda-forge
channel. I've never done it before, but it doesn't seem too difficult to do.
requires implementing:
- [ ] parser object analogous to the pip parser in davos/parsers.py
- [ ] Davos
method for conda installation (placeholder currently exists)
- [x] initial check for whether conda
executable exists with cached result
- [ ] updating some examples in the docs to show using conda
as the installer
Related to #7
Pip and conda both occasionally deprecate or drop support for certain arguments and add new ones. On the one hand, it'd probably be nice in terms of "least astonishment" for the Onion parser to support all args that the user's local pip/conda executable supports. On the other hand, it could potentially hurt code portability if one of the installers decides to majorly change something at some point.
If we do want to do this, I think these changes are infrequent enough that we could just hard code everything and set up the parsers based on pip.__version__
, etc. We could also parse the output of pip install --help
to gather args and help messages on import, but that would be less efficient I think.
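For reference, the help-parsing option might look something like the sketch below. Both function names are made up for illustration, and the regular expression is a simplification: it picks up any `--long-option` mentioned anywhere in the text, including inside descriptions.

```python
import re
import subprocess
import sys


def pip_install_help():
    """Return the help text of the pip that belongs to this interpreter."""
    result = subprocess.run(
        [sys.executable, '-m', 'pip', 'install', '--help'],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def long_options(help_text):
    """Collect every `--long-option` name mentioned in the help text."""
    return set(re.findall(r'(?<![\w-])--[a-z][a-z0-9-]*', help_text))
```

Calling `long_options(pip_install_help())` spawns a subprocess, which is where the efficiency cost mentioned above comes from if it runs on every import.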
- davos can be used to simplify and enhance sharing reproducible research code!
- smuggle statements that install packages into alternate Python environments (by changing davos.config.pip_executable) now successfully import the package in addition to installing it
- davos's non-interactive mode now ensures the installer program doesn't request user input (in addition to davos itself)
- selenium changes

Full Changelog: https://github.com/ContextLab/davos/compare/v0.1.0...v0.1.1
Initial release! See the README for documentation.
Contextual Dynamics Laboratory at Dartmouth College