pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.

aertslab, updated 🕥 2022-11-21 13:16:07

pySCENIC

|buildstatus| |pypipackage| |docstatus|_

pySCENIC is a lightning-fast python implementation of the SCENIC_ pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.

The pioneering work was done in R and results were published in Nature Methods [1]_. A new and comprehensive description of this Python implementation of the SCENIC pipeline is available in Nature Protocols [4]_.

pySCENIC can be run on a single desktop machine but easily scales to multi-core clusters to analyze thousands of cells in no time. The latter is achieved via the dask_ framework for distributed computing [2]_.

Full documentation for pySCENIC is available on `Read the Docs <https://pyscenic.readthedocs.io/en/latest/>`_.


pySCENIC is part of the SCENIC Suite of tools! See the `main SCENIC website <https://scenic.aertslab.org/>`_ for additional information and a full list of tools available.


News and releases

0.12.1 | 2022-11-21
^^^^^^^^^^^^^^^^^^^

  • Add support for running arboreto_with_multiprocessing.py with spawn instead of fork as the multiprocessing start method for Pool.
  • Use ravel instead of flatten to avoid unnecessary memory copy in aucell
  • Update Docker image file and add separated Docker file for pySCENIC with scanpy.
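
The difference between the spawn and fork start methods named in the first item can be illustrated with the Python standard library alone. The following is a minimal sketch and not the code of arboreto_with_multiprocessing.py; the helper names ``make_context`` and ``absolute_values`` are hypothetical.

```python
import multiprocessing as mp


def make_context(method: str = "spawn"):
    """Return a multiprocessing context for the requested start method.

    'fork' copies the parent process, including everything it has loaded;
    'spawn' starts a fresh interpreter and re-imports the main module.
    """
    if method not in mp.get_all_start_methods():
        raise ValueError(f"start method {method!r} is not available on this platform")
    return mp.get_context(method)


def absolute_values(values, method: str = "spawn", workers: int = 2):
    """Map the built-in abs over values in a Pool created from the chosen context."""
    ctx = make_context(method)
    with ctx.Pool(processes=workers) as pool:
        return pool.map(abs, values)
```

When spawn is used, ``absolute_values`` should be called from under an ``if __name__ == "__main__":`` guard, because every worker re-imports the main module on start-up.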

0.12.0 | 2022-08-16
^^^^^^^^^^^^^^^^^^^

  • Only databases in Feather v2 format are supported now (`ctxcore <https://github.com/aertslab/ctxcore>`_ >= 0.2), which allows the use of recent versions of pyarrow (>=8.0.0) instead of very old ones (<0.17). Databases in the new format can be downloaded from https://resources.aertslab.org/cistarget/databases/ and end with *.genes_vs_motifs.rankings.feather or *.genes_vs_tracks.rankings.feather.
  • Support clustered motif databases.
  • Use custom multiprocessing instead of dask, by default.
  • Docker image uses python 3.10 and contains only needed pySCENIC dependencies for CLI usage.
  • Remove unneeded scripts and notebooks for unused/deprecated database formats.
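
A quick way to see which downloaded ranking databases follow the new naming scheme is to compare file names against the two suffixes quoted in the first item. This is only a sketch: the helper ``split_by_naming`` is hypothetical, and it inspects names only, so it cannot confirm that a file's contents are really Feather v2.

```python
from pathlib import Path

# Suffixes quoted in the 0.12.0 notes for ranking databases in the new format.
FEATHER_V2_SUFFIXES = (
    ".genes_vs_motifs.rankings.feather",
    ".genes_vs_tracks.rankings.feather",
)


def split_by_naming(paths):
    """Separate file names that follow the new naming scheme from the rest.

    Only the names are checked; the files are never opened.
    """
    new_style, other = [], []
    for path in map(Path, paths):
        bucket = new_style if path.name.endswith(FEATHER_V2_SUFFIXES) else other
        bucket.append(path.name)
    return new_style, other
```

Files that land in the second list are candidates for re-downloading from the resources site mentioned above.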

0.11.2 | 2021-05-07
^^^^^^^^^^^^^^^^^^^

  • Split some core cisTarget functions out into a separate repository, `ctxcore <https://github.com/aertslab/ctxcore>`_. This is now a required package for pySCENIC.

0.11.1 | 2021-02-11
^^^^^^^^^^^^^^^^^^^

  • Fix bug in motif url construction (#275)
  • Fix for export2loom with sparse dataframe (#278)
  • Fix sklearn t-SNE import (#285)
  • Updates to Docker image (expose port 8787 for Dask dashboard)

0.11.0 | 2021-02-10
^^^^^^^^^^^^^^^^^^^

Major features:

  • Updated arboreto_ release (GRN inference step) includes:

      • Support for sparse matrices (using the --sparse flag in pyscenic grn, or passing a sparse matrix to grnboost2/genie3).
      • Fixes to avoid dask metadata mismatch error

  • Updated cisTarget:

      • Fix for metadata mismatch in ctx prune2df step
      • Support for databases in Apache Parquet format
      • Faster loading from feather databases
      • Bugfix: loading genes from a database (previously missing the last gene name in the database)

  • Support for Anndata input and output

  • Package updates:

      • Upgrade to newer pandas version
      • Upgrade to newer numba version
      • Upgrade to newer versions of dask, distributed

  • Input checks and more descriptive error messages.

      • Check that regulons loaded are not empty.

  • Bugfixes:

      • In the regulons output from the cisTarget step, the gene weights were incorrectly assigned to their respective target genes (PR #254).
      • Motif url construction fixed when running ctx without pruning
      • Compression of intermediate files in the CLI steps
      • Handle loom files with non-standard gene/cell attribute names
      • Reformat the genesig gmt input/output
      • Fix AUCell output to loom with non-standard loom attributes

0.10.4 | 2020-11-24
^^^^^^^^^^^^^^^^^^^

  • Included new CLI option to add correlation information to the GRN adjacencies file. This can be called with pyscenic add_cor.

See also the extended `Release Notes <https://pyscenic.readthedocs.io/en/latest/releasenotes.html>`_.

Overview

The pipeline has three steps:

  1. First, transcription factors (TFs) and their target genes, together defining a regulon, are derived using gene inference methods which rely solely on correlations between the expression of genes across cells. The arboreto_ package is used for this step.
  2. These regulons are refined by pruning targets that do not have an enrichment for a corresponding motif of the TF, effectively separating direct from indirect targets based on the presence of cis-regulatory footprints.
  3. Finally, the original cells are differentiated and clustered on the activity of these discovered regulons.
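
To give a feel for how regulon activity in a single cell can be scored in step 3, the sketch below computes a normalised area under a recovery curve from a gene ranking. It is a simplified stand-in written for illustration, not pySCENIC's AUCell implementation; the function name ``recovery_auc`` and the ``top_n`` parameter are hypothetical.

```python
def recovery_auc(ranked_genes, gene_set, top_n):
    """Normalised area under the recovery curve of gene_set within the top_n ranks.

    ranked_genes: gene names ordered from highest to lowest expression in one cell.
    gene_set: the target genes of one regulon.
    Returns a value between 0.0 (no target among the top ranks) and 1.0
    (the targets occupy the very top of the ranking).
    """
    targets = set(gene_set)
    top_n = min(top_n, len(ranked_genes))
    if not targets or top_n <= 0:
        return 0.0
    # A target found at 0-based rank r is counted at every curve position
    # from r up to top_n - 1, so it adds (top_n - r) units of area.
    area = sum(top_n - r for r, gene in enumerate(ranked_genes[:top_n]) if gene in targets)
    # Best case: the targets fill ranks 0 .. k-1.
    k = min(len(targets), top_n)
    max_area = sum(top_n - r for r in range(k))
    return area / max_area
```

Computing this score for every regulon in every cell gives a cells-by-regulons activity matrix, which is the kind of input on which cells can then be clustered.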

The most impactful speed improvement is introduced by the arboreto_ package in step 1. This package provides an alternative to GENIE3 [3]_ called GRNBoost2. This package can be controlled from within pySCENIC.

All the functionality of the original R implementation is available and in addition:

  1. You can leverage multi-core and multi-node clusters using dask_ and its distributed_ scheduler.
  2. We implemented a version of the recovery of input genes that takes into account weights associated with these genes.
  3. Regulons, i.e. the regulatory network that connects a TF with its target genes, with targets that are repressed are now also derived and used for cell enrichment analysis.
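
The distinction between activated and repressed targets in the last item can be pictured as a split on the sign of the TF-target expression correlation. The sketch below uses plain Python and is an assumption-laden illustration rather than pySCENIC's code: ``pearson``, ``split_targets`` and the cut-off of 0.03 are illustrative choices.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def split_targets(expression, tf, targets, threshold=0.03):
    """Split a TF's targets into activated and repressed lists by correlation sign.

    expression: dict mapping gene name -> expression values across cells.
    threshold: illustrative cut-off; targets with |rho| <= threshold are left out.
    """
    activated, repressed = [], []
    for gene in targets:
        rho = pearson(expression[tf], expression[gene])
        if rho > threshold:
            activated.append(gene)
        elif rho < -threshold:
            repressed.append(gene)
    return activated, repressed
```

Each of the two lists can then be treated as its own gene set when scoring cells.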

Additional resources

For more information, please visit LCB_, the `main SCENIC website <https://scenic.aertslab.org/>`_, or `SCENIC (R version) <https://github.com/aertslab/SCENIC>`_. There is a tutorial to `create new cisTarget databases <https://github.com/aertslab/create_cisTarget_databases>`_. The CLI to pySCENIC has also been streamlined into a pipeline that can be run with a single command, using the Nextflow workflow manager. There are two Nextflow implementations available:

  • SCENICprotocol_: A Nextflow DSL1 implementation of pySCENIC alongside a basic "best practices" expression analysis. Includes details on pySCENIC installation, usage, and downstream analysis, along with detailed tutorials.
  • VSNPipelines_: A Nextflow DSL2 implementation of pySCENIC with a comprehensive and customizable pipeline for expression analysis. Includes additional pySCENIC features (multi-runs, integrated motif- and track-based regulon pruning, loom file generation).

Acknowledgments

We are grateful to all providers of TF-annotated position weight matrices, in particular Martha Bulyk (UNIPROBE), Wyeth Wasserman and Albin Sandelin (JASPAR), BioBase (TRANSFAC), Scot Wolfe and Michael Brodsky (FlyFactorSurvey) and Timothy Hughes (cisBP).

References

.. [1] Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat Meth 14, 1083–1086 (2017). `doi:10.1038/nmeth.4463 <https://doi.org/10.1038/nmeth.4463>`_
.. [2] Rocklin, M. Dask: parallel computation with blocked algorithms and task scheduling. conference.scipy.org
.. [3] Huynh-Thu, V. A. et al. Inferring regulatory networks from expression data using tree-based methods. PLoS ONE 5, (2010). `doi:10.1371/journal.pone.0012776 <https://doi.org/10.1371/journal.pone.0012776>`_
.. [4] Van de Sande B., Flerin C., et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat Protoc. June 2020:1-30. `doi:10.1038/s41596-020-0336-2 <https://doi.org/10.1038/s41596-020-0336-2>`_

.. |buildstatus| image:: https://travis-ci.org/aertslab/pySCENIC.svg?branch=master
.. _buildstatus: https://travis-ci.org/aertslab/pySCENIC

.. |pypipackage| image:: https://img.shields.io/pypi/v/pySCENIC?color=%23026aab
.. _pypipackage: https://pypi.org/project/pyscenic/

.. |docstatus| image:: https://readthedocs.org/projects/pyscenic/badge/?version=latest
.. _docstatus: http://pyscenic.readthedocs.io/en/latest/?badge=latest

.. _SCENIC: http://scenic.aertslab.org
.. _dask: https://dask.pydata.org/en/latest/
.. _distributed: https://distributed.readthedocs.io/en/latest/
.. _arboreto: https://arboreto.readthedocs.io
.. _LCB: https://aertslab.org
.. _SCENICprotocol: https://github.com/aertslab/SCENICprotocol
.. _VSNPipelines: https://github.com/vib-singlecell-nf/vsn-pipelines
.. _notebooks: https://github.com/aertslab/pySCENIC/tree/master/notebooks
.. _issue: https://github.com/aertslab/pySCENIC/issues/new
.. _PyPI: https://pypi.python.org/pypi/pyscenic

Issues

arboreto_with_multiprocessing.py not able to use specified threads

opened on 2023-03-22 00:59:09 by MarcusLCC

Hi, thanks for developing this amazing method

I'm using pySCENIC (0.12.1) in linux system on HPC (with 80 cores and 500GB RAM). It's a conda environment where the pySCENIC is installed via pip

My expression input is a loom file ~ 1.1GB in size with 27k genes and 60k cells

I've tried the CLI version with code

```
loom_path="seurat_ds_5k.loom"
tf_file="allTFs_hg38.txt"
outdir="18_pyscenic/20230321_downsample5k"

pyscenic grn ${loom_path} ${tf_file} -o ${outdir}/adj.csv --num_workers 20 > ${outdir}/log 2>&1 &
```

where I got the following warnings (though it's still running)

```
2023-03-19 20:40:21,537 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.

2023-03-19 20:44:19,332 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
2023-03-19 22:49:16,009 - distributed.worker - WARNING - Could not find data: {'ndarray-598b1d00ac144024c974f21a1bb7e818': ['tcp://127.0.0.1:44201', 'tcp://127.0.0.1:46424', 'tcp://127.0.0.1:41338', 'tcp://127.0.0.1:38435', 'tcp://127.0.0.1:41291', 'tcp://127.0.0.1:41161', 'tcp://127.0.0.1:37953', 'tcp://127.0.0.1:42931', 'tcp://127.0.0.1:40679']} on workers: [] (who_has: {'ndarray-598b1d00ac144024c974f21a1bb7e818': ['tcp://127.0.0.1:44201', 'tcp://127.0.0.1:46424', 'tcp://127.0.0.1:41338', 'tcp://127.0.0.1:38435', 'tcp://127.0.0.1:41291', 'tcp://127.0.0.1:41161', 'tcp://127.0.0.1:37953', 'tcp://127.0.0.1:42931', 'tcp://127.0.0.1:40679']})
2023-03-19 22:49:16,013 - distributed.scheduler - WARNING - Worker tcp://127.0.0.1:42958 failed to acquire keys: {'ndarray-598b1d00ac144024c974f21a1bb7e818': ('tcp://127.0.0.1:44201', 'tcp://127.0.0.1:46424', 'tcp://127.0.0.1:41338', 'tcp://127.0.0.1:38435', 'tcp://127.0.0.1:41291', 'tcp://127.0.0.1:41161', 'tcp://127.0.0.1:37953', 'tcp://127.0.0.1:42931', 'tcp://127.0.0.1:40679')}
```

I then switched to using arboreto_with_multiprocessing.py using the following code: arboreto_with_multiprocessing.py ${loom_path} ${tf_file} --method grnboost2 --output ${outdir}/adj.tsv --num_workers 15 --seed 777 > log 2>&1 &

When using a monitor like htop to see the actual CPU and memory usage, the programme only runs on 3 cores most of the time, and changing --num_workers 15 to other values like 5 or 20 doesn't make any difference (it still runs on ~3 cores). As the progress seems to be slow, I'm wondering if I'm doing some of my steps wrong.
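
One thing that can be checked independently of pySCENIC is how many cores the process is actually allowed to use on the node, since a job scheduler or container may grant fewer than the machine has. This is only a sketch for that single check and does not diagnose the behaviour described above; ``usable_cpu_count`` and ``capped_workers`` are hypothetical helper names.

```python
import os


def usable_cpu_count():
    """Number of CPUs the current process is allowed to run on.

    os.cpu_count() reports every CPU in the machine, whereas
    os.sched_getaffinity(0) (available on Linux) reports only the CPUs
    this process may use, which a scheduler or container can restrict.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def capped_workers(requested):
    """Cap a requested worker count at the number of usable CPUs (at least 1)."""
    return max(1, min(requested, usable_cpu_count()))
```

If ``usable_cpu_count()`` returns a number well below the core count of the node, the limit comes from the job allocation rather than from the ``--num_workers`` value.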

May I have your advice on it? Advice on both CLI version's warning message and arboreto_with_multiprocessing.py multicores issue is much appreciated. Many thanks!

Best, Marcus

CLI ctx fails on "dask_cluster" mode[BUG]

opened on 2023-03-15 10:59:11 by carlos-a-enriquez

Describe the bug

First of all, I have been running pySCENIC in a distributed cloud environment, not in an HPC environment. My issue occurs when running ctx in the "dask_cluster" mode, which is the only mode suitable for my underlying infrastructure (custom_multiprocessing or even dask_multiprocessing would not allow me to use my resources efficiently).

The source of the bug can be easily identified in this line of code: https://github.com/aertslab/pySCENIC/blob/master/src/pyscenic/cli/pyscenic.py#L243, where args.mode is passed as client_or_address to prune2df().

What is the issue with this? The issue occurs when I choose the "dask_cluster" mode, which would naturally require me to pass my Dask cluster's IP as an extra CLI argument, which would be client_or_address. However, since args.mode is, for some reason, passed as prune2df()'s client_or_address keyword argument, the "dask_cluster" string is obviously rejected since it is not a valid IP address.

The solution would be to switch args.mode with args.client_or_address in this particular line (src/pyscenic/cli/pyscenic.py#L243).
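
To make the proposed change concrete, here is a small stand-alone sketch of the selection logic using argparse. It is not the real pyscenic CLI code: the parser, its default values and the function ``scheduler_target`` are hypothetical, and it assumes that forwarding the mode string should stay as it is for every mode other than "dask_cluster".

```python
import argparse


def build_parser():
    """Hypothetical two-option parser; not the real pyscenic CLI parser."""
    parser = argparse.ArgumentParser(prog="ctx-sketch")
    parser.add_argument("--mode", default="custom_multiprocessing")
    parser.add_argument("--client_or_address", default="local")
    return parser


def scheduler_target(args):
    """Pick the value to hand on as client_or_address.

    In 'dask_cluster' mode the scheduler address given by the user is
    forwarded; in every other mode the mode string itself is forwarded.
    """
    if args.mode == "dask_cluster":
        return args.client_or_address
    return args.mode
```

With this selection, the string "dask_cluster" never reaches the validity check that raises the AssertionError.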

Steps to reproduce the behavior

1. Command run when the error occurred:

```python
!pyscenic ctx {f_adj_csv} \
    {f_db_names} \
    --annotations_fname {MOTIF_ANNOTATIONS_FNAME} \
    --expression_mtx_fname {f_loom_path_scenic} \
    --output {f_motifs_csv} \
    --mask_dropouts \
    --mode "dask_cluster" \
    --client_or_address {dask_scheduler_IP}
```

2. Error encountered:

```pytb
2023-03-15 10:51:11,205 - pyscenic.cli.pyscenic - INFO - Calculating regulons.
Traceback (most recent call last):
  File "/opt/conda/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 713, in main
    args.func(args)
  File "/opt/conda/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 236, in prune_targets_command
    df_motifs = calc_func(
  File "/opt/conda/lib/python3.8/site-packages/pyscenic/prune.py", line 424, in prune2df
    return _distributed_calc(
  File "/opt/conda/lib/python3.8/site-packages/pyscenic/prune.py", line 205, in _distributed_calc
    assert is_valid(
AssertionError: "dask_cluster"is not valid for parameter client_or_address.
```

Expected behavior

I expect {dask_scheduler_IP} to be provided to prune2df() as the corresponding client_or_address argument, not the "dask_cluster" string.

Please complete the following information:

- pySCENIC version: 0.12.1
- Installation method: Pip
- Run environment: CLI through jupyter notebook.
- OS: Ubuntu
- Package versions:

```
# packages in environment at /opt/conda:
#
# Name                    Version                   Build  Channel

_libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 2_gnu conda-forge aiohttp 3.8.3 py38h0a891b7_1 conda-forge aiosignal 1.3.1 pyhd8ed1ab_0 conda-forge alsa-lib 1.2.8 h166bdaf_0 conda-forge anndata 0.8.0 pypi_0 pypi anyio 3.6.2 pyhd8ed1ab_0 conda-forge arboreto 0.1.6 pypi_0 pypi argon2-cffi 21.3.0 pyhd8ed1ab_0 conda-forge argon2-cffi-bindings 21.2.0 py38h0a891b7_3 conda-forge arpack 3.7.0 hdefa2d7_2 conda-forge asttokens 2.2.1 pyhd8ed1ab_0 conda-forge async-timeout 4.0.2 pyhd8ed1ab_0 conda-forge attr 2.5.1 h166bdaf_1 conda-forge attrs 22.2.0 pyh71513ae_0 conda-forge babel 2.11.0 pyhd8ed1ab_0 conda-forge backcall 0.2.0 pyh9f0ad1d_0 conda-forge backports 1.0 pyhd8ed1ab_3 conda-forge backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge beautifulsoup4 4.11.1 pyha770c72_0 conda-forge bleach 5.0.1 pyhd8ed1ab_0 conda-forge blosc 1.21.3 hafa529b_0 conda-forge bokeh 2.4.3 pyhd8ed1ab_3 conda-forge boltons 23.0.0 pypi_0 pypi brotli 1.0.9 h166bdaf_8 conda-forge brotli-bin 1.0.9 h166bdaf_8 conda-forge brotlipy 0.7.0 py38h0a891b7_1005 conda-forge bzip2 1.0.8 h7f98852_4 conda-forge c-ares 1.18.1 h7f98852_0 conda-forge ca-certificates 2022.12.7 ha878542_0 conda-forge cachey 0.2.1 pyh9f0ad1d_0 conda-forge cairo 1.16.0 ha61ee94_1014 conda-forge certifi 2022.12.7 pyhd8ed1ab_0 conda-forge cffi 1.15.1 py38h4a40e3a_3 conda-forge charset-normalizer 2.1.1 pyhd8ed1ab_0 conda-forge click 8.1.3 unix_pyhd8ed1ab_2 conda-forge cloudpickle 2.2.0 pyhd8ed1ab_0 conda-forge colorama 0.4.6 pyhd8ed1ab_0 conda-forge comm 0.1.2 pyhd8ed1ab_0 conda-forge conda 23.1.0 py38h578d9bd_0 conda-forge conda-package-handling 2.0.2 pyh38be061_0 conda-forge conda-package-streaming 0.7.0 pyhd8ed1ab_1 conda-forge contourpy 1.0.7 py38hfbd4bf9_0 conda-forge cryptography 39.0.0 py38h3d167d9_0 conda-forge ctxcore 0.2.0 pypi_0 pypi cycler 0.11.0 pyhd8ed1ab_0 conda-forge cytoolz 0.12.0 py38h0a891b7_1 conda-forge dask 2023.1.0 pyhd8ed1ab_0 conda-forge dask-core 2023.1.0 pyhd8ed1ab_0 conda-forge 
dask-labextension 6.0.0 pyhd8ed1ab_0 conda-forge dbus 1.13.6 h5008d03_3 conda-forge debugpy 1.6.5 py38h8dc9893_0 conda-forge decorator 5.1.1 pyhd8ed1ab_0 conda-forge defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge dill 0.3.6 pypi_0 pypi distributed 2023.1.0 pyhd8ed1ab_0 conda-forge entrypoints 0.4 pyhd8ed1ab_0 conda-forge executing 1.2.0 pyhd8ed1ab_0 conda-forge expat 2.5.0 h27087fc_0 conda-forge fftw 3.3.10 nompi_hf0379b8_106 conda-forge flit-core 3.8.0 pyhd8ed1ab_0 conda-forge fmt 9.1.0 h924138e_0 conda-forge font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge font-ttf-inconsolata 3.000 h77eed37_0 conda-forge font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge font-ttf-ubuntu 0.83 hab24e00_0 conda-forge fontconfig 2.14.2 h14ed4e7_0 conda-forge fonts-conda-ecosystem 1 0 conda-forge fonts-conda-forge 1 0 conda-forge fonttools 4.39.0 py38h1de0b5d_0 conda-forge freetype 2.12.1 hca18f0e_1 conda-forge frozendict 2.3.5 pypi_0 pypi frozenlist 1.3.3 py38h0a891b7_0 conda-forge fsspec 2022.11.0 pyhd8ed1ab_0 conda-forge gettext 0.21.1 h27087fc_0 conda-forge glib 2.74.1 h6239696_1 conda-forge glib-tools 2.74.1 h6239696_1 conda-forge glpk 5.0 h445213a_0 conda-forge gmp 6.2.1 h58526e2_0 conda-forge graphite2 1.3.13 h58526e2_1001 conda-forge gst-plugins-base 1.22.0 h4243ec0_2 conda-forge gstreamer 1.22.0 h25f0c4b_2 conda-forge gstreamer-orc 0.4.33 h166bdaf_0 conda-forge h5py 3.8.0 pypi_0 pypi harfbuzz 6.0.0 h8e241bc_0 conda-forge heapdict 1.0.1 py_0 conda-forge icu 70.1 h27087fc_0 conda-forge idna 3.4 pyhd8ed1ab_0 conda-forge igraph 0.10.3 hb9ddf80_0 conda-forge importlib-metadata 6.0.0 pyha770c72_0 conda-forge importlib-resources 5.10.2 pyhd8ed1ab_0 conda-forge importlib_resources 5.10.2 pyhd8ed1ab_0 conda-forge interlap 0.2.7 pypi_0 pypi ipykernel 6.20.1 pyh210e3f2_0 conda-forge ipython 8.8.0 pyh41d4057_0 conda-forge ipython_genutils 0.2.0 py_1 conda-forge ipywidgets 8.0.4 pyhd8ed1ab_0 conda-forge jack 1.9.22 h11f4161_0 conda-forge jedi 0.18.2 pyhd8ed1ab_0 conda-forge jinja2 
3.1.2 pyhd8ed1ab_1 conda-forge joblib 1.2.0 pypi_0 pypi jpeg 9e h166bdaf_2 conda-forge json5 0.9.5 pyh9f0ad1d_0 conda-forge jsonschema 4.17.3 pyhd8ed1ab_0 conda-forge jupyter-server-proxy 3.2.2 pyhd8ed1ab_0 conda-forge jupyter_client 7.4.9 pyhd8ed1ab_0 conda-forge jupyter_core 5.1.3 py38h578d9bd_0 conda-forge jupyter_events 0.6.3 pyhd8ed1ab_0 conda-forge jupyter_server 2.1.0 pyhd8ed1ab_0 conda-forge jupyter_server_terminals 0.4.4 pyhd8ed1ab_1 conda-forge jupyterlab 3.5.2 pyhd8ed1ab_0 conda-forge jupyterlab_pygments 0.2.2 pyhd8ed1ab_0 conda-forge jupyterlab_server 2.18.0 pyhd8ed1ab_0 conda-forge jupyterlab_widgets 3.0.5 pyhd8ed1ab_0 conda-forge keyutils 1.6.1 h166bdaf_0 conda-forge kiwisolver 1.4.4 py38h43d8883_1 conda-forge krb5 1.20.1 h81ceb04_0 conda-forge lame 3.100 h166bdaf_1003 conda-forge lcms2 2.14 hfd0df8a_1 conda-forge ld_impl_linux-64 2.39 hcc3a1bd_1 conda-forge lerc 4.0.0 h27087fc_0 conda-forge libarchive 3.6.2 h3d51595_0 conda-forge libblas 3.9.0 16_linux64_openblas conda-forge libbrotlicommon 1.0.9 h166bdaf_8 conda-forge libbrotlidec 1.0.9 h166bdaf_8 conda-forge libbrotlienc 1.0.9 h166bdaf_8 conda-forge libcap 2.66 ha37c62d_0 conda-forge libcblas 3.9.0 16_linux64_openblas conda-forge libclang 15.0.7 default_had23c3d_1 conda-forge libclang13 15.0.7 default_h3e3d535_1 conda-forge libcups 2.3.3 h36d4200_3 conda-forge libcurl 7.87.0 hdc1c0ab_0 conda-forge libdb 6.2.32 h9c3ff4c_0 conda-forge libdeflate 1.14 h166bdaf_0 conda-forge libedit 3.1.20191231 he28a2e2_2 conda-forge libev 4.33 h516909a_1 conda-forge libevent 2.1.10 h28343ad_4 conda-forge libffi 3.4.2 h7f98852_5 conda-forge libflac 1.4.2 h27087fc_0 conda-forge libgcc-ng 12.2.0 h65d4601_19 conda-forge libgcrypt 1.10.1 h166bdaf_0 conda-forge libgfortran-ng 12.2.0 h69a702a_19 conda-forge libgfortran5 12.2.0 h337968e_19 conda-forge libglib 2.74.1 h606061b_1 conda-forge libgomp 12.2.0 h65d4601_19 conda-forge libgpg-error 1.46 h620e276_0 conda-forge libhwloc 2.9.0 hd6dc26d_0 conda-forge libiconv 1.17 
h166bdaf_0 conda-forge libjpeg-turbo 2.1.4 h166bdaf_0 conda-forge liblapack 3.9.0 16_linux64_openblas conda-forge libllvm15 15.0.7 hadd5161_0 conda-forge libmamba 1.1.0 hde2b089_3 conda-forge libmambapy 1.1.0 py38h7fa060d_3 conda-forge libnghttp2 1.51.0 hff17c54_0 conda-forge libnsl 2.0.0 h7f98852_0 conda-forge libogg 1.3.4 h7f98852_1 conda-forge libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge libopus 1.3.1 h7f98852_1 conda-forge libpng 1.6.39 h753d276_0 conda-forge libpq 15.2 hb675445_0 conda-forge libsndfile 1.2.0 hb75c966_0 conda-forge libsodium 1.0.18 h36c2ea0_1 conda-forge libsolv 0.7.23 h3eb15da_0 conda-forge libsqlite 3.40.0 h753d276_0 conda-forge libssh2 1.10.0 hf14f497_3 conda-forge libstdcxx-ng 12.2.0 h46fd767_19 conda-forge libsystemd0 252 h2a991cd_0 conda-forge libtiff 4.5.0 h82bc61c_0 conda-forge libtool 2.4.7 h27087fc_0 conda-forge libudev1 253 h0b41bf4_0 conda-forge libuuid 2.32.1 h7f98852_1000 conda-forge libvorbis 1.3.7 h9c3ff4c_0 conda-forge libwebp-base 1.2.4 h166bdaf_0 conda-forge libxcb 1.13 h7f98852_1004 conda-forge libxkbcommon 1.5.0 h79f4944_1 conda-forge libxml2 2.10.3 h7463322_0 conda-forge libzlib 1.2.13 h166bdaf_4 conda-forge llvmlite 0.39.1 pypi_0 pypi locket 1.0.0 pyhd8ed1ab_0 conda-forge loompy 3.0.7 pypi_0 pypi louvain 0.8.0 py38hfa26641_1 conda-forge lz4 4.2.0 py38hd012fdc_0 conda-forge lz4-c 1.9.3 h9c3ff4c_1 conda-forge lzo 2.10 h516909a_1000 conda-forge mamba 1.1.0 py38haad2881_3 conda-forge markupsafe 2.1.1 py38h0a891b7_2 conda-forge matplotlib 3.7.1 py38h578d9bd_0 conda-forge matplotlib-base 3.7.1 py38hd6c3c57_0 conda-forge matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge metis 5.1.0 h58526e2_1006 conda-forge mistune 2.0.4 pyhd8ed1ab_0 conda-forge mpfr 4.2.0 hb012696_0 conda-forge mpg123 1.31.2 hcb278e6_0 conda-forge msgpack-python 1.0.4 py38h43d8883_1 conda-forge multicore-tsne 0.1_d4ff4aab py38h2b96118_2 conda-forge multidict 6.0.4 py38h1de0b5d_0 conda-forge multiprocessing-on-dill 3.5.0a4 pypi_0 pypi munkres 1.1.4 
pyh9f0ad1d_0 conda-forge mysql-common 8.0.32 ha901b37_0 conda-forge mysql-libs 8.0.32 hd7da12d_0 conda-forge natsort 8.3.1 pypi_0 pypi nbclassic 0.4.8 pyhd8ed1ab_0 conda-forge nbclient 0.7.2 pyhd8ed1ab_0 conda-forge nbconvert 7.2.7 pyhd8ed1ab_0 conda-forge nbconvert-core 7.2.7 pyhd8ed1ab_0 conda-forge nbconvert-pandoc 7.2.7 pyhd8ed1ab_0 conda-forge nbformat 5.7.3 pyhd8ed1ab_0 conda-forge ncurses 6.3 h27087fc_1 conda-forge nest-asyncio 1.5.6 pyhd8ed1ab_0 conda-forge networkx 3.0 pypi_0 pypi nomkl 1.0 h5ca1d4c_0 conda-forge notebook 6.5.2 pyha770c72_1 conda-forge notebook-shim 0.2.2 pyhd8ed1ab_0 conda-forge nspr 4.35 h27087fc_0 conda-forge nss 3.89 he45b914_0 conda-forge numba 0.56.4 pypi_0 pypi numexpr 2.8.4 pypi_0 pypi numpy 1.23.5 pypi_0 pypi numpy-groupies 0.9.20 pypi_0 pypi openjpeg 2.5.0 hfec8fc6_2 conda-forge openssl 3.0.8 h0b41bf4_0 conda-forge packaging 23.0 pyhd8ed1ab_0 conda-forge pandas 1.5.2 py38hdc8b05c_2 conda-forge pandoc 2.19.2 h32600fe_1 conda-forge pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge parso 0.8.3 pyhd8ed1ab_0 conda-forge partd 1.3.0 pyhd8ed1ab_0 conda-forge patsy 0.5.3 pyhd8ed1ab_0 conda-forge pcre2 10.40 hc3806b6_0 conda-forge pexpect 4.8.0 pyh1a96a4e_2 conda-forge pickleshare 0.7.5 py_1003 conda-forge pillow 9.4.0 py38hb32c036_0 conda-forge pip 22.3.1 pyhd8ed1ab_0 conda-forge pixman 0.40.0 h36c2ea0_0 conda-forge pkgutil-resolve-name 1.3.10 pyhd8ed1ab_0 conda-forge platformdirs 2.6.2 pyhd8ed1ab_0 conda-forge pluggy 1.0.0 pyhd8ed1ab_5 conda-forge ply 3.11 py_1 conda-forge pooch 1.7.0 pyhd8ed1ab_0 conda-forge prometheus_client 0.15.0 pyhd8ed1ab_0 conda-forge prompt-toolkit 3.0.36 pyha770c72_0 conda-forge psutil 5.9.4 py38h0a891b7_0 conda-forge pthread-stubs 0.4 h36c2ea0_1001 conda-forge ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge pulseaudio 16.1 ha8d29e2_1 conda-forge pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge pyarrow 11.0.0 pypi_0 pypi pybind11-abi 4 hd8ed1ab_3 conda-forge pycosat 0.6.4 py38h0a891b7_1 conda-forge pycparser 2.21 pyhd8ed1ab_0 
conda-forge pygments 2.14.0 pyhd8ed1ab_0 conda-forge pynndescent 0.5.8 pypi_0 pypi pyopenssl 23.0.0 pyhd8ed1ab_0 conda-forge pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge pyqt 5.15.7 py38ha0d8c90_3 conda-forge pyqt5-sip 12.11.0 py38h8dc9893_3 conda-forge pyrsistent 0.19.3 py38h1de0b5d_0 conda-forge pyscenic 0.12.1 pypi_0 pypi pysocks 1.7.1 pyha2e5f31_6 conda-forge python 3.8.15 h4a9ceb5_0_cpython conda-forge python-blosc 1.10.6 py38h8f669ce_1 conda-forge python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge python-fastjsonschema 2.16.2 pyhd8ed1ab_0 conda-forge python-igraph 0.10.4 py38hd98a34f_0 conda-forge python-json-logger 2.0.4 pyhd8ed1ab_0 conda-forge python_abi 3.8 3_cp38 conda-forge pytz 2022.7 pyhd8ed1ab_0 conda-forge pyyaml 6.0 py38h0a891b7_5 conda-forge pyzmq 25.0.0 py38he24dcef_0 conda-forge qt-main 5.15.8 h5d23da1_6 conda-forge readline 8.1.2 h0f457ee_0 conda-forge reproc 14.2.4 h0b41bf4_0 conda-forge reproc-cpp 14.2.4 hcb278e6_0 conda-forge requests 2.28.1 pyhd8ed1ab_1 conda-forge rfc3339-validator 0.1.4 pyhd8ed1ab_0 conda-forge rfc3986-validator 0.1.1 pyh9f0ad1d_0 conda-forge ruamel.yaml 0.17.21 py38h0a891b7_2 conda-forge ruamel.yaml.clib 0.2.7 py38h1de0b5d_1 conda-forge scanpy 1.9.3 pypi_0 pypi scikit-learn 1.2.2 pypi_0 pypi scipy 1.10.1 py38h10c12cc_0 conda-forge seaborn 0.12.2 hd8ed1ab_0 conda-forge seaborn-base 0.12.2 pyhd8ed1ab_0 conda-forge send2trash 1.8.0 pyhd8ed1ab_0 conda-forge session-info 1.0.0 pypi_0 pypi setuptools 65.6.3 pyhd8ed1ab_0 conda-forge simpervisor 0.4 pyhd8ed1ab_0 conda-forge sip 6.7.7 py38h8dc9893_0 conda-forge six 1.16.0 pyh6c4a22f_0 conda-forge snappy 1.1.9 hbd366e4_2 conda-forge sniffio 1.3.0 pyhd8ed1ab_0 conda-forge sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge soupsieve 2.3.2.post1 pyhd8ed1ab_0 conda-forge stack_data 0.6.2 pyhd8ed1ab_0 conda-forge statsmodels 0.13.5 py38h26c90d9_2 conda-forge stdlib-list 0.8.0 pypi_0 pypi streamz 0.6.4 pyh6c4a22f_0 conda-forge suitesparse 5.10.1 h9e50725_1 conda-forge tbb 2021.8.0 hf52228f_0 
conda-forge tblib 1.7.0 pyhd8ed1ab_0 conda-forge terminado 0.17.1 pyh41d4057_0 conda-forge texttable 1.6.7 pyhd8ed1ab_0 conda-forge threadpoolctl 3.1.0 pypi_0 pypi tinycss2 1.2.1 pyhd8ed1ab_0 conda-forge tk 8.6.12 h27826a3_0 conda-forge toml 0.10.2 pyhd8ed1ab_0 conda-forge tomli 2.0.1 pyhd8ed1ab_0 conda-forge toolz 0.12.0 pyhd8ed1ab_0 conda-forge tornado 6.2 py38h0a891b7_1 conda-forge tqdm 4.64.1 pyhd8ed1ab_0 conda-forge traitlets 5.8.1 pyhd8ed1ab_0 conda-forge typing-extensions 4.4.0 hd8ed1ab_0 conda-forge typing_extensions 4.4.0 pyha770c72_0 conda-forge umap-learn 0.5.3 pypi_0 pypi unicodedata2 15.0.0 py38h0a891b7_0 conda-forge urllib3 1.26.14 pyhd8ed1ab_0 conda-forge wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge webencodings 0.5.1 py_1 conda-forge websocket-client 1.4.2 pyhd8ed1ab_0 conda-forge wheel 0.38.4 pyhd8ed1ab_0 conda-forge widgetsnbextension 4.0.5 pyhd8ed1ab_0 conda-forge xcb-util 0.4.0 h166bdaf_0 conda-forge xcb-util-image 0.4.0 h166bdaf_0 conda-forge xcb-util-keysyms 0.4.0 h166bdaf_0 conda-forge xcb-util-renderutil 0.3.9 h166bdaf_0 conda-forge xcb-util-wm 0.4.1 h166bdaf_0 conda-forge xkeyboard-config 2.38 h0b41bf4_0 conda-forge xorg-kbproto 1.0.7 h7f98852_1002 conda-forge xorg-libice 1.0.10 h7f98852_0 conda-forge xorg-libsm 1.2.3 hd9c2040_1000 conda-forge xorg-libx11 1.8.4 h0b41bf4_0 conda-forge xorg-libxau 1.0.9 h7f98852_0 conda-forge xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge xorg-libxext 1.3.4 h0b41bf4_2 conda-forge xorg-libxrender 0.9.10 h7f98852_1003 conda-forge xorg-renderproto 0.11.1 h7f98852_1002 conda-forge xorg-xextproto 7.3.0 h0b41bf4_1003 conda-forge xorg-xproto 7.0.31 h7f98852_1007 conda-forge xz 5.2.6 h166bdaf_0 conda-forge yaml 0.2.5 h7f98852_2 conda-forge yaml-cpp 0.7.0 h27087fc_2 conda-forge yarl 1.8.2 py38h0a891b7_0 conda-forge zeromq 4.3.4 h9c3ff4c_1 conda-forge zict 2.2.0 pyhd8ed1ab_0 conda-forge zipp 3.11.0 pyhd8ed1ab_0 conda-forge zlib 1.2.13 h166bdaf_4 conda-forge zstandard 0.19.0 py38h5945529_1 conda-forge zstd 1.5.2 h6239696_4 
conda-forge ```

Bug Modules for iRegulons

opened on 2023-03-07 00:35:15 by StevenTur

Hi,

I got a bug when I want to create the modules for iRegulons.

here is my error :

image

Can you help me?

Thank you,

Steven

f_db_names outputs empty

opened on 2023-03-06 06:01:09 by Jiayi-Zheng

Hello, I was working step-by-step with https://github.com/aertslab/SCENICprotocol/blob/master/notebooks/PBMC10k_SCENIC-protocol-CLI.ipynb During STEP 2-3: Regulon prediction aka cisTarget from CLI, I did:

```
import glob

# ranking databases
f_db_glob = "/gene_based/*feather"
f_db_names = ' '.join( glob.glob(f_db_glob) )
```

I downloaded all six files from https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg38/refseq_r80/mc9nr/gene_based/ (also tried https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg38/refseq_r80/mc_v10_clust/gene_based/ ) and named the folder containing these six files as gene_based. However, when I wanted to check the output of f_db_names it gives blank outcome. I'm not sure what is happening here...?
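
For what it's worth, a pattern that starts with `/gene_based/` is resolved from the filesystem root, and `glob.glob` returns an empty list without raising when nothing matches, so the joined string comes out empty. Below is a minimal sketch of the same lookup with the folder path built explicitly, assuming the folder actually lives somewhere else; a temporary directory stands in for it here, and the helper `find_databases` and the file names are made up.

```python
import glob
import os
import tempfile


def find_databases(folder, pattern="*.feather"):
    """Return the sorted list of files in folder that match pattern."""
    return sorted(glob.glob(os.path.join(folder, pattern)))


# Build a throwaway folder so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    db_dir = os.path.join(tmp, "gene_based")
    os.makedirs(db_dir)
    for name in ("a.rankings.feather", "b.rankings.feather"):
        open(os.path.join(db_dir, name), "w").close()

    found = find_databases(db_dir)
    f_db_names = " ".join(found)

    # A folder that does not exist gives an empty list, not an error.
    missing = find_databases(os.path.join(tmp, "no_such_folder"))
```

Printing `os.getcwd()` and the full pattern before calling `glob.glob` is a quick way to confirm where the lookup is really happening.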

Meanwhile, I'd really appreciate it if you could explain a bit more about this sentence: "When using the modules_from_adjacencies function directly in python instead of via the command line, the rho_mask_dropouts option can be used to control this." I'm still not sure why running this step in Python is better than running it from the command line directly (and I'm currently having a problem with it: I can run pyscenic from the command line with the conda env activated, but from a notebook kernel in the same conda env, !pyscenic returns /bin/bash: pyscenic: command not found)...

Thank you very much!

[BUG]pyscenic aucell generated totally same loom file as the input loom, but can be correct if "csv"

opened on 2023-03-03 20:41:36 by KOBE24DUNK

Hi, I've tried the whole pyscenic workflow several times, but every time on step 3 ('pyscenic aucell') I encounter this bug:

    pyscenic aucell \
        /home/aj186/10X_oJIA_v3_re-analysis/SF_Teff+Treg/data/3_integrated_Seurat_SF_Teff+Treg_filtered.loom \
        $res_dir/3A_integrated_SF_Teff+Treg_pySCENIC_CTX_regulons.csv \
        --output $res_dir/3A_integrated_SF_Teff+Treg_2022-05-16_pySCENIC_filtered.loom \
        --num_workers 20

If I set the output extension to csv, I get the right Regulon_AUC.csv file; but if I set it to loom (like above), the CLI gives me back a loom file identical to the input loom file. It is very strange. I need this output.loom to generate a series of plots by running "add.visualization.py". Any suggestions will be much appreciated.

AttributeError: module 'scanpy' has no attribute 'utils'

opened on 2023-02-13 06:42:47 by yanwuanxin

when run sc.utils.sanitize_anndata(adata), it returns: AttributeError: module 'scanpy' has no attribute 'utils'

I didn't find any information about sanitize_anndata under the path of scanpy, does anyone encounter a similar error? the version of scanpy which i use is 1.8.2

Releases

0.12.1 2022-11-21 12:46:12

Updates:

  • Add support for running arboreto_with_multiprocessing.py with spawn instead of fork as the multiprocessing start method for Pool.
  • Use ravel instead of flatten to avoid unnecessary memory copy in aucell
  • Update Docker image file and add separated Docker file for pySCENIC with scanpy.

0.12.0 2022-08-16 12:35:46

Updates:

  • Only databases in Feather v2 format are supported now (ctxcore >= 0.2), which allows the use of recent versions of pyarrow (>=8.0.0) instead of very old ones (<0.17). Databases in the new format can be downloaded from https://resources.aertslab.org/cistarget/databases/ and end with *.genes_vs_motifs.rankings.feather or *.genes_vs_tracks.rankings.feather.
  • Support clustered motif databases.
  • Use custom multiprocessing instead of dask, by default.
  • Docker image uses python 3.10 and contains only needed pySCENIC dependencies for CLI usage.
  • Remove unneeded scripts and notebooks for unused/deprecated database formats.

0.11.2 2021-05-07 19:30:13

Major changes:

  • Split some core cisTarget functions out into a separate repository, ctxcore. This is now a required package for pySCENIC.
  • Documentation updates

0.11.1 2021-04-16 14:14:35

  • Fix bug in motif url construction (#275)
  • Fix for export2loom with sparse dataframe (#278)
  • Fix sklearn t-SNE import (#285)
  • Updates to Docker image (expose port 8787 for Dask dashboard)

0.11.0 2021-02-10 13:50:24

Major features:

  • Updated Arboreto release (GRN inference step) includes:

      • Support for sparse matrices (using the --sparse flag in pyscenic grn, or passing a sparse matrix to grnboost2/genie3).
      • Fixes to avoid dask metadata mismatch error

  • Updated cisTarget:

      • Fix for metadata mismatch in ctx prune2df step
      • Support for databases in Apache Parquet format
      • Faster loading from feather databases
      • Bugfix: loading genes from a database (previously missing the last gene name in the database)

  • Support for Anndata input and output

  • Package updates:

      • Upgrade to newer pandas version
      • Upgrade to newer numba version
      • Upgrade to newer versions of dask, distributed

  • Input checks and more descriptive error messages.

      • Check that regulons loaded are not empty.

  • Bugfixes:

      • In the regulons output from the cisTarget step, the gene weights were incorrectly assigned to their respective target genes (PR #254).
      • Motif url construction fixed when running ctx without pruning
      • Compression of intermediate files in the CLI steps
      • Handle loom files with non-standard gene/cell attribute names
      • Reformat the genesig gmt input/output
      • Fix AUCell output to loom with non-standard loom attributes

0.10.4 2020-11-24 10:40:43

Updates:

  • Included new (optional) CLI option to add correlation information to the GRN adjacencies file. This can be called with pyscenic add_cor. (vib-singlecell-nf/vsn-pipelines/issues/254)
  • The correlation calculation is subsequently skipped if using this adjacencies + correlations file as the input into pyscenic ctx.

single-cell transcriptomics gene-regulatory-network transcription-factors