Utils and modules for Speech, Language and Multimodal processing using PyTorch and PyTorch Lightning

georgepar, updated 2023-02-16 03:00:13

slp

slp is a framework for fast and reproducible development of multimodal models, with emphasis on NLP models.

It started as a collection of scripts and code I wrote / collected during my PhD and it evolves accordingly.

As such, the framework is opinionated and it follows a convention over configuration approach.

A heavy emphasis is put on:

  • Enforcing best practices and reproducibility of experiments
  • Making common things fast at the top level, without having to go through extensive configuration options
  • Remaining extendable. Extensions and modules for more use cases should be easy to add
  • Extensive logging and experiment management out of the box
  • Separating dirty / scratch code (at the script level, for quick changes) from clean / polished code (at the library level)
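
As an illustration of the reproducibility point, a common first step is to fix the random seeds before an experiment. The sketch below is not slp's own API: the helper name `set_seed` is hypothetical, and it only seeds NumPy and PyTorch if they happen to be installed.

```python
import os
import random


def set_seed(seed: int = 42) -> int:
    """Seed the stdlib RNG and, when available, NumPy and PyTorch (hypothetical helper)."""
    random.seed(seed)
    # Only affects Python subprocesses started after this point,
    # not the hash seed of the interpreter that is already running.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional for this sketch
    try:
        import torch

        torch.manual_seed(seed)  # seeds PyTorch's default generator
    except ImportError:
        pass  # PyTorch is optional for this sketch
    return seed


if __name__ == "__main__":
    set_seed(123)
    first = random.random()
    set_seed(123)
    # The same seed reproduces the same draw.
    print(first == random.random())
```

PyTorch Lightning also ships a `seed_everything` function that serves the same purpose; whether and where slp calls it is not stated here.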

This is currently in alpha release under active development, so things may break and new features will be added.

Dependencies

We use PyTorch (1.7) and the following libraries

Installation

You can use slp as an external library by installing from PyPI with

pip install slp

Or you can clone it from github

git clone git@github.com:georgepar/slp

We use poetry for dependency management

When you clone the repo run:

pip install poetry
poetry install

and a clean environment with all the dependencies will be created. You can access it with poetry shell.

Note: Wandb logging is enabled by default. You can either

  • Create an account and run wandb login when you clone the repo on a new machine, to store the results in the online managed environment
  • Run wandb offline when you clone the repo to disable remote sync, or use the --offline command line argument in your scripts
  • Use one of their self-hosted solutions
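
If you launch training from Python rather than from a shell, the offline option can also be selected through the `WANDB_MODE` environment variable, which the wandb client reads. The sketch below is an assumption-laden illustration: the helper name `wandb_offline_env` and the script name `train.py` are hypothetical, and it does not show how slp's `--offline` argument is implemented.

```python
import os
from typing import Mapping, Optional


def wandb_offline_env(base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with W&B remote sync turned off."""
    env = dict(os.environ if base_env is None else base_env)
    # "offline" keeps run data on local disk instead of syncing it.
    env["WANDB_MODE"] = "offline"
    return env


if __name__ == "__main__":
    # Hypothetical usage: pass the environment to a script you launch, e.g.
    # subprocess.run(["python", "train.py"], env=wandb_offline_env(), check=True)
    print(wandb_offline_env({})["WANDB_MODE"])
```

Returning a copy instead of mutating `os.environ` keeps the setting scoped to the process you start with it.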

Create a new project based on slp

You can use the template at https://github.com/georgepar/cookiecutter-pytorch-slp to create a new project based on slp

```
pip install cookiecutter poetry
cookiecutter gh:georgepar/cookiecutter-pytorch-slp
# Follow the interactive configuration and a new folder with the project name you provided will appear
cd $PROJECT_NAME
poetry install # Installs slp and all other dependencies
```

And you are good to go. Follow the instructions in the README of the new project you created. Happy coding

Contributing

You are welcome to open issues / PRs with improvements and bug fixes.

Since this is mostly a personal project based around workflows and practices that work for me, I don't guarantee I will accept every change, but I'm always open to discussion.

If you are going to contribute, please use the pre-commit hooks under hooks, otherwise the PR will not pass the CI. And never, ever touch requirements.txt by hand; it is exported automatically from poetry.

```bash
# Write the pre-commit hook script
cat <<EOT >> .git/hooks/pre-commit
#!/usr/bin/env bash

bash hooks/export-requirements-txt
bash hooks/checks
EOT

chmod +x .git/hooks/pre-commit # Keep an up-to-date requirements.txt and run linting, typechecking and tests

ln -s $(pwd)/hooks/commit-msg .git/hooks/commit-msg # Sign-off your commit
```

Cite

If you use this code for your research, please include the following citation

@ONLINE {,
    author = "Georgios Paraskevopoulos",
    title  = "slp",
    year   = "2020",
    url    = "https://github.com/georgepar/slp"
}

Roadmap

  • Optuna integration for hyperparameter tuning
  • Add dataloaders for popular multimodal datasets
  • Add multimodal architectures
  • Add RIM, DNC and Kanerva machine implementations
  • Write unit tests

Issues

Bump werkzeug from 1.0.1 to 2.2.3

opened on 2023-02-16 03:00:13 by dependabot[bot]

Bumps werkzeug from 1.0.1 to 2.2.3.

Release notes

Sourced from werkzeug's releases.

2.2.3

This is a fix release for the 2.2.x release branch.

This release contains security fixes for:

2.2.2

This is a fix release for the 2.2.0 feature release.

2.2.1

This is a fix release for the 2.2.0 feature release.

2.2.0

This is a feature release, which includes new features and removes previously deprecated features. The 2.2.x branch is now the supported bugfix branch, the 2.1.x branch will become a tag marking the end of support for that branch. We encourage everyone to upgrade, and to use a tool such as pip-tools to pin all dependencies and control upgrades.

2.1.2

This is a fix release for the 2.1.0 feature release.

2.1.1

This is a fix release for the 2.1.0 feature release.

2.1.0

This is a feature release, which includes new features and removes previously deprecated features. The 2.1.x branch is now the supported bugfix branch, the 2.0.x branch will become a tag marking the end of support for that branch. We encourage everyone to upgrade, and to use a tool such as pip-tools to pin all dependencies and control upgrades.

2.0.3

... (truncated)

Changelog

Sourced from werkzeug's changelog.

Version 2.2.3

Released 2023-02-14

  • Ensure that URL rules using path converters will redirect with strict slashes when the trailing slash is missing. :issue:2533
  • Type signature for get_json specifies that return type is not optional when silent=False. :issue:2508
  • parse_content_range_header returns None for a value like bytes */-1 where the length is invalid, instead of raising an AssertionError. :issue:2531
  • Address remaining ResourceWarning related to the socket used by run_simple. Remove prepare_socket, which now happens when creating the server. :issue:2421
  • Update pre-existing headers for multipart/form-data requests with the test client. :issue:2549
  • Fix handling of header extended parameters such that they are no longer quoted. :issue:2529
  • LimitedStream.read works correctly when wrapping a stream that may not return the requested size in one read call. :issue:2558
  • A cookie header that starts with = is treated as an empty key and discarded, rather than stripping the leading ==.
  • Specify a maximum number of multipart parts, default 1000, after which a RequestEntityTooLarge exception is raised on parsing. This mitigates a DoS attack where a larger number of form/file parts would result in disproportionate resource use.

Version 2.2.2

Released 2022-08-08

  • Fix router to restore the 2.1 strict_slashes == False behaviour whereby leaf-requests match branch rules and vice versa. :pr:2489
  • Fix router to identify invalid rules rather than hang parsing them, and to correctly parse / within converter arguments. :pr:2489
  • Update subpackage imports in :mod:werkzeug.routing to use the import as syntax for explicitly re-exporting public attributes. :pr:2493
  • Parsing of some invalid header characters is more robust. :pr:2494
  • When starting the development server, a warning not to use it in a production deployment is always shown. :issue:2480
  • LocalProxy.__wrapped__ is always set to the wrapped object when the proxy is unbound, fixing an issue in doctest that would cause it to fail. :issue:2485
  • Address one ResourceWarning related to the socket used by run_simple. :issue:2421

... (truncated)

Commits
  • 22a254f release version 2.2.3
  • 517cac5 Merge pull request from GHSA-xg9f-g7g7-2323
  • babc8d9 rewrite docs about request data limits
  • 09449ee clean up docs
  • fe899d0 limit the maximum number of multipart form parts
  • cf275f4 Merge pull request from GHSA-px8h-6qxv-m22q
  • 8c2b4b8 don't strip leading = when parsing cookie
  • 7c7ce5c [pre-commit.ci] pre-commit autoupdate (#2585)
  • 19ae03e [pre-commit.ci] auto fixes from pre-commit.com hooks
  • a83d3b8 [pre-commit.ci] pre-commit autoupdate
  • Additional commits viewable in compare view


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
  • `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
  • `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
  • `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/georgepar/slp/network/alerts).

Bump ipython from 7.21.0 to 8.10.0

opened on 2023-02-10 23:45:29 by dependabot[bot]

Bumps ipython from 7.21.0 to 8.10.0.

Commits
  • 15ea1ed release 8.10.0
  • 560ad10 DOC: Update what's new for 8.10 (#13939)
  • 7557ade DOC: Update what's new for 8.10
  • 385d693 Merge pull request from GHSA-29gw-9793-fvw7
  • e548ee2 Swallow potential exceptions from showtraceback() (#13934)
  • 0694b08 MAINT: mock slowest test. (#13885)
  • 8655912 MAINT: mock slowest test.
  • a011765 Isolate the attack tests with setUp and tearDown methods
  • c7a9470 Add some regression tests for this change
  • fd34cf5 Swallow potential exceptions from showtraceback()
  • Additional commits viewable in compare view



Bump certifi from 2020.12.5 to 2022.12.7

opened on 2022-12-08 08:51:29 by dependabot[bot]

Bumps certifi from 2020.12.5 to 2022.12.7.


Bump pillow from 7.2.0 to 9.3.0

opened on 2022-11-22 06:52:18 by dependabot[bot]

Bumps pillow from 7.2.0 to 9.3.0.

Release notes

Sourced from pillow's releases.

9.3.0

https://pillow.readthedocs.io/en/stable/releasenotes/9.3.0.html

Changes

... (truncated)

Changelog

Sourced from pillow's changelog.

9.3.0 (2022-10-29)

  • Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [wiredfool]

  • Initialize libtiff buffer when saving #6699 [radarhere]

  • Inline fname2char to fix memory leak #6329 [nulano]

  • Fix memory leaks related to text features #6330 [nulano]

  • Use double quotes for version check on old CPython on Windows #6695 [hugovk]

  • Remove backup implementation of Round for Windows platforms #6693 [cgohlke]

  • Fixed set_variation_by_name offset #6445 [radarhere]

  • Fix malloc in _imagingft.c:font_setvaraxes #6690 [cgohlke]

  • Release Python GIL when converting images using matrix operations #6418 [hmaarrfk]

  • Added ExifTags enums #6630 [radarhere]

  • Do not modify previous frame when calculating delta in PNG #6683 [radarhere]

  • Added support for reading BMP images with RLE4 compression #6674 [npjg, radarhere]

  • Decode JPEG compressed BLP1 data in original mode #6678 [radarhere]

  • Added GPS TIFF tag info #6661 [radarhere]

  • Added conversion between RGB/RGBA/RGBX and LAB #6647 [radarhere]

  • Do not attempt normalization if mode is already normal #6644 [radarhere]

... (truncated)


Bump protobuf from 3.15.5 to 3.18.3

opened on 2022-09-23 20:51:46 by dependabot[bot]

Bumps protobuf from 3.15.5 to 3.18.3.

Release notes

Sourced from protobuf's releases.

Protocol Buffers v3.18.3

C++

Protocol Buffers v3.16.1

Java

  • Improve performance characteristics of UnknownFieldSet parsing (#9371)

Protocol Buffers v3.18.2

Java

  • Improve performance characteristics of UnknownFieldSet parsing (#9371)

Protocol Buffers v3.18.1

Python

  • Update setup.py to reflect that we now require at least Python 3.5 (#8989)
  • Performance fix for DynamicMessage: force GetRaw() to be inlined (#9023)

Ruby

  • Update ruby_generator.cc to allow proto2 imports in proto3 (#9003)

Protocol Buffers v3.18.0

C++

  • Fix warnings raised by clang 11 (#8664)
  • Make StringPiece constructible from std::string_view (#8707)
  • Add missing capability attributes for LLVM 12 (#8714)
  • Stop using std::iterator (deprecated in C++17). (#8741)
  • Move field_access_listener from libprotobuf-lite to libprotobuf (#8775)
  • Fix #7047 Safely handle setlocale (#8735)
  • Remove deprecated version of SetTotalBytesLimit() (#8794)
  • Support arena allocation of google::protobuf::AnyMetadata (#8758)
  • Fix undefined symbol error around SharedCtor() (#8827)
  • Fix default value of enum(int) in json_util with proto2 (#8835)
  • Better Smaller ByteSizeLong
  • Introduce event filters for inject_field_listener_events
  • Reduce memory usage of DescriptorPool
  • For lazy fields copy serialized form when allowed.
  • Re-introduce the InlinedStringField class
  • v2 access listener
  • Reduce padding in the proto's ExtensionRegistry map.
  • GetExtension performance optimizations
  • Make tracker a static variable rather than call static functions
  • Support extensions in field access listener
  • Annotate MergeFrom for field access listener
  • Fix incomplete types for field access listener
  • Add map_entry/new_map_entry to SpecificField in MessageDifferencer. They record the map items which are different in MessageDifferencer's reporter.
  • Reduce binary size due to fieldless proto messages
  • TextFormat: ParseInfoTree supports getting field end location in addition to start.

... (truncated)


Bump mako from 1.1.4 to 1.2.2

opened on 2022-09-16 18:46:54 by dependabot[bot]

Bumps mako from 1.1.4 to 1.2.2.

Release notes

Sourced from mako's releases.

1.2.2

Released: Mon Aug 29 2022

bug

  • [bug] [lexer] Fixed issue in lexer where the regexp used to match tags would not correctly interpret quoted sections individually. While this parsing issue still produced the same expected tag structure later on, the mis-handling of quoted sections was also subject to a regexp crash if a tag had a large number of quotes within its quoted sections.

    References: #366

1.2.1

Released: Thu Jun 30 2022

bug

  • [bug] [tests] Various fixes to the test suite in the area of exception message rendering to accommodate for variability in Python versions as well as Pygments.

    References: #360

misc

  • [performance] Optimized some codepaths within the lexer/Python code generation process, improving performance for generation of templates prior to their being cached. Pull request courtesy Takuto Ikuta.

    References: #361

1.2.0

Released: Thu Mar 10 2022

changed

  • [changed] [py3k] Corrected "universal wheel" directive in setup.cfg so that building a wheel does not target Python 2.

    References: #351

  • [changed] [py3k] The bytestring_passthrough template argument is removed, as this flag only applied to Python 2.

... (truncated)


Releases

SLP Version 1.1.5 2021-03-10 23:05:14

  • Add Hyperparameter Tuning capabilities with ray tune
  • Cleanup code smells and progress on documentation
  • Various bug fixes
Giorgos Paraskevopoulos

PhD candidate at National Technical University of Athens

GitHub Repository

multimodal multimodal-learning multimodal-deep-learning pytorch pytorch-lightning wandb natural-language-processing