An ML project template with sensible defaults:
- Dockerised dev setup
- Unit test setup
- Automated tests for model metrics
- CI pipeline as code
For infrastructure-related stuff (e.g. provisioning of CI server, deployments, etc.), please refer to https://github.com/ThoughtWorksInc/ml-cd-starter-kit.
- Clone the repository: `git clone https://github.com/YOUR_USERNAME/ml-app-template`
- Install dependencies: `pipenv install`
- Activate the environment with `pipenv shell`
- To run anything without activating the virtual environment (for example, nosetests), use `pipenv run nosetests`
```shell
# build the image (Mac/Linux)
docker build . -t ml-app-template
# build the image (Windows, git bash)
MSYS_NO_PATHCONV=1 docker build . -t ml-app-template
# start a container (Mac/Linux)
docker run -it -v $(pwd):/home/ml-app-template \
  -p 8080:8080 \
  -p 8888:8888 \
  ml-app-template bash
# start a container (Windows, git bash)
# to get the full path, run pwd in git bash and manually replace forward slashes (/) with double backslashes (\\)
winpty docker run -it -v C:\\Users\\path\\to\\your\\ml-app-template:/home/ml-app-template -p 8080:8080 -p 8888:8888 ml-app-template bash
```
You're ready to roll! Here are some common commands that you can run in your dev workflow. Run these in the container.
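```shell
# add some color to your terminal
source bin/color_my_terminal.sh
# activate the virtual environment
pipenv shell
# run unit tests (second form: watch mode with colored output)
nosetests
nosetests --with-watch --rednose --nologcapture
# train the model with MLflow disabled
SHOULD_USE_MLFLOW=false python src/train.py
# start the app
python src/app.py
# send a prediction request to the app, locally or at a deployed URL
bin/predict.sh http://localhost:8080
bin/predict.sh http://my-app.com
```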
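For readers who prefer to call the app from Python rather than through `bin/predict.sh`, the following is a minimal sketch of building such a request with the standard library. It is not the contents of `bin/predict.sh`: the `/predict` path, the JSON payload shape and the `rooms` feature name are assumptions made for illustration, so adjust them to whatever your app actually exposes.

```python
import json
import urllib.request


def build_prediction_request(base_url, features, path="/predict"):
    """Build (but do not send) a JSON POST request.

    The "/predict" path is a hypothetical endpoint, not confirmed by the template.
    """
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, features, timeout=5):
    """Send the request and decode the JSON response (needs a running app)."""
    request = build_prediction_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call, once the app is running (feature name is made up):
# predict("http://localhost:8080", {"rooms": 6.0})
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without a running server.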
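Here are some other commands that you may find useful:
```shell
# list running containers
docker ps
# open a shell in a running container (CONTAINER_ID is a placeholder; copy the ID from docker ps)
docker exec -it CONTAINER_ID bash
# start a jupyter notebook server in the container
jupyter notebook --ip 0.0.0.0 --no-browser --allow-root
```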
We've created a project template to help you with the boilerplate code that we usually have to write in any typical project.
To reduce incidental complexity, we used a simple dataset (Boston housing prices) to train a simple linear regression model. Replace the (i) data, (ii) data preprocessing code and (iii) model specification for your use case.
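To make "a simple linear regression model" concrete, here is a dependency-free sketch of fitting a single-feature least-squares line. It is not the template's `src/train.py`; the function names and the toy data are made up for illustration, and a real project would normally use a library and more than one feature.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimising squared error for one feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Toy data (made up): y = 2x + 1
rooms = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]
model = fit_simple_linear_regression(rooms, prices)
print(predict(model, 5.0))  # prints 11.0
```

Swapping in your own data, preprocessing and model then amounts to replacing the toy lists and the fit function while keeping the same train/predict shape.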
This is the project structure:
```sh
.
├── Dockerfile
├── README.md
├── requirements-dev.txt      # specify dev dependencies (e.g. jupyter) here
├── requirements.txt          # specify app dependencies here
├── ci.gocd.yaml              # specify your CI pipeline here
├── src                       # place your code here
│   ├── app.py
│   ├── app_with_logging.py
│   ├── tests                 # place your tests here
│   │   ├── test.py
│   │   └── test_model_metrics.py
│   ├── settings.py           # define environment variables here
│   └── train.py
├── bin                       # store shell scripts here
│   ├── color_my_terminal.sh
│   ├── configure_venv_locally.sh
│   ├── predict.sh
│   ├── start_server.sh
│   ├── test.sh
│   ├── test_model_metrics.sh
│   └── train_model.sh
├── docs
│   ├── FAQs.md
│   └── mlflow.md
└── models                    # serialize stuff here
    ├── _keep
    ├── column_order.joblib
    └── model.joblib
```
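The "automated tests for model metrics" idea can be sketched as a unit test that fails when a metric crosses a threshold. The example below is an illustration only and not the contents of `test_model_metrics.py`: the labels, predictions and thresholds are hypothetical, and in a real project the predictions would come from the serialized model and a validation set.

```python
import math
import unittest


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


class TestModelMetrics(unittest.TestCase):
    # Hypothetical hold-out labels and predictions, for illustration only.
    y_true = [3.0, 5.0, 7.0, 9.0]
    y_pred = [2.8, 5.1, 7.2, 8.9]

    def test_rmse_is_below_threshold(self):
        self.assertLess(rmse(self.y_true, self.y_pred), 0.5)

    def test_r_squared_is_above_threshold(self):
        self.assertGreater(r_squared(self.y_true, self.y_pred), 0.9)
```

A test like this can be picked up by `nosetests` or `python -m unittest`, so a CI pipeline can block a model whose metrics regress.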
For logging, `app_with_logging.py` contains the code for logging (i) inputs to the model, (ii) model outputs and (iii) LIME metrics. You can refer to this file to send logs to Elasticsearch using Fluentd. To keep the main app simple and accessible to people who may not be familiar with these technologies, we've kept this code in a separate file, `app_with_logging.py`, for reference.
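As a rough illustration of logging model inputs and outputs, the sketch below writes one JSON line per prediction to stdout using only the standard library. It is not the code in `app_with_logging.py`: the logger name, field names and feature names are hypothetical, LIME metrics are left out, and shipping the lines to Elasticsearch would depend on how your log collector is configured.

```python
import json
import logging
import sys


def build_logger(name="ml-app"):
    """Create a logger that writes bare messages to stdout."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on re-import
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def format_prediction_event(features, prediction):
    """Serialize one prediction as a single JSON line (field names are made up)."""
    event = {"event": "prediction", "inputs": features, "output": prediction}
    return json.dumps(event, sort_keys=True)


logger = build_logger()
logger.info(format_prediction_event({"rooms": 6.0, "age": 42.0}, 24.3))
```

One JSON object per line keeps the logs machine-parseable, which is usually what log shippers and search indexes expect.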
Please refer to FAQs for instructions on how to configure VS Code or PyCharm to give you intellisense and auto-complete suggestions as you code.
To provision the infrastructure used in this repo (e.g. GoCD, MLflow, EFK), please check out the ml-cd-starter-kit repo and follow the instructions in the README.
When you're done setting up the infrastructure, do the following:
- in `src/settings.py`, update the IP addresses with those of your own infrastructure.
- in `ci.gocd.yaml`, replace `davified/ml-app-template` with `YOUR_USERNAME/YOUR_IMAGE_NAME`
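One way to keep infrastructure addresses out of the code is to read them from environment variables with local defaults. The sketch below shows that pattern; it is not the template's `src/settings.py`. `SHOULD_USE_MLFLOW` is the variable used in the training command earlier, while `MLFLOW_TRACKING_URL`, its default value and the default of `True` are assumptions for illustration.

```python
import os


def env_flag(name, default=False, environ=None):
    """Interpret an environment variable as a boolean."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Default of True is an assumption; set SHOULD_USE_MLFLOW=false to disable.
SHOULD_USE_MLFLOW = env_flag("SHOULD_USE_MLFLOW", default=True)

# Hypothetical variable name and placeholder address; replace with your own.
MLFLOW_TRACKING_URL = os.environ.get("MLFLOW_TRACKING_URL", "http://localhost:5000")
```

With this shape, moving from a local setup to your own infrastructure means changing environment variables rather than editing source files.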
If you encounter any errors, please refer to FAQs for a list of common errors and how to fix them.
Bumps ipython from 7.5.0 to 7.16.3.
- d43c7c7 release 7.16.3
- 5fa1e40 Merge pull request from GHSA-pq7m-3gw7-gq5x
- 8df8971 back to dev
- 9f477b7 release 7.16.2
- 138f266 bring back release helper from master branch
- 5aa3634 Merge pull request #13341 from meeseeksmachine/auto-backport-of-pr-13335-on-7...
- bcae8e0 Backport PR #13335: What's new 7.16.2
- 8fcdcd3 Pin Jedi to <0.17.2.
- 2486838 release 7.16.1
- 20bdc6f fix conda build
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Bumps pillow from 6.0.0 to 8.3.2.
Sourced from pillow's releases.
8.3.2
https://pillow.readthedocs.io/en/stable/releasenotes/8.3.2.html
Security
CVE-2021-23437 Raise ValueError if color specifier is too long [hugovk, radarhere]
Fix 6-byte OOB read in FliDecode [wiredfool]
Python 3.10 wheels
Fixed regressions
Ensure TIFF `RowsPerStrip` is multiple of 8 for JPEG compression #5588 [kmilos, radarhere]
Updates for `ImagePalette` channel order #5599 [radarhere]
Hide FriBiDi shim symbols to avoid conflict with real FriBiDi library #5651 [nulano]
8.3.1
https://pillow.readthedocs.io/en/stable/releasenotes/8.3.1.html
Changes
- Catch OSError when checking if fp is sys.stdout #5585 [@radarhere]
- Handle removing orientation from alternate types of EXIF data #5584 [@radarhere]
- Make `Image.__array__` take optional dtype argument #5572 [@t-vi]
8.3.0
https://pillow.readthedocs.io/en/stable/releasenotes/8.3.0.html
Changes
- Use snprintf instead of sprintf #5567 [@radarhere]
- Limit TIFF strip size when saving with LibTIFF #5514 [@kmilos]
- Allow ICNS save on all operating systems #4526 [@newpanjing]
- De-zigzag JPEG's DQT when loading; deprecate convert_dict_qtables #4989 [@gofr]
- Do not use background or transparency index for new color #5564 [@radarhere]
- Simplified code #5315 [@radarhere]
- Replaced xml.etree.ElementTree #5565 [@radarhere]
... (truncated)
Sourced from pillow's changelog.
8.3.2 (2021-09-02)
CVE-2021-23437 Raise ValueError if color specifier is too long [hugovk, radarhere]
Fix 6-byte OOB read in FliDecode [wiredfool]
Add support for Python 3.10 #5569, #5570 [hugovk, radarhere]
Ensure TIFF `RowsPerStrip` is multiple of 8 for JPEG compression #5588 [kmilos, radarhere]
Updates for `ImagePalette` channel order #5599 [radarhere]
Hide FriBiDi shim symbols to avoid conflict with real FriBiDi library #5651 [nulano]
8.3.1 (2021-07-06)
Catch OSError when checking if fp is sys.stdout #5585 [radarhere]
Handle removing orientation from alternate types of EXIF data #5584 [radarhere]
Make `Image.__array__` take optional dtype argument #5572 [t-vi, radarhere]
8.3.0 (2021-07-01)
Use snprintf instead of sprintf. CVE-2021-34552 #5567 [radarhere]
Limit TIFF strip size when saving with LibTIFF #5514 [kmilos]
Allow ICNS save on all operating systems #4526 [baletu, radarhere, newpanjing, hugovk]
De-zigzag JPEG's DQT when loading; deprecate convert_dict_qtables #4989 [gofr, radarhere]
Replaced xml.etree.ElementTree #5565 [radarhere]
... (truncated)
- 8013f13 8.3.2 version bump
- 23c7ca8 Update CHANGES.rst
- 8450366 Update release notes
- a0afe89 Update test case
- 9e08eb8 Raise ValueError if color specifier is too long
- bd5cf7d FLI tests for Oss-fuzz crash.
- 94a0cf1 Fix 6-byte OOB read in FliDecode
- cece64f Add 8.3.2 (2021-09-02) [CI skip]
- e422386 Add release notes for Pillow 8.3.2
- 08dcbb8 Pillow 8.3.2 supports Python 3.10 [ci skip]
Bumps urllib3 from 1.25.3 to 1.26.5.
Sourced from urllib3's releases.
1.26.5
:warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap
- Fixed deprecation warnings emitted in Python 3.10.
- Updated vendored `six` library to 1.16.0.
- Improved performance of URL parser when splitting the authority component.
If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors
1.26.4
- Changed behavior of the default `SSLContext` when connecting to HTTPS proxy during HTTPS requests. The default `SSLContext` now sets `check_hostname=True`.
1.26.3
- Fixed bytes and string comparison issue with headers (Pull #2141)
- Changed `ProxySchemeUnknown` error message to be more actionable if the user supplies a proxy URL without a scheme (Pull #2107)
1.26.2
- Fixed an issue where `wrap_socket` and `CERT_REQUIRED` wouldn't be imported properly on Python 2.7.8 and earlier (Pull #2052)
1.26.1
- Fixed an issue where two `User-Agent` headers would be sent if a `User-Agent` header key is passed as `bytes` (Pull #2047)
1.26.0
- Added support for HTTPS proxies contacting HTTPS servers (Pull #1923, Pull #1806)
- Deprecated negotiating TLSv1 and TLSv1.1 by default. Users that still wish to use TLS earlier than 1.2 without a deprecation warning should opt-in explicitly by setting `ssl_version=ssl.PROTOCOL_TLSv1_1` (Pull #2002). Starting in urllib3 v2.0: Connections that receive a `DeprecationWarning` will fail
- Deprecated `Retry` options `Retry.DEFAULT_METHOD_WHITELIST`, `Retry.DEFAULT_REDIRECT_HEADERS_BLACKLIST` and `Retry(method_whitelist=...)` in favor of `Retry.DEFAULT_ALLOWED_METHODS`, `Retry.DEFAULT_REMOVE_HEADERS_ON_REDIRECT`, and `Retry(allowed_methods=...)` (Pull #2000). Starting in urllib3 v2.0: Deprecated options will be removed
... (truncated)
Sourced from urllib3's changelog.
1.26.5 (2021-05-26)
- Fixed deprecation warnings emitted in Python 3.10.
- Updated vendored `six` library to 1.16.0.
- Improved performance of URL parser when splitting the authority component.
1.26.4 (2021-03-15)
- Changed behavior of the default `SSLContext` when connecting to HTTPS proxy during HTTPS requests. The default `SSLContext` now sets `check_hostname=True`.
1.26.3 (2021-01-26)
- Fixed bytes and string comparison issue with headers (Pull #2141)
- Changed `ProxySchemeUnknown` error message to be more actionable if the user supplies a proxy URL without a scheme. (Pull #2107)
1.26.2 (2020-11-12)
- Fixed an issue where `wrap_socket` and `CERT_REQUIRED` wouldn't be imported properly on Python 2.7.8 and earlier (Pull #2052)
1.26.1 (2020-11-11)
- Fixed an issue where two `User-Agent` headers would be sent if a `User-Agent` header key is passed as `bytes` (Pull #2047)
1.26.0 (2020-11-10)
NOTE: urllib3 v2.0 will drop support for Python 2. Read more in the [v2.0 Roadmap](https://urllib3.readthedocs.io/en/latest/v2-roadmap.html).
- Added support for HTTPS proxies contacting HTTPS servers (Pull #1923, Pull #1806)
- Deprecated negotiating TLSv1 and TLSv1.1 by default. Users that still wish to use TLS earlier than 1.2 without a deprecation warning
... (truncated)
- d161647 Release 1.26.5
- 2d4a3fe Improve performance of sub-authority splitting in URL
- 2698537 Update vendored six to 1.16.0
- 07bed79 Fix deprecation warnings for Python 3.10 ssl module
- d725a9b Add Python 3.10 to GitHub Actions
- 339ad34 Use pytest==6.2.4 on Python 3.10+
- f271c9c Apply latest Black formatting
- 1884878 [1.26] Properly proxy EOF on the SSLTransport test suite
- a891304 Release 1.26.4
- 8d65ea1 Merge pull request from GHSA-5phf-pp7p-vc2r
Bumps pygments from 2.4.2 to 2.7.4.
Sourced from pygments's releases.
2.7.4
Updated lexers:
Fix infinite loop in SML lexer (#1625)
Fix backtracking string regexes in JavaScript/TypeScript, Modula2 and many other lexers (#1637)
Limit recursion with nesting Ruby heredocs (#1638)
Fix a few inefficient regexes for guessing lexers
Fix the raw token lexer handling of Unicode (#1616)
Revert a private API change in the HTML formatter (#1655) -- please note that private APIs remain subject to change!
Fix several exponential/cubic-complexity regexes found by Ben Caller/Doyensec (#1675)
Fix incorrect MATLAB example (#1582)
Thanks to Google's OSS-Fuzz project for finding many of these bugs.
2.7.3
... (truncated)
Sourced from pygments's changelog.
Version 2.7.4
(released January 12, 2021)
Updated lexers:
Fix infinite loop in SML lexer (#1625)
Fix backtracking string regexes in JavaScript/TypeScript, Modula2 and many other lexers (#1637)
Limit recursion with nesting Ruby heredocs (#1638)
Fix a few inefficient regexes for guessing lexers
Fix the raw token lexer handling of Unicode (#1616)
Revert a private API change in the HTML formatter (#1655) -- please note that private APIs remain subject to change!
Fix several exponential/cubic-complexity regexes found by Ben Caller/Doyensec (#1675)
Fix incorrect MATLAB example (#1582)
Thanks to Google's OSS-Fuzz project for finding many of these bugs.
Version 2.7.3
(released December 6, 2020)
... (truncated)
- 4d555d0 Bump version to 2.7.4.
- fc3b05d Update CHANGES.
- ad21935 Revert "Added dracula theme style (#1636)"
- e411506 Prepare for 2.7.4 release.
- 275e34d doc: remove Perl 6 ref
- 2e7e8c4 Fix several exponential/cubic complexity regexes found by Ben Caller/Doyensec
- eb39c43 xquery: fix pop from empty stack
- 2738778 fix coding style in test_analyzer_lexer
- 02e0f09 Added 'ERROR STOP' to fortran.py keywords. (#1665)
- c83fe48 support added for css variables (#1633)
Bumps pyyaml from 5.1 to 5.4.
Sourced from pyyaml's changelog.
5.4 (2021-01-19)
- yaml/pyyaml#407 -- Build modernization, remove distutils, fix metadata, build wheels, CI to GHA
- yaml/pyyaml#472 -- Fix for CVE-2020-14343, moves arbitrary python tags to UnsafeLoader
- yaml/pyyaml#441 -- Fix memory leak in implicit resolver setup
- yaml/pyyaml#392 -- Fix py2 copy support for timezone objects
- yaml/pyyaml#378 -- Fix compatibility with Jython
5.3.1 (2020-03-18)
- yaml/pyyaml#386 -- Prevents arbitrary code execution during python/object/new constructor
5.3 (2020-01-06)
- yaml/pyyaml#290 -- Use `is` instead of equality for comparing with `None`
- yaml/pyyaml#270 -- Fix typos and stylistic nit
- yaml/pyyaml#309 -- Fix up small typo
- yaml/pyyaml#161 -- Fix handling of slots
- yaml/pyyaml#358 -- Allow calling add_multi_constructor with None
- yaml/pyyaml#285 -- Add use of safe_load() function in README
- yaml/pyyaml#351 -- Fix reader for Unicode code points over 0xFFFF
- yaml/pyyaml#360 -- Enable certain unicode tests when maxunicode not > 0xffff
- yaml/pyyaml#359 -- Use full_load in yaml-highlight example
- yaml/pyyaml#244 -- Document that PyYAML is implemented with Cython
- yaml/pyyaml#329 -- Fix for Python 3.10
- yaml/pyyaml#310 -- Increase size of index, line, and column fields
- yaml/pyyaml#260 -- Remove some unused imports
- yaml/pyyaml#163 -- Create timezone-aware datetimes when parsed as such
- yaml/pyyaml#363 -- Add tests for timezone
5.2 (2019-12-02)
- Repair incompatibilities introduced with 5.1. The default Loader was changed, but several methods like add_constructor still used the old default
  - yaml/pyyaml#279 -- A more flexible fix for custom tag constructors
  - yaml/pyyaml#287 -- Change default loader for yaml.add_constructor
  - yaml/pyyaml#305 -- Change default loader for add_implicit_resolver, add_path_resolver
- Make FullLoader safer by removing python/object/apply from the default FullLoader
  - yaml/pyyaml#347 -- Move constructor for object/apply to UnsafeConstructor
- Fix bug introduced in 5.1 where quoting went wrong on systems with sys.maxunicode <= 0xffff
  - yaml/pyyaml#276 -- Fix logic for quoting special characters
- Other PRs:
  - yaml/pyyaml#280 -- Update CHANGES for 5.1
5.1.2 (2019-07-30)
- Re-release of 5.1 with regenerated Cython sources to build properly for Python 3.8b2+
... (truncated)
- 58d0cb7 5.4 release
- a60f7a1 Fix compatibility with Jython
- ee98abd Run CI on PR base branch changes
- ddf2033 constructor.timezone: _copy & deepcopy
- fc914d5 Avoid repeatedly appending to yaml_implicit_resolvers
- a001f27 Fix for CVE-2020-14343
- fe15062 Add 3.9 to appveyor file for completeness sake
- 1e1c7fb Add a newline character to end of pyproject.toml
- 0b6b7d6 Start sentences and phrases for capital letters
- c976915 Shell code improvements
Bumps bleach from 3.1.0 to 3.3.0.
Sourced from bleach's changelog.
Version 3.3.0 (February 1st, 2021)
Backwards incompatible changes
- clean escapes HTML comments even when strip_comments=False
Security fixes
- Fix bug 1621692 / GHSA-m6xf-fq7q-8743. See the advisory for details.
Features
None
Bug fixes
None
Version 3.2.3 (January 26th, 2021)
Security fixes
None
Features
None
Bug fixes
- fix clean and linkify raising ValueErrors for certain inputs. Thank you @Google-Autofuzz.
Version 3.2.2 (January 20th, 2021)
Security fixes
None
Features
- Migrate CI to Github Actions. Thank you @hugovk.
Bug fixes
- fix linkify raising an IndexError on certain inputs. Thank you @Google-Autofuzz.
Version 3.2.1 (September 18th, 2020)
... (truncated)
- 79b7a3c Merge pull request from GHSA-vv2x-vrpj-qqpq
- 842fcb4 Update for v3.3.0 release
- 1334134 sanitizer: escape HTML comments
- c045a8b Merge pull request #581 from mozilla/nit-fixes
- 491abb0 fix typo s/vnedoring/vendoring/
- 10b1c5d vendor: add html5lib-1.1.dist-info/REQUESTED
- cd838c3 Merge pull request #579 from mozilla/validate-convert-entity-code-points
- 612b808 Update for v3.2.3 release
- 6879f6a html5lib_shim: validate unicode points for convert_entity
- 90cb80b Update for v3.2.2 release
This is our sandbox GitHub organization for short-term experiments. For our main Github organisation, please visit @thoughtworks.