A set of tools and scripts that download and process blockchain and cryptocurrency course (exchange rate) data, generate a dataset, use it to train a deep neural network to make value predictions, and evaluate the result.
The project implements the theoretical and experimental setup of a paper, which is currently undergoing peer review.
The tools require the Parity client, Node.js, Python 3, Pipenv and, optionally, MongoDB.
The project includes optimized C++ code. The GCC compiler and the Pybind11 library are required to compile the C++ parts of the project.
Clone the git repository and install the node dependencies:
```bash
git clone https://github.com/Zvezdin/blockchain-predictor.git
cd blockchain-predictor
npm install
```
Install the required Python dependencies with the following command:
```bash
pipenv install
```
Run the script `build.sh` in the `c++` folder.
Run `pipenv shell` before using the project.
All Python tools implement a CLI with a help page, which can be displayed by running `python something.py -h`.
Run a Parity instance with the `--tracing on` flag. A possible configuration is:
```bash
parity -d /some/where --tracing on --mode active --cache-size 16384 --force-sealing --allow-ips public --min-peers 50 --max-peers 100 --jsonrpc-threads 10
```
The initial sync can take multiple hours. Wait for full sync before proceeding.
There are multiple options for a data store as a backend. The available options are defined in `database/`. By default, `hdfs_store_database.py` is used, so no database instance needs to be started. The file path to the h5 store file is defined in that database file (for now).
If you want to use `arctic_store_database.py` instead, you first have to run an instance of MongoDB with:
```bash
mongod --dbpath /path/to/your/db
```
The blockchain information needs to be downloaded from the running parity client to the database. This is done using:
```bash
python arcticdb.py --course
python arcticdb.py --blockchain
```
It may take a while depending on which database is used.
Data properties are the most important features extracted from the bulk raw data. They are generated for each course tick (the time interval for which course data is available).
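To make the idea of a per-tick property concrete, here is a minimal sketch. It is not the project's implementation: the `(timestamp, price)` sample format, the `tick_properties` helper and the definition of `openPrice` / `closePrice` as the first and last price in a tick are all assumptions made for illustration.

```python
def tick_properties(samples, tick_seconds):
    """Group (timestamp, price) samples into ticks and extract per-tick properties.

    Hypothetical helper: the input format and property definitions are assumed.
    """
    ticks = {}
    # Process samples in chronological order so "first" and "last" are meaningful.
    for timestamp, price in sorted(samples, key=lambda sample: sample[0]):
        tick_start = timestamp - timestamp % tick_seconds
        ticks.setdefault(tick_start, []).append(price)
    return [
        {"tick": start, "openPrice": prices[0], "closePrice": prices[-1]}
        for start, prices in ticks.items()
    ]


# Four raw samples, grouped into one-hour ticks.
samples = [(0, 10.0), (1800, 11.0), (3600, 12.0), (5400, 9.0)]
properties = tick_properties(samples, tick_seconds=3600)
```

Each resulting dictionary describes one tick, which is the granularity the rest of the pipeline works with.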
To generate all of the available properties for all downloaded data, run the following command:
```bash
python property-generator.py --action generate
```
To generate only specific properties for all downloaded data, pass them as a comma-separated list:
```bash
python property-generator.py --action generate --properties openPrice,closePrice
```
After the needed data properties are generated, you can generate the actual dataset. The dataset is generated using a dataset model. There are multiple dataset models, each of which "compiles" the properties and structures the dataset in a different way. The default is `matrix`, which generates matrices from a moving window over all of the properties.
Dataset generation requires a comma-separated list of properties to include in the body and a comma-separated list of properties (or a single property) to use as the target (expected output).
Example:
```bash
python dataset_generator.py openPrice,closePrice stickPrice --filename some/where/dataset.pickle
```
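The moving-window idea behind the `matrix` model can be sketched as follows. This is an illustration under assumptions, not the model's actual code: the `moving_window_dataset` helper, the window length, the choice of the next tick as the prediction target and the use of `closePrice` as the target are all hypothetical.

```python
def moving_window_dataset(properties, body_names, target_name, window):
    """Build (matrix, target) samples from per-tick property values.

    properties: dict mapping a property name to a list with one value per tick.
    Each matrix has `window` rows (ticks) and one column per body property.
    """
    n_ticks = len(properties[target_name])
    bodies, targets = [], []
    # Assumption: each sample covers `window` ticks and targets the tick right after.
    for start in range(n_ticks - window):
        end = start + window
        matrix = [
            [properties[name][tick] for name in body_names]
            for tick in range(start, end)
        ]
        bodies.append(matrix)
        targets.append(properties[target_name][end])
    return bodies, targets


properties = {
    "openPrice": [1.0, 2.0, 3.0, 4.0, 5.0],
    "closePrice": [2.0, 3.0, 4.0, 5.0, 6.0],
}
bodies, targets = moving_window_dataset(
    properties, ["openPrice", "closePrice"], "closePrice", window=3
)
```

With five ticks and a window of three, the sketch yields two samples, each a 3x2 matrix paired with a single target value.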
The `--start` and `--end` arguments can be used to trim the dataset:
```bash
python dataset_generator.py openPrice,closePrice stickPrice --start 2017-03-14-03 --end 2017-07-03-21
```
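The trimming step could look roughly like the sketch below. The `YYYY-MM-DD-HH` date format is inferred from the example values above, and both the `trim` helper and the inclusive treatment of the boundaries are assumptions rather than documented behavior.

```python
from datetime import datetime

# Assumed format, inferred from values such as "2017-03-14-03".
DATE_FORMAT = "%Y-%m-%d-%H"


def trim(rows, start=None, end=None):
    """Keep only (timestamp, value) rows between start and end (both inclusive)."""
    start_dt = datetime.strptime(start, DATE_FORMAT) if start else None
    end_dt = datetime.strptime(end, DATE_FORMAT) if end else None
    return [
        (timestamp, value)
        for timestamp, value in rows
        if (start_dt is None or timestamp >= start_dt)
        and (end_dt is None or timestamp <= end_dt)
    ]


rows = [
    (datetime(2017, 3, 14, 2), 1.0),
    (datetime(2017, 3, 14, 3), 2.0),
    (datetime(2017, 7, 3, 21), 3.0),
    (datetime(2017, 7, 3, 22), 4.0),
]
trimmed = trim(rows, start="2017-03-14-03", end="2017-07-03-21")
```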
In most cases when training neural networks, we need two or three datasets: a `train`, a `validation` (optional) and a `test` dataset. These datasets can be generated using separate calls to our `dataset_generator` for different dates, but we recommend using one date interval that covers all of the data and then splitting the resulting dataset into the needed parts. In our tool, this is done the following way:
```bash
python dataset_generator.py openPrice,closePrice stickPrice --ratio 6:2:2
```
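A ratio such as `6:2:2` can be turned into consecutive slices of the dataset, as in this minimal sketch. The `split_by_ratio` helper is hypothetical, and splitting in chronological order without shuffling is an assumption about how the parts are cut.

```python
def split_by_ratio(samples, ratio):
    """Split samples into consecutive parts sized according to a ratio like "6:2:2"."""
    parts = [int(part) for part in ratio.split(":")]
    total = sum(parts)
    splits, begin, cumulative = [], 0, 0
    for part in parts:
        cumulative += part
        # Integer arithmetic keeps the boundaries exact and covers every sample.
        end = len(samples) * cumulative // total
        splits.append(samples[begin:end])
        begin = end
    return splits


train, validation, test = split_by_ratio(list(range(10)), "6:2:2")
```

Keeping the parts consecutive means the test data is always more recent than the training data, which is a common choice for time series.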
Please keep in mind that the `matrix` model has dozens of hyperparameters that have been tuned for most cases. If your case differs, you need to change them in the source code of the matrix model.
The generated dataset can be used to train neural networks. The supported networks depend on the chosen dataset model. The `matrix` model supports all networks.
To train our convolutional network on an already generated dataset and also shuffle the train dataset, we can do the following:
```bash
python neural_trainer.py path/to/your/dataset.pickle --models CONV --shuffle
```
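When shuffling a train dataset, the inputs and their targets have to be reordered together so that each sample keeps its expected output. A minimal sketch of that idea, with a hypothetical `shuffle_together` helper (how `--shuffle` is implemented in the trainer is not shown here):

```python
import random


def shuffle_together(bodies, targets, seed=None):
    """Shuffle inputs and targets with the same permutation."""
    order = list(range(len(bodies)))
    # A dedicated Random instance makes the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(order)
    return [bodies[i] for i in order], [targets[i] for i in order]


bodies = [[1.0], [2.0], [3.0], [4.0]]
targets = [10.0, 20.0, 30.0, 40.0]
shuffled_bodies, shuffled_targets = shuffle_together(bodies, targets, seed=0)
```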
Training a neural network can't be that simple, right? Right! You ~~can~~ should override the default network hyperparameters to suit your dataset and problem needs. This can be done via:
```bash
python neural_trainer.py data/test.pickle --models CONV --args epoch=5,batch=1,lr=0.0001,kernel=3
```
This example sets the number of training epochs, the batch size, learning rate and kernel size for the whole convolutional network. Each network architecture has its own set of hyperparameters and they are defined with the network specification itself.
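One way to picture how a `key=value` list overrides defaults is the sketch below. The `parse_overrides` helper and the default values are hypothetical and only illustrate the mechanism; the real defaults live with each network specification.

```python
def parse_overrides(text):
    """Parse "epoch=5,lr=0.0001" into a dict, converting numbers where possible."""
    overrides = {}
    for pair in text.split(","):
        key, _, raw = pair.partition("=")
        try:
            value = int(raw)
        except ValueError:
            try:
                value = float(raw)
            except ValueError:
                value = raw  # keep non-numeric values as strings
        overrides[key.strip()] = value
    return overrides


# Hypothetical defaults, for illustration only.
defaults = {"epoch": 10, "batch": 32, "lr": 0.001, "kernel": 5}
hyperparameters = {**defaults, **parse_overrides("epoch=5,batch=1,lr=0.0001,kernel=3")}
```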
After training, the network's performance is evaluated on the `test` dataset and measured by 4+ different accuracy/error scores. The performance on the train and test datasets is also visualized on a graph in a new window. If you do not want training to be blocked by a graph window, pass the `--quiet` parameter to save the graph to a file instead. This is useful for automated training of multiple networks, as it allows you to review the results afterwards.
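As an illustration of what such scores can look like, the sketch below computes three common ones: mean absolute error, root mean squared error and the share of ticks where the predicted direction of change was correct. These are examples of typical metrics and an assumption on our part here, not a list of the scores the trainer reports; the `evaluate` helper is hypothetical.

```python
import math


def evaluate(actual, predicted):
    """Compute a few common error/accuracy scores for a predicted series."""
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # Direction is judged relative to the previous actual value.
    hits = sum(
        (actual[i] - actual[i - 1]) * (predicted[i] - actual[i - 1]) > 0
        for i in range(1, len(actual))
    )
    return {"mae": mae, "rmse": rmse, "sign_accuracy": hits / (len(actual) - 1)}


scores = evaluate([1.0, 2.0, 3.0, 2.0], [1.0, 2.5, 2.5, 2.5])
```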
Our other neural models include `CustomDeep` and `LSTM`, with more to come.
If needed, this project also provides a low-level tool that can download data from a crypto exchange or a blockchain node and save it as a .json file in a given directory (set by the `--filename some/where` argument).
To download and save course data for the whole history of the cryptocurrency, run:
```bash
node data-downloader.js --course
```
To download blocks 10 through 100, use:
```bash
node data-downloader.js --blockchain 10 100
```
This release contains the initial batch of experiments: the ones included in the manuscript submitted for peer review. Attached is a zip file containing the model weights, architectures, detailed training histories and performance visualizations for each experiment.
More information on the naming scheme and handling of this data is available on the experiments page.