This is the code complementing the paper Decentralized Reinforcment Learning: Global Decision-Making via Local Economic Transactions by Michael Chang, Sid Kaushik, Matt Weinberg, Tom Griffiths, and Sergey Levine, accepted to the International Conference on Machine Learning, 2020.
Check out the accompanying blog post.
Set the PYTHONPATH: export PYTHONPATH=$PWD
.
Create a conda environment with python version 3.6.
Install dependencies: pip install -r requirements.txt
. This should also install babyai==0.1.0
from https://github.com/sidk99/babyai.git
and gym-minigrid==1.0.1
.
For the TwoRooms environment, comment out
if self.step_count >= self.max_steps:
done = True
in gym_minigrid/minigrid.py
in your gym-minigrid installation. By handling time-outs on the algorithm side rather than the environment side, we can treat the environment as an infinite-horizon problem. Otherwise, we'd have to put the time-step into the state to preserve the Markov property.
For GPU, set OMP_NUM_THREADS to 1: export OMP_NUM_THREADS=1
.
Run python runner.py --<experiment-name>
to print out example commands for the environments in the paper. Add the --for-real
flag to run those commands. You can enable parallel data collection with the --parallel_collect
flag. You can also specify the gpu ids. As examples, in runner.py
, the methods that launch bandit
, chain
, and duality
do not use gpu while the others use gpu 0.
For the TwoRooms environment, you would need to pre-train the subpolicies first. Then you would need to specify the expriment folders for training the society using the pre-trained primitives. Instructions are in run_tworooms_pretrain_task
and run_tworooms_transfer_task
of runner.py
.
You can view the training curves in <exp_folder>/<seed_folder>/group_0/<env-name>_train/quantitative
and you can view visualizations (for environments that have image observations) in <exp_folder>/<seed_folder>/group_0/<env-name>_test/qualitative
.
The PPO update is based on this repo.
Bumps certifi from 2020.6.20 to 2022.12.7.
9e9e840
2022.12.07b81bdb2
2022.09.24939a28f
2022.09.14aca828a
2022.06.15.2de0eae1
Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ...b8eb5e9
2022.06.15.147fb7ab
Fix deprecation warning on Python 3.11 (#199)b0b48e0
fixes #198 -- update link in license9d514b4
2022.06.154151e88
Add py.typed to MANIFEST.in to package in sdist (#196)Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase
.
Bumps pillow from 7.2.0 to 9.3.0.
Sourced from pillow's releases.
9.3.0
https://pillow.readthedocs.io/en/stable/releasenotes/9.3.0.html
Changes
- Initialize libtiff buffer when saving #6699 [
@radarhere
]- Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [
@wiredfool
]- Inline fname2char to fix memory leak #6329 [
@nulano
]- Fix memory leaks related to text features #6330 [
@nulano
]- Use double quotes for version check on old CPython on Windows #6695 [
@hugovk
]- GHA: replace deprecated set-output command with GITHUB_OUTPUT file #6697 [
@nulano
]- Remove backup implementation of Round for Windows platforms #6693 [
@cgohlke
]- Upload fribidi.dll to GitHub Actions #6532 [
@nulano
]- Fixed set_variation_by_name offset #6445 [
@radarhere
]- Windows build improvements #6562 [
@nulano
]- Fix malloc in _imagingft.c:font_setvaraxes #6690 [
@cgohlke
]- Only use ASCII characters in C source file #6691 [
@cgohlke
]- Release Python GIL when converting images using matrix operations #6418 [
@hmaarrfk
]- Added ExifTags enums #6630 [
@radarhere
]- Do not modify previous frame when calculating delta in PNG #6683 [
@radarhere
]- Added support for reading BMP images with RLE4 compression #6674 [
@npjg
]- Decode JPEG compressed BLP1 data in original mode #6678 [
@radarhere
]- pylint warnings #6659 [
@marksmayo
]- Added GPS TIFF tag info #6661 [
@radarhere
]- Added conversion between RGB/RGBA/RGBX and LAB #6647 [
@radarhere
]- Do not attempt normalization if mode is already normal #6644 [
@radarhere
]- Fixed seeking to an L frame in a GIF #6576 [
@radarhere
]- Consider all frames when selecting mode for PNG save_all #6610 [
@radarhere
]- Don't reassign crc on ChunkStream close #6627 [
@radarhere
]- Raise a warning if NumPy failed to raise an error during conversion #6594 [
@radarhere
]- Only read a maximum of 100 bytes at a time in IMT header #6623 [
@radarhere
]- Show all frames in ImageShow #6611 [
@radarhere
]- Allow FLI palette chunk to not be first #6626 [
@radarhere
]- If first GIF frame has transparency for RGB_ALWAYS loading strategy, use RGBA mode #6592 [
@radarhere
]- Round box position to integer when pasting embedded color #6517 [
@radarhere
]- Removed EXIF prefix when saving WebP #6582 [
@radarhere
]- Pad IM palette to 768 bytes when saving #6579 [
@radarhere
]- Added DDS BC6H reading #6449 [
@ShadelessFox
]- Added support for opening WhiteIsZero 16-bit integer TIFF images #6642 [
@JayWiz
]- Raise an error when allocating translucent color to RGB palette #6654 [
@jsbueno
]- Moved mode check outside of loops #6650 [
@radarhere
]- Added reading of TIFF child images #6569 [
@radarhere
]- Improved ImageOps palette handling #6596 [
@PososikTeam
]- Defer parsing of palette into colors #6567 [
@radarhere
]- Apply transparency to P images in ImageTk.PhotoImage #6559 [
@radarhere
]- Use rounding in ImageOps contain() and pad() #6522 [
@bibinhashley
]- Fixed GIF remapping to palette with duplicate entries #6548 [
@radarhere
]- Allow remap_palette() to return an image with less than 256 palette entries #6543 [
@radarhere
]- Corrected BMP and TGA palette size when saving #6500 [
@radarhere
]
... (truncated)
Sourced from pillow's changelog.
9.3.0 (2022-10-29)
Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [wiredfool]
Initialize libtiff buffer when saving #6699 [radarhere]
Inline fname2char to fix memory leak #6329 [nulano]
Fix memory leaks related to text features #6330 [nulano]
Use double quotes for version check on old CPython on Windows #6695 [hugovk]
Remove backup implementation of Round for Windows platforms #6693 [cgohlke]
Fixed set_variation_by_name offset #6445 [radarhere]
Fix malloc in _imagingft.c:font_setvaraxes #6690 [cgohlke]
Release Python GIL when converting images using matrix operations #6418 [hmaarrfk]
Added ExifTags enums #6630 [radarhere]
Do not modify previous frame when calculating delta in PNG #6683 [radarhere]
Added support for reading BMP images with RLE4 compression #6674 [npjg, radarhere]
Decode JPEG compressed BLP1 data in original mode #6678 [radarhere]
Added GPS TIFF tag info #6661 [radarhere]
Added conversion between RGB/RGBA/RGBX and LAB #6647 [radarhere]
Do not attempt normalization if mode is already normal #6644 [radarhere]
... (truncated)
d594f4c
Update CHANGES.rst [ci skip]909dc64
9.3.0 version bump1a51ce7
Merge pull request #6699 from hugovk/security-libtiff_buffer2444cdd
Merge pull request #6700 from hugovk/security-samples_per_pixel-sec744f455
Added release notes0846bfa
Add to release notes799a6a0
Fix linting00b25fd
Hide UserWarning in logs05b175e
Tighter test case13f2c5a
Prevent DOS with large SAMPLESPERPIXEL in Tiff IFDDependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase
.
Bumps ujson from 3.1.0 to 5.4.0.
Sourced from ujson's releases.
5.4.0
Added
- Add support for arbitrary size integers (#548)
@JustAnotherArchivist
Fixed
- CVE-2022-31116:
- Replace
wchar_t
string decoding implementation with auint32_t
-based one (#555)@JustAnotherArchivist
- Fix handling of surrogates on decoding (#550)
@JustAnotherArchivist
- CVE-2022-31117: Potential double free of buffer during string decoding
@JustAnotherArchivist
- Fix memory leak on encoding errors when the buffer was resized (#549)
@JustAnotherArchivist
- Integer parsing: always detect overflows (#544)
@NaN-git
- Fix handling of surrogates on encoding (#530)
@JustAnotherArchivist
5.3.0
Added
Changed
- Benchmark refactor - argparse CLI (#533)
@Erotemic
Fixed
- Fix segmentation faults when errors occur while handling unserialisable objects (#531)
@JustAnotherArchivist
- Fix segmentation fault when an exception is raised while converting a dict key to a string (#526)
@JustAnotherArchivist
- Fix memory leak dumping on non-string dict keys (#521)
@JustAnotherArchivist
- Fix ref counting on repeated default function calls (#524)
@JustAnotherArchivist
- Remove redundant
wheel
dependency frompyproject.toml
(#535)@hugovk
5.2.0
Added
- Support parsing NaN, Infinity and -Infinity (#514)
@Erotemic
- Support dynamically linking against system double-conversion library (#508)
@musicinmybrain
- Add env var to control stripping debug info (#507)
@musicinmybrain
- Add
JSONDecodeError
(#498)@JustAnotherArchivist
Fixed
- Fix buffer overflows (CVE-2021-45958) (#519)
@JustAnotherArchivist
- Upgrade Black to fix Click (#515)
@hugovk
- simplify exception handling on integer overflow (#510)
@RouquinBlanc
- Remove dead code that used to handle the separate int type in Python 2 (#509)
@JustAnotherArchivist
- Fix exceptions on encoding list or dict elements and non-overflow errors on int handling getting silenced (#505)
@JustAnotherArchivist
5.1.0
Changed
... (truncated)
9c20de0
Merge pull request from GHSA-fm67-cv37-96ffb21da40
Fix double free on string decoding if realloc fails67ec071
Merge pull request #555 from JustAnotherArchivist/fix-decode-surrogates-2bc7bdff
Replace wchar_t string decoding implementation with a uint32_t-based onecc70119
Merge pull request #548 from JustAnotherArchivist/arbitrary-ints4b5cccc
Merge pull request #553 from bwoodsend/pypy-ciabe26fc
Merge pull request #551 from bwoodsend/bye-bye-travis3efb5cc
Delete old TravisCI workflow and references.404de1a
xfail test_decode_surrogate_characters() on Windows PyPy.f7e66dc
Switch to musl docker base images.Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase
.
Bumps numpy from 1.15.4 to 1.22.0.
Sourced from numpy's releases.
v1.22.0
NumPy 1.22.0 Release Notes
NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:
- Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
- A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
- NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
- New methods for
quantile
,percentile
, and related functions. The new methods provide a complete set of the methods commonly found in the literature.- A new configurable allocator for use by downstream projects.
These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.
The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.
Expired deprecations
Deprecated numeric style dtype strings have been removed
Using the strings
"Bytes0"
,"Datetime64"
,"Str0"
,"Uint32"
, and"Uint64"
as a dtype will now raise aTypeError
.(gh-19539)
Expired deprecations for
loads
,ndfromtxt
, andmafromtxt
in npyio
numpy.loads
was deprecated in v1.15, with the recommendation that users usepickle.loads
instead.ndfromtxt
andmafromtxt
were both deprecated in v1.17 - users should usenumpy.genfromtxt
instead with the appropriate value for theusemask
parameter.(gh-19615)
... (truncated)
4adc87d
Merge pull request #20685 from charris/prepare-for-1.22.0-releasefd66547
REL: Prepare for the NumPy 1.22.0 release.125304b
wipc283859
Merge pull request #20682 from charris/backport-204165399c03
Merge pull request #20681 from charris/backport-20954f9c45f8
Merge pull request #20680 from charris/backport-20663794b36f
Update armccompiler.pyd93b14e
Update test_public_api.py7662c07
Update init.py311ab52
Update armccompiler.pyDependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase
.
Bumps ipython from 7.13.0 to 7.16.3.
d43c7c7
release 7.16.35fa1e40
Merge pull request from GHSA-pq7m-3gw7-gq5x8df8971
back to dev9f477b7
release 7.16.2138f266
bring back release helper from master branch5aa3634
Merge pull request #13341 from meeseeksmachine/auto-backport-of-pr-13335-on-7...bcae8e0
Backport PR #13335: What's new 7.16.28fcdcd3
Pin Jedi to <0.17.2.2486838
release 7.16.120bdc6f
fix conda buildDependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase
.
Bumps pygments from 2.6.1 to 2.7.4.
Sourced from pygments's releases.
2.7.4
Updated lexers:
Fix infinite loop in SML lexer (#1625)
Fix backtracking string regexes in JavaScript/TypeScript, Modula2 and many other lexers (#1637)
Limit recursion with nesting Ruby heredocs (#1638)
Fix a few inefficient regexes for guessing lexers
Fix the raw token lexer handling of Unicode (#1616)
Revert a private API change in the HTML formatter (#1655) -- please note that private APIs remain subject to change!
Fix several exponential/cubic-complexity regexes found by Ben Caller/Doyensec (#1675)
Fix incorrect MATLAB example (#1582)
Thanks to Google's OSS-Fuzz project for finding many of these bugs.
2.7.3
... (truncated)
Sourced from pygments's changelog.
Version 2.7.4
(released January 12, 2021)
Updated lexers:
Fix infinite loop in SML lexer (#1625)
Fix backtracking string regexes in JavaScript/TypeScript, Modula2 and many other lexers (#1637)
Limit recursion with nesting Ruby heredocs (#1638)
Fix a few inefficient regexes for guessing lexers
Fix the raw token lexer handling of Unicode (#1616)
Revert a private API change in the HTML formatter (#1655) -- please note that private APIs remain subject to change!
Fix several exponential/cubic-complexity regexes found by Ben Caller/Doyensec (#1675)
Fix incorrect MATLAB example (#1582)
Thanks to Google's OSS-Fuzz project for finding many of these bugs.
Version 2.7.3
(released December 6, 2020)
... (truncated)
4d555d0
Bump version to 2.7.4.fc3b05d
Update CHANGES.ad21935
Revert "Added dracula theme style (#1636)"e411506
Prepare for 2.7.4 release.275e34d
doc: remove Perl 6 ref2e7e8c4
Fix several exponential/cubic complexity regexes found by Ben Caller/Doyenseceb39c43
xquery: fix pop from empty stack2738778
fix coding style in test_analyzer_lexer02e0f09
Added 'ERROR STOP' to fortran.py keywords. (#1665)c83fe48
support added for css variables (#1633)Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase
.
mechanism-design machine-learning deep-reinforcement-learning pytorch artificial-intelligence deep-learning