Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

mbchang, updated 2022-12-08 11:24:29

Decentralized Reinforcement Learning

MIT license

This is the code accompanying the paper Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions by Michael Chang, Sid Kaushik, Matt Weinberg, Tom Griffiths, and Sergey Levine, accepted to the International Conference on Machine Learning, 2020.

Check out the accompanying blog post.

Setup

Set the PYTHONPATH: export PYTHONPATH=$PWD.

Create a conda environment with Python 3.6.

Install dependencies: pip install -r requirements.txt. This should also install babyai==0.1.0 from https://github.com/sidk99/babyai.git and gym-minigrid==1.0.1.

For the TwoRooms environment, comment out if self.step_count >= self.max_steps: done = True in gym_minigrid/minigrid.py in your gym-minigrid installation. By handling time-outs on the algorithm side rather than the environment side, we can treat the environment as an infinite-horizon problem. Otherwise, we'd have to put the time-step into the state to preserve the Markov property.
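As a minimal sketch of what handling time-outs on the algorithm side can look like, the rollout loop below stops after a step budget without the environment ever reporting done. CountingEnv, collect_episode, and the policy are hypothetical names used for illustration only; they are not taken from this repository or from gym-minigrid.

```python
class CountingEnv:
    """Toy stand-in for an environment that never signals a time-out itself."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        obs, reward, done, info = self.t, 0.0, False, {}
        return obs, reward, done, info


def collect_episode(env, policy, max_steps):
    """Roll out one episode, cutting it off after max_steps on the algorithm side.

    Returns the transitions and a flag telling whether the episode was cut off
    by the step budget rather than ended by the environment.
    """
    obs = env.reset()
    transitions = []
    for _ in range(max_steps):
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        # `done` is only ever the environment's own terminal signal, so a
        # cut-off episode is not recorded as a true termination.
        transitions.append((obs, action, reward, next_obs, done))
        obs = next_obs
        if done:
            return transitions, False
    return transitions, True


transitions, truncated = collect_episode(CountingEnv(), policy=lambda obs: 0, max_steps=5)
print(len(transitions), truncated)
```

Because the cut-off never shows up as done in the stored transitions, the learner can keep treating the task as infinite-horizon, which is the motivation given above.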

For GPU, set OMP_NUM_THREADS to 1: export OMP_NUM_THREADS=1.
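The setup steps above can be sanity-checked with a small script. This is only a sketch: check_setup is a hypothetical helper, not part of the repository, and it assumes the repository root is the current working directory and appears in PYTHONPATH exactly as written.

```python
import os
import sys


def check_setup(environ, python_version, repo_root):
    """Return a list of human-readable setup problems; empty if none are found."""
    problems = []
    if tuple(python_version[:2]) != (3, 6):
        problems.append("expected Python 3.6, found %d.%d" % tuple(python_version[:2]))
    # PYTHONPATH may hold several entries separated by os.pathsep.
    pythonpath = environ.get("PYTHONPATH", "").split(os.pathsep)
    if repo_root not in pythonpath:
        problems.append("PYTHONPATH does not contain " + repo_root)
    if environ.get("OMP_NUM_THREADS") != "1":
        problems.append("OMP_NUM_THREADS is not set to 1 (only needed for GPU runs)")
    return problems


# Check the current process, treating the working directory as the repo root.
for problem in check_setup(os.environ, sys.version_info, os.getcwd()):
    print("warning:", problem)
```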

Training

Run python runner.py --<experiment-name> to print example commands for the environments in the paper. Add the --for-real flag to actually run those commands. Enable parallel data collection with the --parallel_collect flag. You can also specify the GPU ids: for example, in runner.py the methods that launch bandit, chain, and duality do not use a GPU, while the others use GPU 0.
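The print-first, run-on-request behavior described above is a common dry-run pattern. The sketch below illustrates it with argparse; it is not the actual runner.py, and launch, main, and the placeholder command are hypothetical.

```python
import argparse
import subprocess


def launch(commands, for_real):
    """Print each command; execute it only when for_real is True."""
    for command in commands:
        print(command)
        if for_real:
            subprocess.run(command, shell=True, check=True)
    return list(commands)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--for-real", action="store_true",
                        help="execute the printed commands instead of only printing them")
    args = parser.parse_args(argv)
    # Hypothetical placeholder; the real commands are generated in runner.py.
    commands = ["echo training-command-placeholder"]
    return launch(commands, for_real=args.for_real)


main([])  # dry run: prints the command and executes nothing
```

Defaulting to a dry run lets you inspect the generated commands before committing compute to them.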

For the TwoRooms environment, first pre-train the subpolicies. Then specify the experiment folders of those pre-trained primitives when training the society. Instructions are in run_tworooms_pretrain_task and run_tworooms_transfer_task of runner.py.

Visualization

You can view the training curves in <exp_folder>/<seed_folder>/group_0/<env-name>_train/quantitative and you can view visualizations (for environments that have image observations) in <exp_folder>/<seed_folder>/group_0/<env-name>_test/qualitative.
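As a small illustration of the folder layout above, the helper below builds both paths with pathlib. results_dirs and the example folder and environment names are hypothetical; only the path structure comes from the description above.

```python
from pathlib import Path


def results_dirs(exp_folder, seed_folder, env_name):
    """Return (training-curve dir, visualization dir) for one seed of one experiment."""
    group = Path(exp_folder) / seed_folder / "group_0"
    quantitative = group / (env_name + "_train") / "quantitative"
    qualitative = group / (env_name + "_test") / "qualitative"
    return quantitative, qualitative


# Hypothetical folder and environment names, for illustration only.
curves, visuals = results_dirs("runs/my_exp", "seed0", "MyEnv")
print(curves)
print(visuals)
```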

Credits

The PPO update is based on this repo.

Issues

Bump certifi from 2020.6.20 to 2022.12.7

opened on 2022-12-08 11:24:29 by dependabot[bot]

Bumps certifi from 2020.6.20 to 2022.12.7.

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
  • `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
  • `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
  • `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/mbchang/decentralized-rl/network/alerts).

Bump pillow from 7.2.0 to 9.3.0

opened on 2022-11-22 06:38:26 by dependabot[bot]

Bumps pillow from 7.2.0 to 9.3.0.

Release notes

Sourced from pillow's releases.

9.3.0

https://pillow.readthedocs.io/en/stable/releasenotes/9.3.0.html

Changes

... (truncated)

Changelog

Sourced from pillow's changelog.

9.3.0 (2022-10-29)

  • Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [wiredfool]

  • Initialize libtiff buffer when saving #6699 [radarhere]

  • Inline fname2char to fix memory leak #6329 [nulano]

  • Fix memory leaks related to text features #6330 [nulano]

  • Use double quotes for version check on old CPython on Windows #6695 [hugovk]

  • Remove backup implementation of Round for Windows platforms #6693 [cgohlke]

  • Fixed set_variation_by_name offset #6445 [radarhere]

  • Fix malloc in _imagingft.c:font_setvaraxes #6690 [cgohlke]

  • Release Python GIL when converting images using matrix operations #6418 [hmaarrfk]

  • Added ExifTags enums #6630 [radarhere]

  • Do not modify previous frame when calculating delta in PNG #6683 [radarhere]

  • Added support for reading BMP images with RLE4 compression #6674 [npjg, radarhere]

  • Decode JPEG compressed BLP1 data in original mode #6678 [radarhere]

  • Added GPS TIFF tag info #6661 [radarhere]

  • Added conversion between RGB/RGBA/RGBX and LAB #6647 [radarhere]

  • Do not attempt normalization if mode is already normal #6644 [radarhere]

... (truncated)

Bump ujson from 3.1.0 to 5.4.0

opened on 2022-07-05 21:54:55 by dependabot[bot]

Bumps ujson from 3.1.0 to 5.4.0.

Release notes

Sourced from ujson's releases.

5.4.0

Added

Fixed

5.3.0

Added

Changed

Fixed

5.2.0

Added

Fixed

5.1.0

Changed

... (truncated)

Commits
  • 9c20de0 Merge pull request from GHSA-fm67-cv37-96ff
  • b21da40 Fix double free on string decoding if realloc fails
  • 67ec071 Merge pull request #555 from JustAnotherArchivist/fix-decode-surrogates-2
  • bc7bdff Replace wchar_t string decoding implementation with a uint32_t-based one
  • cc70119 Merge pull request #548 from JustAnotherArchivist/arbitrary-ints
  • 4b5cccc Merge pull request #553 from bwoodsend/pypy-ci
  • abe26fc Merge pull request #551 from bwoodsend/bye-bye-travis
  • 3efb5cc Delete old TravisCI workflow and references.
  • 404de1a xfail test_decode_surrogate_characters() on Windows PyPy.
  • f7e66dc Switch to musl docker base images.
  • Additional commits viewable in compare view


Bump numpy from 1.15.4 to 1.22.0

opened on 2022-06-22 02:39:45 by dependabot[bot]

Bumps numpy from 1.15.4 to 1.22.0.

Release notes

Sourced from numpy's releases.

v1.22.0

NumPy 1.22.0 Release Notes

NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

  • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
  • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
  • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
  • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
  • A new configurable allocator for use by downstream projects.

These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

Expired deprecations

Deprecated numeric style dtype strings have been removed

Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

(gh-19539)

Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

(gh-19615)

... (truncated)

Bump ipython from 7.13.0 to 7.16.3

opened on 2022-01-21 20:34:55 by dependabot[bot]

Bumps ipython from 7.13.0 to 7.16.3.

Bump pygments from 2.6.1 to 2.7.4

opened on 2021-03-30 01:10:31 by dependabot[bot]

Bumps pygments from 2.6.1 to 2.7.4.

Release notes

Sourced from pygments's releases.

2.7.4

  • Updated lexers:

    • Apache configurations: Improve handling of malformed tags (#1656)

    • CSS: Add support for variables (#1633, #1666)

    • Crystal (#1650, #1670)

    • Coq (#1648)

    • Fortran: Add missing keywords (#1635, #1665)

    • Ini (#1624)

    • JavaScript and variants (#1647 -- missing regex flags, #1651)

    • Markdown (#1623, #1617)

    • Shell

      • Lex trailing whitespace as part of the prompt (#1645)
      • Add missing in keyword (#1652)
    • SQL - Fix keywords (#1668)

    • Typescript: Fix incorrect punctuation handling (#1510, #1511)

  • Fix infinite loop in SML lexer (#1625)

  • Fix backtracking string regexes in JavaScript/TypeScript, Modula2 and many other lexers (#1637)

  • Limit recursion with nesting Ruby heredocs (#1638)

  • Fix a few inefficient regexes for guessing lexers

  • Fix the raw token lexer handling of Unicode (#1616)

  • Revert a private API change in the HTML formatter (#1655) -- please note that private APIs remain subject to change!

  • Fix several exponential/cubic-complexity regexes found by Ben Caller/Doyensec (#1675)

  • Fix incorrect MATLAB example (#1582)

Thanks to Google's OSS-Fuzz project for finding many of these bugs.

2.7.3

... (truncated)

Changelog

Sourced from pygments's changelog.

Version 2.7.4

(released January 12, 2021)

Version 2.7.3

(released December 6, 2020)

... (truncated)

Commits
  • 4d555d0 Bump version to 2.7.4.
  • fc3b05d Update CHANGES.
  • ad21935 Revert "Added dracula theme style (#1636)"
  • e411506 Prepare for 2.7.4 release.
  • 275e34d doc: remove Perl 6 ref
  • 2e7e8c4 Fix several exponential/cubic complexity regexes found by Ben Caller/Doyensec
  • eb39c43 xquery: fix pop from empty stack
  • 2738778 fix coding style in test_analyzer_lexer
  • 02e0f09 Added 'ERROR STOP' to fortran.py keywords. (#1665)
  • c83fe48 support added for css variables (#1633)
  • Additional commits viewable in compare view


Releases

Initial Release 2020-08-16 23:55:19

mbchang

C.S. Ph.D. student at UC Berkeley

mechanism-design machine-learning deep-reinforcement-learning pytorch artificial-intelligence deep-learning