By DoubleGremlin181, last updated 2022-01-21

RubiksCubeGym

An OpenAI Gym environment for various twisty puzzles.


Currently available environments:

  • [x] 2x2x2 Pocket Rubik's Cube
  • [x] Pyraminx
  • [x] Skewb

Citation

```bibtex
@article{hukmani2021solving,
  title={Solving Twisty Puzzles Using Parallel Q-learning.},
  author={Hukmani, Kavish and Kolekar, Sucheta and Vobugari, Sreekumar},
  journal={Engineering Letters},
  volume={29},
  number={4},
  year={2021}
}
```

Details:

2x2x2 Pocket Rubik's Cube

Mapping of tiles

| | |
|--|--|
| Action Space | Discrete(3) |
| Observation Space | Discrete(3674160) |
| Actions | F, R, U |
| Rewards | (-inf, 100] |
| Max steps | 250 |
| Reward Types | Base, Layer By Layer Method, Ortega Method |
| Render Modes | 'human', 'rgb_array', 'ansi' |

Pyraminx without tips

Mapping of tiles

| | |
|--|--|
| Action Space | Discrete(4) |
| Observation Space | Discrete(933120) |
| Actions | L, R, U, B |
| Rewards | (-inf, 100] |
| Max steps | 250 |
| Reward Types | Base, Layer by Layer Method |
| Render Modes | 'human', 'rgb_array', 'ansi' |

Skewb

Mapping of tiles

| | |
|--|--|
| Action Space | Discrete(4) |
| Observation Space | Discrete(3149280) |
| Actions | L, R, U, B |
| Rewards | (-inf, 100] |
| Max steps | 250 |
| Reward Types | Base, Sarah's Method (Advanced) |
| Render Modes | 'human', 'rgb_array', 'ansi' |
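The observation-space sizes above are the number of reachable states of each puzzle. As an illustrative sanity check (not code from the package), the 2x2x2 and Pyraminx counts can be reproduced with a short combinatorial calculation; the Skewb count of 3,149,280 follows from a similar but longer argument:

```python
from math import factorial

# 2x2x2: fix one corner as a reference; the remaining 7 corners
# permute freely (7!) and 6 of their orientations are independent (3^6).
pocket_cube = factorial(7) * 3**6
print(pocket_cube)  # 3674160

# Pyraminx without tips: 6 edges restricted to even permutations (6!/2),
# 5 independent edge flips (2^5), and 4 axial pieces with 3 twists each (3^4).
pyraminx = (factorial(6) // 2) * 2**5 * 3**4
print(pyraminx)  # 933120
```

Both values match the `Discrete(...)` observation spaces listed in the tables above.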

Installation

Via PyPI

```shell
pip install rubiks-cube-gym
```

Or build from source

```shell
git clone https://github.com/DoubleGremlin181/RubiksCubeGym.git
cd RubiksCubeGym
pip install -e .
```

Requirements

  • gym
  • numpy
  • opencv-python
  • wget

Scrambling

You can pass the scramble as a parameter to the `reset` function: `env.reset(scramble="R U R' U'")`

The scramble should follow the WCA notation.
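In WCA notation each token is a face letter, optionally followed by `'` (counter-clockwise) or `2` (double turn). A minimal sketch of a validator for such strings (`is_valid_scramble` is a hypothetical helper for illustration, not part of `rubiks_cube_gym`):

```python
import re

def is_valid_scramble(scramble, faces="FRUBLD"):
    """Hypothetical check that every token looks like a WCA move,
    e.g. R, U', or F2. Not part of the rubiks_cube_gym package."""
    tokens = scramble.split()
    return bool(tokens) and all(
        re.fullmatch(f"[{faces}]['2]?", token) for token in tokens
    )

print(is_valid_scramble("R U R' U'"))  # True
print(is_valid_scramble("R U X2"))     # False: X is not a face letter
```

Note that each environment only accepts a subset of faces (F, R, U for the 2x2x2; L, R, U, B for the Pyraminx and Skewb), so a stricter check would pass the environment's own face set via the `faces` parameter.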

Example

```python
import gym
import rubiks_cube_gym

env = gym.make('rubiks-cube-222-lbl-v0')
env.reset(scramble="R U R' U' R' F R2 U' R' U' R U R' F'")

for _ in range(4):
    env.render()
    print(env.step(1))
env.render(render_time=0)
env.close()
```


Output:

```
(3178426, -26, False, {'cube': array([ 0,  9,  2, 15,  4,  5,  6, 21, 16, 10,  1, 11, 12, 13, 14, 23, 17, 7,  3, 19, 20, 18, 22,  8], dtype=uint8), 'cube_reduced': 'WRWGOOGYRBWBOOGYRGWBYBYR'})
(1542962, -1, False, {'cube': array([ 0, 21,  2, 23,  4,  5,  6, 18, 17, 16, 15, 11, 12, 13, 14,  8,  7, 10,  9, 19, 20,  3, 22,  1], dtype=uint8), 'cube_reduced': 'WYWYOOGBRRGBOOGRGBRBYWYW'})
(1682970, -1, False, {'cube': array([ 0, 18,  2,  8,  4,  5,  6,  3,  7, 17, 23, 11, 12, 13, 14,  1, 10, 16, 21, 19, 20,  9, 22, 15], dtype=uint8), 'cube_reduced': 'WBWROOGWGRYBOOGWBRYBYRYG'})
(2220193, 25, False, {'cube': array([ 0,  3,  2,  1,  4,  5,  6,  9, 10,  7,  8, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], dtype=uint8), 'cube_reduced': 'WWWWOOGRBGRBOOGGRRBBYYYY'})
```
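Each `env.step(action)` call returns the classic Gym 4-tuple `(observation, reward, done, info)`, where `info['cube_reduced']` is a 24-character string with one color letter per tile. As an illustration (a minimal sketch, not part of the package), a quick sanity check on the state string from the last output line above:

```python
from collections import Counter

# State string taken from the last step's info['cube_reduced'] above.
state = 'WWWWOOGRBGRBOOGGRRBBYYYY'

# A 2x2x2 cube has 24 tiles: four of each of the six colors
# (W, O, G, R, B, Y), regardless of how scrambled it is.
counts = Counter(state)
print(len(state))                             # 24
print(all(counts[c] == 4 for c in 'WOGRBY'))  # True
```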

You can find my implementation and results using Parallel Q-learning here.

Releases

Version 0.4.0 2020-07-13 06:19:13

Changed the reward scheme for the Pyraminx LBL Method and Skewb Sarah's Method to prevent the discounted reward for subgoals from exceeding the reward for solving the puzzle

Version 0.3.1 2020-07-02 09:23:23

  • Added Sarah's Method (Advanced) for the Skewb
  • Added Layer by Layer for the Pyraminx
  • Minor formatting changes

Version 0.2.1 2020-06-19 13:19:19

  • Added Skewb
  • Added option to set scramble to False
  • Minor formatting fixes

Version 0.1.1 2020-06-17 07:42:15

Increased scramble length to resemble official WCA scrambles

Version 0.1.0 2020-06-15 12:27:20

  • Added Pyraminx without tips
  • Organized README images

Version 0.0.7 2020-06-12 08:58:55

Optimized cube moves

Kavish Hukmani

Data Scientist | Open Source Enthusiast
