Dataset generator for the realistic blocksworld environment

IBM, updated 🕥 2022-09-14 17:57:51

Photo-Realistic Blocksworld

NEWS: .npz binaries are available from .

NEWS: Now works on blender 2.80+

NEWS: Random state generation (each state is randomized). Previously, a state was randomized at the beginning, but all states had the same objects (colors, shapes). Also, this version does not assume that objects are placed on a fixed grid. To retrieve the original version (generated from a fixed environment), check out the v1 tag in git.

This is a repository modified from the CLEVR dataset for generating realistic visualizations of blocksworld.


With anaconda:

conda env create -f environment.yml
conda activate prb

Install blender:

wget
tar xf blender-2.83.2-linux64.tar.xz
echo $PWD > $(echo blender*/2.*/python/lib/python*/site-packages/)clevr.pth

Example: Run ./

For the original readme, see .

Note: I changed all keyword options from using underscores to using hyphens (e.g. --use_gpu -> --use-gpu).


  • :

Renders random scenes using Blender and stores them into a result directory. The directory contains images and metadata. This file must be run in the python environment shipped with Blender.

  • :

Renders two blocksworld states separated by a specified number of steps (--num-steps). We generate the initial state randomly and perform a random walk to generate the goal state. This file must be run in the python environment shipped with Blender.

  • :

For a result directory produced by, it extracts the object regions from every generated image, resizes them to 32x32, and stores them in a .npz container along with the bounding box vector (x1,y1,x2,y2). Optionally, the --include-background option resizes the entire image and stores it in a .npz container in the same format. To keep the format identical, all bounding boxes have (0,0,xmax,ymax) values and the number of objects is 1. Optionally, the --exclude-objects option disables region extraction. When combined with --include-background, the resulting archive is merely a compact, resized image format. For other options, see the source scripts or run the script with no arguments. This file must be run in the conda environment.

  • :

Orchestrates several scripts to generate the whole dataset. For its usage, check the script itself.

  • :

When generate_all is invoked in a distributed environment, it produces multiple npz files in the same format. You can also generate several environments, each with different block shapes, colors, etc. This script takes the result npz files of such runs and concatenates them into a single npz file.

  • :

Generates a specified number of random problem instances. For its usage, check the script itself.
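
The region extraction described in the list above (crop each bounding box, resize it to 32x32, optionally store the whole image as a single object) can be sketched as follows. This is only an illustration under stated assumptions, not the repository's code: `resize_nearest` and `extract_patches` are hypothetical helpers, and nearest-neighbor resizing is chosen to keep the example free of extra dependencies; the actual script may use a different interpolation.

```python
import numpy as np

def resize_nearest(img, size):
    """Nearest-neighbor resize of an HxW(xC) array to size x size (assumed method)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return img[rows[:, None], cols[None, :]]

def extract_patches(image, bboxes, patch_size=32):
    """Crop each (x1, y1, x2, y2) box and resize it; boxes are assumed non-empty."""
    patches = [resize_nearest(image[y1:y2, x1:x2], patch_size)
               for x1, y1, x2, y2 in bboxes]
    return np.stack(patches)

# Dummy 100x150 RGB image with one white rectangle standing in for a block.
image = np.zeros((100, 150, 3), dtype=np.uint8)
image[20:60, 30:90] = 255

# Per-object patches, one per bounding box.
objects = extract_patches(image, [(30, 20, 90, 60)])

# Whole-image variant: a single "object" whose box is (0, 0, xmax, ymax).
ymax, xmax = image.shape[:2]
background = extract_patches(image, [(0, 0, xmax, ymax)])

print(objects.shape, background.shape)
```

Both results share the layout [object_id, patch_size, patch_size, channels], which is why the whole-image variant can reuse the same container format with one object per state.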

Data format

The npz files generated by can be loaded into numpy as follows and contain the following fields:

```python
import numpy as np

with np.load("path/to/blocks-3-3.npz") as data:
    # uint8 array of shape [state_id, object_id, patch_size, patch_size].
    # It contains the image patch for each object in the environment.
    # patch_size is 32 by default.
    images = data["images"]

    # uint16 array of shape [state_id, object_id, 4].
    # It contains the bounding box of each image patch in [x1,y1,x2,y2] format.
    bboxes = data["bboxes"]

    # An int vector of 2 elements, [maxY,maxX], containing the original image size.
    picsize = data["picsize"]
```
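
As a quick sanity check of the layout described above, the sketch below builds a small in-memory archive with the same three field names and verifies that the arrays agree with each other. The helper `check_archive` and the dummy values are hypothetical; only the field names, dtypes, and axis order are taken from the description above.

```python
import io
import numpy as np

def check_archive(data):
    """Check the arrays against the documented layout and return box widths/heights."""
    images, bboxes, picsize = data["images"], data["bboxes"], data["picsize"]
    # images and bboxes must agree on [state_id, object_id].
    assert images.shape[:2] == bboxes.shape[:2]
    assert bboxes.shape[2] == 4
    max_y, max_x = (int(v) for v in picsize)  # picsize is [maxY, maxX]
    boxes = bboxes.astype(np.int64)  # avoid uint16 wrap-around when subtracting
    widths = boxes[..., 2] - boxes[..., 0]   # x2 - x1
    heights = boxes[..., 3] - boxes[..., 1]  # y2 - y1
    # Every box must lie within the original image.
    assert (boxes[..., [0, 2]] <= max_x).all()
    assert (boxes[..., [1, 3]] <= max_y).all()
    return widths, heights

# Dummy archive: 2 states, 3 objects, 32x32 patches, a 100x150 original image.
buf = io.BytesIO()
np.savez_compressed(
    buf,
    images=np.zeros((2, 3, 32, 32), dtype=np.uint8),
    bboxes=np.tile(np.array([10, 20, 50, 80], dtype=np.uint16), (2, 3, 1)),
    picsize=np.array([100, 150]),
)
buf.seek(0)

with np.load(buf) as data:
    widths, heights = check_archive(data)

print(widths.shape, heights.shape)
```

Casting the boxes to a signed type before subtracting matters because the coordinates are stored as uint16, where a negative difference would silently wrap around.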



To generate a dataset of 200 transitions with 3 blocks:

./ 3 200

Generating a large dataset with 5 blocks / 3 stacks (>80k states=images) on a single computer would take a long time. If you have access to a compute cluster, you can distribute the workload through the job scheduler. You should customize the job submission command in for your job scheduler (e.g., Torque/PBS, Sun Grid Engine). Once customized, running the script like this will submit 4 jobs, with the 1000 images divided evenly among them (250 images each).

./ 5 1000 4

Further customization details are available in the comment section of the script.
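
The even split mentioned above (1000 images over 4 jobs, 250 each) can be illustrated with a small sketch. `split_workload` is a hypothetical helper, and the way it spreads a remainder over the first jobs is an assumption; the script in this repository may partition the work differently.

```python
def split_workload(total_images, num_jobs):
    """Return one (start_index, count) pair per job.

    Any remainder is spread over the first jobs (an assumption, not
    necessarily what the repository's script does).
    """
    base, extra = divmod(total_images, num_jobs)
    chunks, start = [], 0
    for job in range(num_jobs):
        count = base + (1 if job < extra else 0)
        chunks.append((start, count))
        start += count
    return chunks

# 1000 images over 4 jobs: each job renders 250 images.
print(split_workload(1000, 4))
# prints [(0, 250), (250, 250), (500, 250), (750, 250)]
```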


@article{asai2018blocksworld,
  author = {Asai, Masataro},
  journal = {arXiv preprint arXiv:1812.01818},
  title = {{Photo-Realistic Blocksworld Dataset}},
  year = {2018}
}

Relevant citations:

@article{asai2018perminv,
  author = {Asai, Masataro},
  journal = {arXiv preprint arXiv:1812.01217},
  title = {{Set Cross Entropy: Likelihood-based Permutation Invariant Loss Function for Probability Distributions}},
  year = {2018}
}

@inproceedings{asai2019unsupervised,
  title = {Unsupervised grounding of plannable first-order logic representation from images},
  author = {Asai, Masataro},
  booktitle = {Proceedings of the International Conference on Automated Planning and Scheduling},
  volume = {29},
  pages = {583--591},
  year = {2019}
}


action label for transition

opened on 2020-05-18 00:53:57 by sontung

Hi, Thanks for the dataset. One question is "how do I get the action for each of the recorded transitions?".


3 blocks, 3 towers, multiple colors and shapes, with backgrounds 2020-09-04 17:19:17

This archive contains an additional "global" object, which is just a 32x32 compressed image of the entire scene. The number of objects in each state is thus 4. The bounding box for the global object is [0,0, xmax, ymax].

3 blocks, 3 towers, multiple colors and shapes 2020-08-26 17:41:26

This data contains 100 variations of the 3-blocks, 3-towers dataset in a single file. Within each variation, the objects are assigned unique, randomly selected colors and shapes.

There are 8 potential colors, 2 sizes and 3 shapes (cube, cylinder, sphere), so each object has 8x2x3=48 potential configurations. With 3 objects that must each have a distinct configuration, there are 48x47x46=103776 total object combinations, 100 of which are included in this dataset.

Edit: The previous release (-invalid) contained the state ids which were not appropriately offset for the additional data.
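
The combination count quoted in this release note can be checked with a few lines of Python. The variable names are illustrative only; the figures (8 colors, 2 sizes, 3 shapes, 3 objects) come from the note itself, and 48x47x46 is read here as the number of ordered assignments of distinct configurations to the 3 objects.

```python
import math
from itertools import product

num_colors, num_sizes, num_shapes = 8, 2, 3

# Enumerate every (color, size, shape) triple available to a single object.
configs = list(product(range(num_colors), range(num_sizes), range(num_shapes)))
configs_per_object = len(configs)

# Ordered ways to give 3 objects distinct configurations: 48 * 47 * 46.
assignments = math.perm(configs_per_object, 3)

print(configs_per_object, assignments)  # prints 48 103776
```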

5 blocks, 3 towers dataset 2019-04-24 07:48:37


4 blocks, 4 towers dataset 2019-04-24 07:34:29

(updated due to overlapping colors)

3 blocks, 7 towers dataset 2019-04-24 07:34:29

(updated due to overlapping colors)

3 blocks, 6 towers dataset 2019-04-24 07:34:29

