NEWS: .npz binaries are available from https://github.com/IBM/photorealistic-blocksworld/releases .
NEWS: Now works on Blender 2.80+.
NEWS: Random state generation (each state is randomized). Previously, a state was randomized once at the beginning,
and all states shared the same objects (colors, shapes). Also, this version no longer assumes that objects are placed on a fixed grid.
To retrieve the original version (generated from a fixed environment), check out the v1 tag in git.
This repository is modified from the CLEVR dataset generator to produce realistic visualizations of blocksworld states.
Setup:

With anaconda,

```sh
conda env create -f environment.yml
conda activate prb
```

Install blender:

```sh
wget https://download.blender.org/release/Blender2.83/blender-2.83.2-linux64.tar.xz
tar xf blender-2.83.2-linux64.tar.xz
echo $PWD > $(echo blender*/2.*/python/lib/python*/site-packages/)clevr.pth
```

The last line writes a `.pth` file into the `site-packages` of Blender's bundled Python so that it can import the modules in this repository.
Example: run `./test.sh`.
For the original readme, see README-clevr.md .
Note: I changed all keyword options from using underscores to using hyphens (e.g. `--use_gpu` -> `--use-gpu`).
render_images.py : Renders random scenes using Blender and stores them in a result directory. The directory contains images and metadata. This file must be run in the Python environment shipped with Blender.
render_problem.py : Renders two blocksworld states separated by a specified number of steps (--num-steps).
We generate the initial state randomly and perform a random walk to generate the goal state.
This file must be run in the Python environment shipped with Blender.
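The random-walk idea behind the goal generation can be sketched as follows. This is only an illustration of the approach; the state representation and the move generator in render_problem.py may differ:

```python
import random

def random_walk(stacks, num_steps, seed=0):
    # Illustrative sketch: starting from the initial state, repeatedly
    # move the top block of a non-empty stack onto another stack
    # (possibly an empty one). Not the actual render_problem.py logic.
    rng = random.Random(seed)
    stacks = [list(s) for s in stacks]
    for _ in range(num_steps):
        src = rng.choice([i for i, s in enumerate(stacks) if s])
        dst = rng.choice([i for i in range(len(stacks)) if i != src])
        stacks[dst].append(stacks[src].pop())
    return stacks

# two stacks of blocks plus one empty spot; three random moves
goal = random_walk([["a", "b"], ["c"], []], num_steps=3)
print(goal)
```

Because every move only relocates a top block, the multiset of blocks is preserved and the result is always a reachable state.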
extract_all_regions_binary.py : For a result directory produced by render_images.py,
this script extracts the regions from every image generated, resizes them to 32x32, and
stores them in a .npz container along with the bounding box vector (x1,y1,x2,y2).
Optionally, the --include-background option resizes and stores the entire image into a .npz
container in the same format.
To keep the same format, its bounding box has the value (0,0,xmax,ymax) and the number of objects is 1.
Optionally, the --exclude-objects option disables region extraction. When combined with --include-background,
the resulting archive is merely a compact, resized image format.
See the other options in the source script or by running the script with no arguments.
This file must be run in the conda environment.
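The per-object crop-and-resize step can be sketched in plain numpy. This is only a hypothetical illustration using nearest-neighbor sampling; the actual script may resample differently:

```python
import numpy as np

def crop_and_resize(image, bbox, size=32):
    # Crop the [x1,y1,x2,y2] region and resize it to size x size with
    # nearest-neighbor sampling (illustrative only; the real script's
    # resampling method may differ).
    x1, y1, x2, y2 = bbox
    region = image[y1:y2, x1:x2]
    ys = np.linspace(0, region.shape[0] - 1, size).round().astype(int)
    xs = np.linspace(0, region.shape[1] - 1, size).round().astype(int)
    return region[np.ix_(ys, xs)]

# toy 100x150 single-channel image standing in for a rendered scene
img = np.arange(100 * 150, dtype=np.uint8).reshape(100, 150)
patch = crop_and_resize(img, (10, 20, 74, 84))
print(patch.shape)  # (32, 32)
```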
generate_all.sh : Orchestrates several scripts to generate the whole dataset. For its usage, check the script itself.
merge-npz.py : When generate_all.sh is invoked in a distributed environment, it produces multiple npz files in the same format.
You can also generate several environments, each with different block shapes, colors, etc.
This script takes several result npz files of such runs and concatenates them into a single npz file.
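A minimal sketch of what such a merge could look like, assuming the fields produced by extract_all_regions_binary.py ("images", "bboxes", "picsize") and identical image sizes across runs; the real merge-npz.py may differ:

```python
import os
import tempfile
import numpy as np

def merge_npz(paths, out):
    # Hypothetical sketch: concatenate the per-state arrays ("images",
    # "bboxes") along the state axis and keep "picsize" from the first
    # archive (assumed identical across runs).
    with np.load(paths[0]) as d:
        merged = {k: d[k] for k in d.files}
    for p in paths[1:]:
        with np.load(p) as d:
            for k in ("images", "bboxes"):
                merged[k] = np.concatenate([merged[k], d[k]], axis=0)
    np.savez_compressed(out, **merged)

# demo with two tiny synthetic archives (2 states x 3 objects each)
tmp = tempfile.mkdtemp()
paths = []
for i in range(2):
    p = os.path.join(tmp, "part{}.npz".format(i))
    np.savez_compressed(p,
                        images=np.zeros((2, 3, 32, 32), dtype=np.uint8),
                        bboxes=np.zeros((2, 3, 4), dtype=np.uint16),
                        picsize=np.array([240, 320]))
    paths.append(p)
out = os.path.join(tmp, "merged.npz")
merge_npz(paths, out)
with np.load(out) as d:
    print(d["images"].shape)  # (4, 3, 32, 32)
```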
generate_problems.sh : Generates a specified number of random problem instances. For its usage, check the script itself.
The npz files generated by extract_all_regions_binary.py can be loaded into
numpy as follows and contain the following fields:

```python
import numpy as np

with np.load("path/to/blocks-3-3.npz") as data:
    images = data["images"]
    # uint8 array of shape [state_id, object_id, patch_size, patch_size].
    # It contains the image patch for each object in the environment.
    # patch_size is 32 by default.
    bboxes = data["bboxes"]
    # uint16 array of shape [state_id, object_id, 4].
    # It contains the bounding box of each image patch in [x1,y1,x2,y2] format.
    picsize = data["picsize"]
    # An int vector of 2 elements, [maxY,maxX], containing the original image size.
```
To generate a dataset of 200 transitions with 3 blocks:

```sh
./generate_all.sh 3 200
```
To generate a large dataset with 5 blocks / 3 stacks (>80k states, i.e., images),
running it on a single computer would take a long time.
If you have access to a compute cluster, you can distribute the workload
to the job scheduler.
You should customize the job submission command in generate_all.sh
for your job scheduler (e.g., Torque/PBS, Sun Grid Engine).
Once customized, running the script as follows will submit 4 jobs, distributing the 1000 images across them (250 images per job):

```sh
./generate_all.sh 5 1000 4
```
Further customization details are available in the comment section of the script.
```bibtex
@article{asai2018blocksworld,
  author  = {Asai, Masataro},
  journal = {arXiv preprint arXiv:1812.01818},
  title   = {{Photo-Realistic Blocksworld Dataset}},
  year    = {2018}
}
```
Relevant citations:
```bibtex
@article{asai2018perminv,
  author  = {Asai, Masataro},
  journal = {arXiv preprint arXiv:1812.01217},
  title   = {{Set Cross Entropy: Likelihood-based Permutation Invariant Loss Function for Probability Distributions}},
  year    = {2018}
}
```
```bibtex
@inproceedings{asai2019unsupervised,
  title     = {Unsupervised grounding of plannable first-order logic representation from images},
  author    = {Asai, Masataro},
  booktitle = {Proceedings of the International Conference on Automated Planning and Scheduling},
  volume    = {29},
  pages     = {583--591},
  year      = {2019}
}
```
This archive contains an additional "global" object that is just a 32x32 compressed image of the entire scene. The number of objects in each state is thus 4. The bounding box for the global object is [0,0, xmax, ymax].
This data contains 100 variations of the 3-blocks / 3-towers dataset in a single file, where within each variation the objects are assigned unique, randomly selected colors and shapes.
There are 8 potential colors, 2 sizes, and 3 shapes (cube, cylinder, sphere); thus each object has 48 potential configurations. Therefore there are 48x47x46 = 103776 total object combinations, 100 of which are included in this dataset.
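The count above is the number of ordered assignments of distinct configurations to the three objects, which can be checked directly (math.perm requires Python 3.8+):

```python
from math import perm

colors, sizes, shapes = 8, 2, 3
configs = colors * sizes * shapes  # 48 configurations per object
print(configs)                     # 48
print(perm(configs, 3))            # 48 * 47 * 46 = 103776
```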
Edit: The previous release (-invalid) contained state ids that were not appropriately offset for the additional data.
(updated due to overlapping colors)