This is the main repository for: Baptista de Castro, P., Terashima, K., Esparza Echevarria, M.G., Takeya, H. and Takano, Y. (2022), XERUS: An Open-Source Tool for Quick XRD Phase Identification and Refinement Automation. Adv. Theory Simul. 2100588. https://doi.org/10.1002/adts.202100588
For the Xerus version that was published in the paper, please refer to release 1.0r1 here
Welcome to the Xerus project. Xerus is an open-source Python wrapper / plugin around the GSASII Scriptable package that automates phase quantification and Rietveld analysis by combining similarity calculations of simulated patterns (Pearson's correlation) with quick Rietveld refinements.
Xerus is only possible due to the existence of the following projects:
* COD (Crystallographic Open Database)
* The Materials Project (MP)
* AFLOW Database
* OQMD (Open Quantum Materials Database)
* GSASII Scriptable Engine
* pymatgen
Xerus is designed to perform analysis through Jupyter notebooks, providing an easy-to-use API that covers everything from phase quantification to Rietveld optimization.
The main mechanisms behind Xerus stem from the following papers:
Clustering of XRD patterns using similarity:
Optimization of Rietveld refinements:
Plus our own iterative pattern-removal process, coupled with quick Rietveld refinements, which allows for multiphase characterization.
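As a rough illustration of the similarity idea, the sketch below computes Pearson's correlation coefficient between two intensity patterns sampled on the same 2θ grid. The function name `pearson_similarity` and the toy patterns are assumptions made for illustration; they are not part of the Xerus API, and the actual implementation may differ.

```python
import math


def pearson_similarity(a, b):
    """Pearson correlation between two equally sampled intensity patterns."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("patterns must have the same length (at least 2 points)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance and variances, without the common 1/N factor (it cancels out).
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant pattern has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Toy intensity patterns on a shared grid (made-up numbers).
experimental = [0.0, 1.0, 8.0, 1.0, 0.0, 0.5, 4.0, 0.5]
candidate_same = [0.1, 1.2, 7.5, 0.9, 0.0, 0.4, 4.2, 0.6]     # peaks line up
candidate_shifted = [1.0, 8.0, 1.0, 0.0, 0.5, 4.0, 0.5, 0.0]  # peaks displaced

score_same = pearson_similarity(experimental, candidate_same)
score_shifted = pearson_similarity(experimental, candidate_shifted)
```

A candidate whose peaks coincide with the experimental ones scores close to 1, while a displaced pattern scores much lower, which is what makes the coefficient usable for ranking simulated patterns.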
In this section we will briefly introduce how to install Xerus in the easiest manner possible.
NOTE: As of version 1.1b we started PARTIALLY supporting Windows (currently under testing). All features related to PHASE MATCHING and SEARCHING seem to be working (Win10 python 3.8, Win11 python 3.8). However, refinement optimization seems to not work in Windows yet. There is no ETA to support this. We recommend still using UNIX based systems (Linux/macOS).
Xerus relies on the Materials Project API to download crystal structures for the requested chemical space. Therefore, please register and obtain a suitable API key (free) at www.materialsproject.org. After registering, you can check your API key by clicking the API tab at the upper right side of the website.
Xerus relies on a MongoDB server for caching crystal structures downloaded from the provider databases (COD, AFLOW and MP). To install and run the community version (free), please follow the steps listed in:
* Ubuntu Installation Steps
* MacOS Installation Steps
* Windows Installation Steps
Xerus currently can only be installed via conda. Therefore, if you do not have conda, please follow the install instructions here. To install Xerus in a virtual environment with Python 3.8, follow these steps:
```bash
conda create -n Xerus python==3.8 anaconda
conda activate Xerus
git clone http://www.github.com/pedrobcst/Xerus/
cd Xerus
pip install -e .
```
:warning: You might have trouble installing pymatgen if gcc is not present on your system. In that case you can, for example, run `sudo apt install g++` on Ubuntu and then run `pip install -e .` again.
```bash
conda create -n Xerus python==3.8 anaconda
conda activate Xerus
pip install -e .
```
If all the packages install successfully, the configuration file at Xerus/settings/config.conf must be set correctly.
:
password: username password (if necessary)
[mp]
[gsasii]
Xerus by default comes with an .instprm file obtained from fitting NIST Si on our XRD machine (Rigaku MiniFlex 600). It is recommended, although probably not strictly necessary, that you follow the GSASII tutorial to obtain an .instprm file for your own machine.
testxrd:
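
Before running the tests, it can help to check that no required setting was left empty. The snippet below is a minimal sketch using Python's standard `configparser`. The section names `[mp]` and `[gsasii]` and the `password` and `testxrd` entries come from this README; the `[mongodb]` section name and the `host`, `user`, `apikey` and `instr_params` keys are assumptions for illustration, as are the placeholder paths, so compare them against the actual config.conf shipped with Xerus.

```python
import configparser

# Hypothetical config text. Only [mp], [gsasii], "password" and "testxrd"
# appear in the README; every other name here is an assumption.
EXAMPLE_CONF = """
[mongodb]
host = localhost
user =
password =

[mp]
apikey = YOUR_API_KEY

[gsasii]
instr_params = /path/to/machine.instprm
testxrd = /path/to/test_pattern
"""

# Settings that must not be empty (assumed key names, see above).
REQUIRED = {"mp": ["apikey"], "gsasii": ["instr_params", "testxrd"]}


def missing_settings(conf_text):
    """Return 'section.key' names that are absent or empty in the config."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    missing = []
    for section, keys in REQUIRED.items():
        for key in keys:
            # fallback="" also covers a missing section or a missing key.
            value = parser.get(section, key, fallback="").strip()
            if not value:
                missing.append(f"{section}.{key}")
    return missing


problems = missing_settings(EXAMPLE_CONF)
```

To check a real file, read it with `open(path).read()` and pass the text to `missing_settings`; an empty list means every required entry has a value.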
If all the above steps were completed successfully (pip install, MongoDB running [locally] and API key set), please do the following:
```bash
cd tests
pytest -vvv
```
If all tests pass successfully, Xerus should be ready for use.
:warning: This process might take a while.
:warning: Make sure that before running the examples, you have started the MongoDB server.
As of release 1.1b we are providing a beta Streamlit interface that lets you use XERUS (and its features) interactively. Although not as flexible as using Xerus through Jupyter, it provides a zero-code alternative (and can even be hosted on a central server where other users access it directly from their browser).
To start it, after installation do:
```bash
streamlit run app/app.py
```
If you use Xerus please also cite the following papers:
Toby, B. H., & Von Dreele, R. B. (2013). "GSAS-II: the genesis of a modern open-source all purpose crystallography software package". Journal of Applied Crystallography, 46(2), 544-549. doi:10.1107/S0021889813003531
If you use the blackbox method for refinement, please also cite:
The code is licensed under an MIT license. The data used for benchmarking is licensed under CC4.0 and was taken from
Szymanski, N. J., Bartel, C. J., Zeng, Y., Tu, Q., & Ceder, G. (2021). Probabilistic Deep Learning Approach to Automate the Interpretation of Multi-phase Diffraction Spectra. Chemistry of Materials.
Currently, one of the main issues when querying the COD is the lack of control over which structure we obtain. As discussed in the paper, one of the main causes of misclassification is a distorted low-temperature structure (which usually comes from the COD) being matched instead of the room-temperature one.
In this situation, one way to avoid this is to implement an extra filter on the OPTIMADE querier for COD (_cod_celltemp) that restricts results to structures around room temperature only (maybe 293 ± 5 K?).
This information does not seem to be available through the COD REST API, so the OPTIMADE querier should become the main one.
To do this, evaluate:
- [ ] Is there any change in the total number of structures if we use _cod_celltemp as a filter?
- [ ] What will be the impact on the benchmarks / examples of Xerus?
- [ ] In the first case, if there are a lot of structures with no _cod_celltemp, an option might be to filter post-query and keep the structures that have no celltemp
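
The post-query option from the last item could look roughly like the sketch below. It assumes each queried entry has already been reduced to a plain dict that may carry a `_cod_celltemp` value in kelvin; the real OPTIMADE response layout is different, and the function name, the `keep_missing` flag and the sample entries are hypothetical.

```python
ROOM_TEMPERATURE_K = 293.0  # centre of the window suggested above
TOLERANCE_K = 5.0           # half-width of the window (293 +- 5 K)


def filter_room_temperature(entries, keep_missing=True):
    """Keep entries measured near room temperature.

    Entries without a `_cod_celltemp` value are kept when `keep_missing`
    is True, so structures lacking temperature metadata are not lost.
    """
    kept = []
    for entry in entries:
        temp = entry.get("_cod_celltemp")
        if temp is None:
            if keep_missing:
                kept.append(entry)
            continue
        if abs(float(temp) - ROOM_TEMPERATURE_K) <= TOLERANCE_K:
            kept.append(entry)
    return kept


# Made-up entries, only to exercise the filter.
entries = [
    {"id": "A", "_cod_celltemp": 293},
    {"id": "B", "_cod_celltemp": 100},   # low-temperature structure
    {"id": "C", "_cod_celltemp": None},  # temperature not reported
    {"id": "D"},                         # field absent
    {"id": "E", "_cod_celltemp": 298},
]

kept_ids = [e["id"] for e in filter_room_temperature(entries)]
strict_ids = [e["id"] for e in filter_room_temperature(entries, keep_missing=False)]
```

Comparing the lengths of `kept_ids` and `strict_ids` against the unfiltered list gives a quick answer to the first checklist item for any given chemical space.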
As of the latest version, the "dummy" entry is no longer being added to the database. This entry is created when no structures exist for a given element combination in any of the database providers, so that the combination is not continuously re-queried. This probably appeared after the testing method changed. Fix this.
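
The intended behaviour can be sketched as follows, with an in-memory dict standing in for the MongoDB collection. The function name `get_structures`, the `dummy` flag and the key format are assumptions for illustration and do not reflect the actual Xerus schema.

```python
def get_structures(elements, cache, query_providers):
    """Return cached structures for an element combination, querying once.

    `cache` is a plain dict standing in for the database, and
    `query_providers` is any callable that returns a list of structures.
    """
    key = "-".join(sorted(elements))
    if key in cache:
        # Either real structures or a dummy (empty) entry: no re-query.
        return cache[key]["structures"]
    structures = query_providers(elements)
    # Store an entry even when nothing was found; that empty record is
    # the "dummy" that prevents repeated queries for this combination.
    cache[key] = {"structures": structures, "dummy": not structures}
    return structures


calls = []


def fake_providers(elements):
    """Stand-in for the real database queries; always finds nothing."""
    calls.append(tuple(elements))
    return []


cache = {}
first = get_structures(["Ho", "B"], cache, fake_providers)
second = get_structures(["B", "Ho"], cache, fake_providers)  # same combination
```

With the dummy entry in place, the second call is answered from the cache even though the first query returned no structures.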
Move the input parameters into st.form so that the app does not reload on every change.
As of PR #24, we do not do test refinements anymore. Since things have become much more stable and there are no longer errors that break the run and require rerunning the whole script, the purpose of tcif.py can be moved elsewhere.
TODO:
- [ ] Move testing to be a simple function that does the 'loading' internally
- [ ] Remove tcif.py
As of a possible new release (1.1b), we might support all OSes. In light of this, it might be necessary to update the CI to test on all OSes.
This (hypothetically) might work [basically move to conda for environment management in CI]:
Currently, every time the analyze function is run (even with the same parameters), Xerus will re-simulate and re-query the database. This is time consuming and slows down the sometimes necessary iterative process of hyperparameter tuning (i.e., g, delta, n_runs, provider settings and so on). In light of this problem, the following changes are needed:
This release updates pymatgen and switches the Materials Project querier to the latest MP API, which is the database that will receive continuous updates from the MP team going forward.
This is a pre-release version of v1.1b that introduces partial support for Windows and adds the beta Streamlit interface.
Mostly adds GSASII Windows binaries that seem to work straight out of the box.
Phase identification (the main purpose of Xerus) is working. Optimization is not working and might not be supported. (Windows issues with multiprocessing..?)
Also officially adds the Streamlit BETA interface, which allows Xerus to be operated through an easy-to-use interface. It is not as flexible as Jupyter, but it provides the option of hosting it on a server so that users can access it directly through the browser.
This release changes how the querying works. Existing users will need to manually make the 'id' field of their MongoDB database unique.
Supposedly, this should allow concurrent users to share the same Xerus installation. It is mostly meant to support the possible scenario of one server running the Streamlit interface (to be finished soon) with many researchers analyzing their data on it concurrently, possibly with overlapping chemical spaces.
Also makes testing faster.
Please see #26 for detailed changes.
Full Changelog: https://github.com/pedrobcst/Xerus/compare/v1.1...v1.1a
This is the release of v1.1 of Xerus. A few things have changed, most notably (thanks to ml-evs):
- Added self pip installation, allowing the path hacks to be removed and the package to be used from anywhere
- Started adding CI
- Added a way to load old search results
- Added support for structure queries through OPTIMADE (which is the default for OQMD now)
- Restricted structure volume size coming from COD
- Phased out AFLOW
Changed the way we treat peak position changes, from lattice constants + zero-point error to lattice constants + sample displacement. Updated the benchmarks and examples to reflect the new change.
Released first open version
phd student @ University of Tsukuba / National Institute for Materials Science (NIMS)