Directord is a powerful automation platform and protocol built to drive infrastructure and applications across the physical, edge, IoT, and cloud boundaries; efficient, pseudo-real-time, at scale, made simple.

oshied, updated 2022-05-06 14:06:13



Design Principles

The Directord design principles can be found here.


Additional documentation covering everything from application design and wire diagrams to installation and usage can be found here.

Welcome Contributors

  • Read the documentation on how best to deploy and leverage Directord.

  • When ready, if you'd like to contribute to Directord, pull requests are very welcome. Directord is an open platform built for operators. If you see something broken, please feel free to raise a bug and/or fix it.

  • Information on running tests can be found here.

Have Questions?

Join us at #directord. The community is just getting started; folks are here to help, answer questions, and support one another.

Quick Introduction

This quick cast shows how easy it is to install, bootstrap, and deploy a scale test environment.


Hello World

Let's create a virtual env on your local machine to bootstrap the installation. Once installed, you can move to the server node and call all your tasks from there.

```shell
$ python3 -m venv --system-site-packages ~/directord
$ ~/directord/bin/pip install --upgrade pip setuptools wheel
$ ~/directord/bin/pip install directord
```
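Before moving on, it can be worth confirming the install landed where you expect. This is an optional sanity check not in the original guide, using pip's standard `show` subcommand:

```shell
# Confirm the directord package is present in the virtual env; `pip show`
# exits non-zero if the package is not installed.
~/directord/bin/pip show directord
```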

We need to create a catalog for bootstrapping. Let's assume we are installing Directord on two machines:

  • directord-1 : runs the Directord server and a client

  • directord-2 : runs only a client

For that, we create a file:

```shell
$ vi ~/directord-catalog.yaml
```

with the contents:

```yaml
directord_server:
  targets:
    - host:
      port: 22
      username: fedora

directord_clients:
  args:
    port: 22
    username: fedora
  targets:
    - host:
    - host:
```

We can now call directord to bootstrap the installation. Bootstrapping uses SSH to connect to the machines, but after that SSH is no longer used. You only need SSH keys to connect your local machine to the machines you are installing on; the server and clients do not need shared keys between themselves.
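The one-way SSH requirement above can be satisfied with standard OpenSSH tooling. A sketch, assuming the two machines are reachable at the hypothetical placeholder addresses 192.0.2.10 and 192.0.2.11 (substitute your own):

```shell
# Generate a key if you don't already have one, then push it to both
# target machines so bootstrap can log in without a password prompt.
[ -f ~/.ssh/id_ed25519 ] || ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -N ''
ssh-copy-id [email protected]
ssh-copy-id [email protected]
```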

To kick off the bootstrapping, you call directord with the catalog file you created and a second catalog containing the jobs required to bootstrap the machines.

```shell
$ ~/directord/bin/directord bootstrap \
    --catalog ~/directord-catalog.yaml \
    --catalog ~/directord/share/directord/tools/directord-dev-bootstrap-zmq.yaml
```

Once that has run, you can SSH to the server and issue all further commands from there:

```shell
$ ssh [email protected]
```

First, make sure all the nodes are connected:

```shell
$ sudo /opt/directord/bin/directord manage --list-nodes
```

This should show you something like:


```shell
directord-1  132.2   0.9.0  1:38:53.240000  0:00:00.051849
directord-2  131.69  0.9.0  1:39:25.780000  0:00:00.099533
```
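A client that hasn't finished bootstrapping yet won't appear in that list. If you're automating the setup, a small wait loop can gate the next step; this is a sketch, not part of the original guide, and it simply greps the node list for the expected hostname:

```shell
# Poll until the second client reports in before orchestrating anything.
until sudo /opt/directord/bin/directord manage --list-nodes | grep -q 'directord-2'; do
  sleep 2
done
```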

Then we create our first orchestration job. Let's add a file:

```shell
$ vi helloworld.yaml
```

With the contents:

```yaml
- jobs:
  - ECHO: hello world
```
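An orchestration can carry more than one job. A hedged sketch reusing the only component shown in this guide (ECHO, which per the changelog below accepts multiple words); the second message is just an illustration:

```yaml
- jobs:
  - ECHO: hello world
  - ECHO: goodbye for now
```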

Then we run the orchestration:

```shell
$ sudo /opt/directord/bin/directord orchestrate helloworld.yaml
```

This should return something like:

```shell
Job received. Task ID: 9bcf31cb-7faf-4367-bf37-57c11b3f81dc
```

We can use that task ID to probe how the job went, or we can list all the jobs with:

```shell
$ sudo /opt/directord/bin/directord manage --list-jobs
```

That returns something like:


```shell
9bcf31cb-7faf-4367-bf37-57c11b3f81dc  9bcf31cb-7faf-4367-bf37-57c11b3f81dc  0.02  2  0
```

With the task ID we can see how the job went:

```shell
$ sudo /opt/directord/bin/directord manage --job-info 9bcf31cb-7faf-4367-bf37-57c11b3f81dc
```

And voilà, here is our first orchestrated hello world:

```shell
KEY     VALUE
ID      9bcf31cb-7faf-4367-bf37-57c11b3f81dc
INFO    test1 = hello world
        test2 = hello world
STDOUT  test1 = hello world
        test2 = hello world
...
```
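When scripting the steps above, the task ID doesn't have to be copied by hand. A sketch, assuming the `Job received. Task ID: <uuid>` line format shown earlier:

```shell
# Capture the last whitespace-separated field of the orchestrate output
# (the task ID) and feed it straight back into --job-info.
TASK_ID=$(sudo /opt/directord/bin/directord orchestrate helloworld.yaml | awk '/Task ID/ {print $NF}')
sudo /opt/directord/bin/directord manage --job-info "$TASK_ID"
```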


Apache License Version 2.0


Directord 0.12.0 2022-01-06 14:32:32

Hot on the heels of 0.11.3, a new major release has dropped! The default driver is now gRPC. While there's nothing wrong with the ZMQ implementation, and we'll be supporting it for the foreseeable future, gRPC allows Directord to be used in FIPS-certified environments; this is not possible with ZMQ due to its use of libsodium. The 0.12.0 release provides the means to run in secure clouds by default, while maintaining the performance and speed we've grown to expect.


What's Changed

  • Rev 0120 by @cloudnull in
  • Add exclude option to DNF component by @sshnaidm in
  • Spec file and systemd packaging updates by @slagle in
  • Allow drivers to run in isolation by @cloudnull in
  • allow the driver to run in dummy mode by @cloudnull in

Full Changelog:

Directord 0.11.3 2022-01-04 17:14:45

A new era for Directord: new capabilities, new components, new functions, new classes; just a better tool.


The changes included in this release of Directord are staggering. It really should be a major version; however, we're keeping that for a little bit later, largely because I forgot to rev things. Internally, just about everything has been improved: from a more robust process/thread model and better isolation, to a whole new driver capability. All of these changes come without cost to performance and stability; in fact, we've improved performance by about 5% over the last release.

This release also comes with some assurances to our claimed scale expectations. While we documented our systems and processes on and covered the internals, setup, and expected performance on YouTube, we've now scale tested Directord at 150 nodes and the results were incredible. The Basic Task-Core POC applied to 150 nodes took ~11 minutes to complete while using both the ZMQ and GRPC drivers. The Messaging driver accomplished the same task in 18 minutes. In contrast, our legacy deployment tooling took 45 minutes to do the same work.

So with all that said, check out the release notes; there's SO much going on. The team is growing, we're adding contributors, and the project is making some incredible moves.

What's Changed

  • Add exposed message ID to heartbeats by @cloudnull in
  • Prepare dev-setup for CentOS 9 by @sshnaidm in
  • cleanup default dev catalog by @cloudnull in
  • Add option to cache STDERR to RUN component by @sshnaidm in
  • Add hostname to Containerfile for tests as it's in Dockerfile by @sshnaidm in
  • Add identity override to config by @mwhahaha in
  • Update docs for RUN component by @sshnaidm in
  • Add option to name orchestrations and jobs by @cloudnull in
  • Allow to set debug from environment variable by @sshnaidm in
  • Use --best in DNF component for install or update by @sshnaidm in
  • Fix issues in components by @sshnaidm in
  • Add option to allow orchestrations to override targets by @cloudnull in
  • Fix messaging bootstrap for multiple nodes by @slagle in
  • Migrate to directord organization by @kajinamit in
  • Remove unnecessary characters by @kajinamit in
  • Change --server-address to --zmq-server-address for container and docs by @sshnaidm in
  • Remove the diskcache dep by @cloudnull in
  • Add option to allow operators to set the machine id by @cloudnull in
  • Add several small changes to tune scale testing by @cloudnull in
  • remove extra print by @cloudnull in
  • Add CONTAINER_IMAGE component to work with podman images by @sshnaidm in
  • Fix TLS verify for all podman code by @sshnaidm in
  • Run full functional tests for CONTAINER_IMAGE component by @sshnaidm in
  • Make stdout and stderr args available for any component by @sshnaidm in
  • Connect to client to get hostname by @slagle in
  • ensure cacheargs is used in all components by @cloudnull in
  • fix node pruning by @cloudnull in
  • Job interaction improvements by @cloudnull in
  • updating timings by @cloudnull in
  • add poller to client job results by @cloudnull in
  • Update disc store to be POSIX compliant by @cloudnull in
  • add functional test for posix datastore by @cloudnull in
  • add bootstrap to the Directord library implementation by @cloudnull in
  • additional updates for POSIX cache types by @cloudnull in
  • Updated docs by @cloudnull in
  • fix bootstrap server targets by @cloudnull in
  • Add orch file for provisioning clients only by @sshnaidm in
  • Fix issue when no jobs for target by @sshnaidm in
  • update file store by @cloudnull in
  • add exception handling for bootstrap by @cloudnull in
  • add more exception handling by @cloudnull in
  • Re-work query to use coordination instead of client side callbacks by @cloudnull in
  • Rev0113 by @cloudnull in
  • add additional error handling for query call backs by @cloudnull in
  • add prod-bootstrap and blueprint to query wait by @cloudnull in
  • update readme by @cloudnull in
  • add cache read lock by @cloudnull in
  • Fix wait option handling by @mwhahaha in
  • Fix status code check by @mwhahaha in
  • Increase default wait retries by @mwhahaha in
  • add retry decorator to components by @cloudnull in
  • update machine checking and messaging workers by @cloudnull in
  • gRPC driver by @mwhahaha in
  • Fixes for grpcd backend by @mwhahaha in
  • Add request id to grpc requests and responses by @mwhahaha in
  • use threading instead of multiprocessing by @cloudnull in
  • add grpc gate test by @cloudnull in
  • Add coroutine timeout decorator by @cloudnull in
  • bootstrap requires the use of multiprocessing by @cloudnull in
  • ensure that drivers use process based locks by @cloudnull in
  • Ensure components have unique locks by @cloudnull in
  • remove coroutine timeout by @cloudnull in
  • Reduce the debug logging for grpcd by @mwhahaha in
  • Ensure events are driver specific by @cloudnull in
  • Grpc increase wait and enable compression by @mwhahaha in
  • reimplement timeout coroutine by @cloudnull in
  • Fix disable compression default by @mwhahaha in
  • Remove messaging drivers entrypoint by @slagle in
  • Wire up ssl support for grpc by @mwhahaha in
  • Create thread exception class and terminate events by @cloudnull in
  • Increase file limits for the server by @mwhahaha in
  • Only create a single client instance by @mwhahaha in
  • Add durable queue type option for clients by @cloudnull in
  • Add exception handling to client execution by @cloudnull in
  • Add C++ compiler for grpcio deps build by @sshnaidm in
  • Skip client close on job close by @mwhahaha in
  • Revert "Add durable queue type option for clients" by @cloudnull in
  • Add grpc scripts to packaging by @mwhahaha in
  • Packaging updates by @slagle in
  • Cover grpc driver with tests by @mwhahaha in
  • Fix query information part by @sshnaidm in
  • Add facter component for collections of facts on the node by @sshnaidm in
  • Add --reloaded to service component by @mwhahaha in
  • fix queue purge by @cloudnull in
  • use pre-fork signals to allow exit by @cloudnull in
  • Cleanup tests to remove the crazy output by @cloudnull in
  • Packaging updates by @slagle in
  • add trap for driver load errors by @cloudnull in
  • move key generation to the zmq driver by @cloudnull in
  • Add support for multiple words for echo by @sshnaidm in
  • Replace the cache class with iodict by @cloudnull in
  • DurableQueue by @cloudnull in
  • move flushqueue to library by @cloudnull in
  • Move the worker items into a object by @cloudnull in
  • Add server side logic to mark nodes active by @cloudnull in
  • Tuneup by @cloudnull in
  • reintegrate iodict by @cloudnull in
  • move signals to the process interface by @cloudnull in
  • Drop server side query timeout by @mwhahaha in
  • remove extra verbose log lines from components by @cloudnull in
  • Use individual args for bootstrap interface by @slagle in
  • Use sudo in rpm and config bootstrap by @slagle in
  • Add bootstraps to manage/unmanage the cluster by @slagle in
  • Update drivers doc for grpcd by @mwhahaha in
  • Minor updates to grpc ssl docs by @mwhahaha in
  • Fix help message for service by @sshnaidm in
  • Add mask/unmask option to service component by @sshnaidm in
  • add updated grpcd diagram and driver status info by @cloudnull in

New Contributors

  • @kajinamit made their first contribution in

Full Changelog:

Directord 0.11.0 2021-10-13 22:22:50

Release 11 is feature-packed, cleaner, has a new driver, and is lighter than ever before.



This release introduces the new oslo-messaging driver, allowing Directord to operate in a traditional AMQP environment. This change is crucial to our success as we want to empower operators to leverage Directord in their existing environments, without needing to augment or change platforms. If operators have a messaging backend supported by OSLO-Messaging, Directord can make use of it today.

This release also cleans up a lot of Directord's legacy encoding. Before, encoding was done throughout the code-base; now encoding is all done within the driver. This means the functional code within Directord is far simpler, better documented, and easier to understand.

TripleO PTG

Directord is being discussed as part of the TripleO Yoga PTG. Check out the PTG notes and sessions for more.

Slides from the Directord Overview PTG session can be seen within the PDF attached to this release here.

```
e7d8015 Add generic wait component
8d775ab Fix typo in README
ae023be Dyn drivers 2
cd59fcd add dynamic driver parsing to the help output
e0838d1 Update the dynamic driver parser
95f43ae Add SSL support for messaging driver
4855b87 add easy local doc generation and browsing
a24701f add job definitions to the bootstrap process
1d7fde5 update data-store options and documentation
5cd1f33 Update push.yml
99cb2fe Add credit loop to pollers
402ea40 Add message driver analysis
06a1d93 Create CNAME
0c12d8d Delete CNAME
9e3a069 Change the job processor to prioritize messages
cd7a267 add link
73df12d reformat
1ddda11 more doc updates
4abe45e add setup section
737bece add updated overview
5318e70 Driver docs
2354af5 Added driver-messaging.drawio.png
9ea6934 add bootstrap catalog for the messaging driver
5c8f8d7 fix bug 235
3b6c47e add missing abstract methods from messaging
ce71c9e add flake8 docstring tests
b144793 Messaging thread cleanup and job support
4a23912 Fix messaging heartbeat
8016e6d UX imporovements
09d696a updated diagram and docs
acf6985 Added highlevel-messaging.png
a9b68be Add driver_run in it's own process
6c3e2af add hostname fencing
9910714 add starting documentation for messaging and tweak the driver
3f3d053 more updates to support our simplified encoding process
49a886c Make CLI args override config file
e529792 Driver api
45ed933 Degrated -> Degraded
d8edb42 Update the messaging abstractions
e7fbc04 rev 0.10.1
190372f Add support for oslo-messaging as a driver
```

What's Changed

  • Add support for oslo-messaging as a driver by @slagle in
  • rev 0.10.1 by @cloudnull in
  • Update the messaging abstractions by @cloudnull in
  • Degrated -> Degraded by @slagle in
  • Driver api by @cloudnull in
  • Make CLI args override config file by @slagle in
  • more updates to support our simplified encoding process by @cloudnull in
  • add starting documentation for messaging and tweak the driver by @cloudnull in
  • add hostname fencing by @cloudnull in
  • Add driver_run in it's own process by @slagle in
  • updated diagram and docs by @cloudnull in
  • UX imporovements by @cloudnull in
  • Fix messaging heartbeat by @slagle in
  • Messaging thread cleanup and job support by @slagle in
  • D102 updates by @cloudnull in
  • add missing abstract methods from messaging by @cloudnull in
  • fix bug 235 by @cloudnull in
  • add bootstrap catalog for the messaging driver by @cloudnull in
  • Driver docs by @cloudnull in
  • analysis documentation updates by @cloudnull in
  • add setup section by @cloudnull in
  • more doc updates by @cloudnull in
  • reformat by @cloudnull in
  • add link by @cloudnull in
  • Change the job processor to prioritize messages by @cloudnull in
  • Add message driver analysis by @cloudnull in
  • Add credit loop to pollers by @cloudnull in
  • update data-store options and documentation by @cloudnull in
  • add job definitions to the bootstrap process by @cloudnull in
  • add easy local doc generation and browsing by @cloudnull in
  • Add SSL support for messaging driver by @slagle in
  • Update the dynamic driver parser by @cloudnull in
  • add dynamic driver parsing to the help output by @cloudnull in
  • Dyn drivers 2 by @cloudnull in
  • Fix typo in README by @slagle in
  • Add generic wait component by @mwhahaha in

Full Changelog:

Directord 0.10.0 2021-09-23 17:59:31

The 0.10.0 release is the most significant Directord release since starting the project. Over this last development cycle, we've focused on use-cases and feedback from operators who are deploying complex applications. We've had an ongoing goal of pseudo-real-time execution, which scales horizontally. While more improvements are to be made in future releases, Directord is now close to the original goal of pseudo-real-time performance in both practice and test.

Highlights From This Development Cycle

  • Directord is now faster than ever, approaching pseudo-real-time execution with a minimal memory footprint.
  • The client and server codebase has been massively simplified.
  • New in this release is the ability to do client-side coordination, allowing operators to craft complex components and build out job assurances that have intra-client dependencies.
      • An example of coordination can be seen in the JOB_WAIT component.
  • New data integrity checks have been added for file transfer operations.
  • The ADD and COPY components have been re-written.
  • Directord no longer requires the backend socket to remain open while the client is running.
      • The client will connect back to the server over the backend socket only when needed.
  • The heartbeat socket and thread have been removed. While Directord still uses heartbeats, the messages now travel over the single job socket.
      • This clean-up removed two PIDs and vast chunks of code.
  • The client and server will now fork when needing to ingest or run jobs.
      • This change better ensures application efficiency and minimizes resource consumption. While resource consumption was already low, it is now even lower.
  • The client will now use dynamic command-based locking, which only resides in memory for as long as there are jobs to process.
      • Before, Directord employed a global lock when required; now components make use of their own named lock objects, which further improves the speed of component execution. The speed improvements from the component locking changes are even more pronounced when leveraging async orchestrations.
  • The management function now provides an analysis tool, which allows operators to analyze jobs and parents.
      • This is useful for determining node outliers, runtime issues, and other fun facts.
  • The command line orchestrate and exec functions now have a --stream option, which will stream STDOUT/STDERR/INFO as it becomes available during execution.

While these highlights are excellent, there's a lot improved in Directord that was not mentioned, and more yet to come.


```
2e38bc9 add analysis function
1ff14ab remove heartbeat methods that no longer serve any purpose
26896ad cleanup management function
f8ce379 ensure efficient cleanup of dynamic locks
abb4ac5 Add {posargs} to tox coverage command
832acac add dynamic command based locking
386fce5 rollback dynamic locking
0466570 move callback processing to ensure multi-return for specific nodes is right
1730909 add debug to lock creation
6e87a2f ensure that the processing state is set correctly
5fbbfc9 allow commands to run with the global lock when force-lock is true
4d469ce add command type locking
ac9cabe allow async workers to run with the current cpu count
d763fc3 remove additional counts in favor of timing
fe18346 use multiple returns when running a callback
41dc5f1 use timing instead of loop counts
1bd5f17 move return notice to the end of the execution
5d10054 slow down the query_wait log warning
409cb64 fix minor issues with documentation
1cf2df4 Improve job wait and target coordination
85f1c16 when waiting on callbacks, just block on the last one
fcb7a3e use 1 second delays where possible
98b3cd8 update JOB_WAIT to use new relay
8ca362b add coordination relay
aa83856 add identity list to QUERY callback
3fff576 re-update the queue processor
fcb9610 finish moving transfer to backend
698e270 add job-wait coordination
a05074b Revert "Improve client processor"
89651f3 Fix coordination issues
009f7c4 move the transfer bind to a backend bind
8ba9804 Remove the use of the server side heartbeat socket
62bb591 add delay as an Event property
3f5da26 Improve client processor
a995e78 remove the healthcheck thread
8606f5b Add identity checks for query wait
ae6e3cf Add functional testing and improve process management
043ff6b Enhance our usage of dynamic threads and high watermark monitoring
4a37d12 Save a reference to zmq Driver, and restore it for each unit test
5a420e3 add bypass manager set
58cc064 Stream and callback improvements
202cad0 Server return and async tracing
67132a4 add timestamps to parent pruning
```

Directord 0.9.4 2021-09-14 02:55:13

Further improves efficiency across both the client and server, and resolves a couple of regressions caused by the new thread model.


```
99afa2e Correct target handling when running through the library
b2d4df9 Srv q client
2fd6a1e Increase client efficiency
4995647 add queue processing to the server
```

Directord 0.9.2 2021-09-10 20:57:22

Faster client and server interactions and a promoted datastore option allowing Directord to run faster with even fewer resources.



The new disc datastore, now the default in Directord, allows Directord to resume operations faster from a stopped state. It also allows Directord to store jobs persistently without consuming system memory or requiring a remote datastore. Another incredible feature of the new datastore is speed: Directord is now able to store, reference, and recall information faster than ever before; this is especially true when dealing with tens of thousands of jobs and orchestrations.


```
63ffc6d remove artificial time blocks
9cc55b5 Add disc backed data caching for the server
```

