Containerized deployments, debugging basics
Since the Pike release, TripleO has supported deployments with OpenStack services running in containers. Currently we use docker to run images based on those maintained by the Kolla project.
We already have some tips and tricks for container deployment debugging in tripleo-docs, but below are some more notes on my typical debug workflows.
Config generation debugging overview
In the TripleO container architecture, we still use Puppet to generate configuration files and do some bootstrapping, but it is run (inside a container) via a script, docker-puppet.py.
Config generation happens at the start of the deployment (step 1), and configuration files are generated for all services (regardless of the step in which they are started).
The input file used is /var/lib/docker-puppet/docker-puppet.json, but you can also filter this (e.g. via cut/paste or jq) to enable debugging for specific services. This is helpful when you need to iterate on debugging a config generation issue for just one service.
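The filtering step can also be sketched in Python. This is only an illustration, not part of TripleO: it assumes the input file is a JSON list of objects that each carry a "config_volume" key, and the function names and the output path in the usage note are hypothetical.

```python
import json


def filter_config_volumes(entries, needle):
    """Keep only entries whose config_volume contains the given substring."""
    # Assumption: each entry is a dict with a "config_volume" string.
    return [e for e in entries if needle in e.get("config_volume", "")]


def write_filtered(src_path, dst_path, needle):
    """Read the full input file, filter it, and write the subset elsewhere."""
    with open(src_path) as src:
        entries = json.load(src)
    subset = filter_config_volumes(entries, needle)
    with open(dst_path, "w") as dst:
        json.dump(subset, dst, indent=2)
    return subset
```

For example, write_filtered("/var/lib/docker-puppet/docker-puppet.json", "/tmp/heat_docker_puppet.json", "heat") would leave the original file untouched and write only the heat-related entries to the (hypothetical) /tmp path.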
Then we can re-run docker-puppet.py against the filtered file, changing the puppet tags (or the puppet modules, which are consumed from the host filesystem, e.g. /etc/puppet/modules) as necessary until the desired output is achieved.
When the config generation is completed, configuration files are written out per service, e.g. to /var/lib/config-data/heat for heat.
We then compare timestamps against the /var/lib/config-data/heat/heat.*origin_of_time file (touched for each service before we run the config-generating containers), so that only those files modified or created by puppet are copied to /var/lib/config-data/puppet-generated/heat.
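The timestamp comparison can be illustrated with a short Python sketch. It is not the logic docker-puppet.py actually uses; it only shows the idea of a marker file touched before the run, with anything modified afterwards treated as puppet-generated. The function name is hypothetical.

```python
import os


def files_newer_than(root, marker_path):
    """Return paths under root (relative to root) modified after the marker file."""
    marker_mtime = os.stat(marker_path).st_mtime_ns
    marker_abs = os.path.abspath(marker_path)
    newer = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.abspath(path) == marker_abs:
                continue  # the marker itself is never a candidate
            if os.lstat(path).st_mtime_ns > marker_mtime:
                newer.append(os.path.relpath(path, root))
    return sorted(newer)
```

Only the paths returned here would be candidates for copying into the puppet-generated tree; files that came with the image and were never touched by puppet keep an older timestamp and are skipped.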
Note that we also calculate a checksum for each service (see /var/lib/config-data/puppet-generated/*.md5sum), which means we can detect when the configuration changes - when this happens we need paunch to restart the containers, even though the image did not change.
This checksum is added to the /var/lib/tripleo-config/hashed-docker-container-startup-config-step_*.json files by docker-puppet.py, and these files are later used by paunch to decide if a container should be restarted (see below).
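To make the checksum idea concrete, here is a minimal Python sketch that hashes a directory tree so that any change to a file's path or content changes the result. The exact inputs and algorithm details used by docker-puppet.py may differ; the function name is hypothetical and MD5 is used only because the source mentions .md5sum files.

```python
import hashlib
import os


def tree_checksum(root):
    """Return an MD5 hex digest covering relative paths and contents under root."""
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()
```

Running this twice over an unchanged tree gives the same digest, while editing any config file under the tree yields a different one, which is the property needed to detect that a container should be restarted.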
Runtime debugging, paunch 101
Paunch is a tool that orchestrates launching containers for each step, and performs any bootstrapping tasks not handled via docker-puppet.py.
It accepts json input, namely the /var/lib/tripleo-config/docker-container-startup-config-step_*.json files, which are created based on the enabled services (the content is directly derived from the service templates in tripleo-heat-templates).
These json files are then modified by docker-puppet.py (as mentioned above) to add a TRIPLEO_CONFIG_HASH value to the container environment. The modified files are written with a different name, see /var/lib/tripleo-config/hashed-docker-container-startup-config-step_*.json.
Note this environment variable isn't used by the container directly; it is used as a salt to trigger restarting containers when the configuration files in the mounted config volumes have changed.
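As a rough illustration of how the hash ends up in the container definition, the sketch below sets TRIPLEO_CONFIG_HASH in a startup config. It assumes the json maps container names to definitions whose "environment" is a list of "KEY=value" strings; that layout, the function name, and the sample values are assumptions for illustration, not taken from docker-puppet.py.

```python
import copy


def add_config_hash(startup_config, container, config_hash):
    """Return a copy of the startup config with TRIPLEO_CONFIG_HASH set for one container."""
    updated = copy.deepcopy(startup_config)
    env = updated[container].setdefault("environment", [])
    # Drop any stale value so the variable appears exactly once.
    env[:] = [item for item in env if not item.startswith("TRIPLEO_CONFIG_HASH=")]
    env.append("TRIPLEO_CONFIG_HASH=" + config_hash)
    return updated
```

Because the hash is part of the container definition, a changed hash makes the definition differ from the one the running container was started with, even when the image is identical.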
As in the docker-puppet case it's possible to filter the json file with jq and debug e.g mounted volumes or other configuration changes directly.
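A Python sketch of the same kind of filtering is shown here, as an alternative to jq. It assumes the startup config is a JSON object keyed by container name; the function name and the "heat_engine" example are hypothetical.

```python
def select_containers(startup_config, names):
    """Return a new dict holding only the named container definitions."""
    missing = [n for n in names if n not in startup_config]
    if missing:
        # Fail loudly rather than silently writing an empty config.
        raise KeyError("not in startup config: " + ", ".join(missing))
    return {n: startup_config[n] for n in names}
```

For example, select_containers(config, ["heat_engine"]) gives a one-container config that can be dumped to a scratch file and inspected or edited (volumes, environment, command) without touching the original.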
It's also possible to test configuration changes by manually modifying /var/lib/config-data/puppet-generated/ then either restarting the container via docker restart, or by modifying TRIPLEO_CONFIG_HASH then re-running paunch.
Note paunch will kill any containers tagged for a particular step, e.g. running paunch with --config-id tripleo_step4 --managed-by tripleo-Controller means all containers started during this step by any previous paunch apply will be killed if they are removed from your json during testing. This is a feature which enables changes to the enabled services on update of your overcloud, but it's worth bearing in mind when testing as described here.
Containerized services, logging
There are a couple of ways to access the container logs:
- On the host filesystem, the container logs are persisted under /var/log/containers/<service>
- docker logs <container id or name>
Debugging containers directly
Sometimes logs are not enough to debug problems, and in this case you must interact with the container directly to diagnose the issue.
When a container is running normally (i.e. not restarting), you can attach a shell to it via docker exec, e.g. docker exec -ti <container id or name> bash.
That's all for today. For more information please refer to tripleo-docs, or feel free to ask questions in #tripleo on Freenode!