Monday, 4 June 2018

TripleO Containerized deployments, debugging basics

Since the Pike release, TripleO has supported deployments with OpenStack services running in containers.  Currently we use docker to run images based on those maintained by the Kolla project.

We already have some tips and tricks for container deployment debugging in tripleo-docs, but below are some more notes on my typical debug workflows.

Config generation debugging overview

In the TripleO container architecture, we still use Puppet to generate configuration files and do some bootstrapping, but it is run inside a container via a script, docker-puppet.py.

The config generation happens at the start of the deployment (step 1), and configuration files are generated for all services, regardless of the step in which they are started.

The input file used is /var/lib/docker-puppet/docker-puppet.json, but you can also filter this (e.g. via cut/paste or jq as shown below) to enable debugging for specific services. This is helpful when you need to iterate on debugging a config generation issue for just one service.

[root@overcloud-controller-0 docker-puppet]# jq '[.[]|select(.config_volume | contains("heat"))]' /var/lib/docker-puppet/docker-puppet.json | tee /tmp/heat_docker_puppet.json
[
  {
    "puppet_tags": "heat_config,file,concat,file_line",
    "config_volume": "heat_api",
    "step_config": "include ::tripleo::profile::base::heat::api\n",
    "config_image": "192.168.24.1:8787/tripleomaster/centos-binary-heat-api:current-tripleo"
  },
  {
    "puppet_tags": "heat_config,file,concat,file_line",
    "config_volume": "heat_api_cfn",
    "step_config": "include ::tripleo::profile::base::heat::api_cfn\n",
    "config_image": "192.168.24.1:8787/tripleomaster/centos-binary-heat-api-cfn:current-tripleo"
  },
  {
    "puppet_tags": "heat_config,file,concat,file_line",
    "config_volume": "heat",
    "step_config": "include ::tripleo::profile::base::heat::engine\n\ninclude ::tripleo::profile::base::database::mysql::client",
    "config_image": "192.168.24.1:8787/tripleomaster/centos-binary-heat-api:current-tripleo"
  }
]
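
If jq isn't available on the node, the same filtering can be sketched in a few lines of Python. This is only an illustration: the function names (filter_config_volumes, write_filtered) are my own and are not part of docker-puppet.py, and it assumes the input file is a JSON list of objects with a config_volume key, as in the output above.

```python
import json


def filter_config_volumes(entries, needle):
    """Return only the entries whose config_volume contains `needle`."""
    return [e for e in entries if needle in e.get("config_volume", "")]


def write_filtered(src_path, dst_path, needle):
    """Read the full input file, filter it, and write a smaller debug copy."""
    with open(src_path) as src:
        entries = json.load(src)
    selected = filter_config_volumes(entries, needle)
    with open(dst_path, "w") as dst:
        json.dump(selected, dst, indent=2)
    return selected


# Example call on an overcloud node (paths taken from the text above):
# write_filtered("/var/lib/docker-puppet/docker-puppet.json",
#                "/tmp/heat_docker_puppet.json", "heat")
```

As with the jq filter, a substring match on "heat" selects heat, heat_api and heat_api_cfn.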

 

Then we can run the config generation, changing the tags if necessary (or the puppet modules, which are consumed from the host filesystem, e.g. /etc/puppet/modules) until the desired output is achieved:


[root@overcloud-controller-0 docker-puppet]# export NET_HOST='true'
[root@overcloud-controller-0 docker-puppet]# export DEBUG='true'
[root@overcloud-controller-0 docker-puppet]# export PROCESS_COUNT=1
[root@overcloud-controller-0 docker-puppet]# export CONFIG=/tmp/heat_docker_puppet.json
[root@overcloud-controller-0 docker-puppet]# python /var/lib/docker-puppet/docker-puppet.py
2018-02-09 16:13:16,978 INFO: 102305 -- Running docker-puppet
2018-02-09 16:13:16,978 DEBUG: 102305 -- CONFIG: /tmp/heat_docker_puppet.json
2018-02-09 16:13:16,978 DEBUG: 102305 -- config_volume heat_api
2018-02-09 16:13:16,978 DEBUG: 102305 -- puppet_tags heat_config,file,concat,file_line
2018-02-09 16:13:16,978 DEBUG: 102305 -- manifest include ::tripleo::profile::base::heat::api
2018-02-09 16:13:16,978 DEBUG: 102305 -- config_image 192.168.24.1:8787/tripleomaster/centos-binary-heat-api:current-tripleo
...

 

When the config generation is completed, configuration files are written out to /var/lib/config-data/heat.

We then compare timestamps against the /var/lib/config-data/heat/heat.*origin_of_time file (touched for each service before we run the config-generating containers), so that only those files modified or created by puppet are copied to /var/lib/config-data/puppet-generated/heat.
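
To make the timestamp comparison concrete, here is a minimal Python sketch of the idea: walk a config directory and copy only the files modified after a marker file was touched. It is not the code docker-puppet.py runs; the helper names (files_newer_than, copy_changed, demo) and the directory layout used in demo() are assumptions for illustration.

```python
import os
import shutil
import tempfile


def files_newer_than(root, marker):
    """Yield paths under `root` whose mtime is later than the marker file's."""
    marker_mtime = os.path.getmtime(marker)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > marker_mtime:
                yield path


def copy_changed(root, marker, dest):
    """Copy files newer than the marker into `dest`, keeping relative paths."""
    copied = []
    for path in files_newer_than(root, marker):
        rel = os.path.relpath(path, root)
        target = os.path.join(dest, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(path, target)
        copied.append(rel)
    return sorted(copied)


def demo():
    """Build a throwaway tree with one old and one new file, then copy."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "heat")
        dest = os.path.join(tmp, "puppet-generated", "heat")
        os.makedirs(os.path.join(root, "etc", "heat"))
        marker = os.path.join(tmp, "heat.origin_of_time")
        old = os.path.join(root, "etc", "hosts")
        new = os.path.join(root, "etc", "heat", "heat.conf")
        for path in (marker, old, new):
            open(path, "w").close()
        # Force explicit mtimes: old < marker < new.
        os.utime(old, (1000, 1000))
        os.utime(marker, (2000, 2000))
        os.utime(new, (3000, 3000))
        return copy_changed(root, marker, dest)
```

In demo(), only etc/heat/heat.conf is newer than the marker, so it is the only file copied; the pre-existing etc/hosts is left behind, which is the effect we want for files that shipped in the image.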

Note that we also calculate a checksum for each service (see /var/lib/config-data/puppet-generated/*.md5sum), which lets us detect when the configuration changes. When it does, paunch needs to restart the containers, even though the image did not change.

This checksum is added to the /var/lib/tripleo-config/hashed-docker-container-startup-config-step_*.json files by docker-puppet.py, and these files are later used by paunch to decide if a container should be restarted (see below).
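
As a rough illustration of why a per-service checksum detects config changes, the sketch below hashes every file under a directory into a single md5 digest. The exact inputs docker-puppet.py feeds into its checksum may differ; config_checksum and demo are hypothetical names, and hashing relative paths plus contents in sorted order is my own choice to keep the result deterministic.

```python
import hashlib
import os
import tempfile


def config_checksum(root):
    """Return one md5 hex digest covering relative paths and file contents."""
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()


def demo():
    """Show that the digest is stable until a config file changes."""
    with tempfile.TemporaryDirectory() as root:
        conf = os.path.join(root, "heat.conf")
        with open(conf, "w") as handle:
            handle.write("[DEFAULT]\ndebug = False\n")
        first = config_checksum(root)
        again = config_checksum(root)
        with open(conf, "w") as handle:
            handle.write("[DEFAULT]\ndebug = True\n")
        changed = config_checksum(root)
        return first == again, first != changed
```

The digest stays the same across repeated runs and changes as soon as any file content changes, which is all that is needed to decide whether a restart is required.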

 

Runtime debugging, paunch 101

Paunch is a tool that orchestrates launching containers for each step, and performing any bootstrapping tasks not handled via docker-puppet.py.

It accepts JSON input, namely the /var/lib/tripleo-config/docker-container-startup-config-step_*.json files, which are created based on the enabled services (the content is derived directly from the service templates in tripleo-heat-templates).

These JSON files are then modified by docker-puppet.py (as mentioned above) to add a TRIPLEO_CONFIG_HASH value to the container environment. The modified files are written under a different name; see /var/lib/tripleo-config/hashed-docker-container-startup-config-step_*.json.

Note that this environment variable isn't used by the container directly; it acts as a salt that triggers a container restart when the configuration files in the mounted config volumes have changed.
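
The salt idea can be sketched as follows. This is a simplified model, not the docker-puppet.py or paunch implementation: add_config_hash and needs_restart are hypothetical helpers, and hashes are keyed by container name here purely to keep the example short.

```python
import copy


def add_config_hash(startup_config, hashes):
    """Return a copy with TRIPLEO_CONFIG_HASH set for containers in `hashes`."""
    hashed = copy.deepcopy(startup_config)
    for name, container in hashed.items():
        if name not in hashes:
            continue
        # Drop any stale hash entry before appending the current one.
        env = [e for e in container.get("environment", [])
               if not e.startswith("TRIPLEO_CONFIG_HASH=")]
        env.append("TRIPLEO_CONFIG_HASH=%s" % hashes[name])
        container["environment"] = env
    return hashed


def needs_restart(old, new):
    """Names of containers whose definition differs between two configs."""
    return sorted(name for name in new if old.get(name) != new[name])


base = {
    "heat_engine": {
        "image": "centos-binary-heat-engine:current-tripleo",
        "environment": ["KOLLA_CONFIG_STRATEGY=COPY_ALWAYS"],
    }
}
before = add_config_hash(base, {"heat_engine": "14617e6728f5f919b16c74f1e98d0264"})
after = add_config_hash(base, {"heat_engine": "0000000000000000000000000000ffff"})
changed = needs_restart(before, after)  # only the hash differs
```

Because the hash is part of the container definition, a config-only change makes the definition differ, so a tool comparing definitions sees heat_engine as changed even though the image is identical.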

As in the docker-puppet case, it's possible to filter the JSON file with jq and debug e.g. mounted volumes or other configuration changes directly.

It's also possible to test configuration changes by manually modifying the files under /var/lib/config-data/puppet-generated/, then either restarting the container via docker restart or modifying TRIPLEO_CONFIG_HASH and re-running paunch.

Note that paunch will kill any containers tagged for a particular step. For example, with --config-id tripleo_step4 --managed-by tripleo-Controller, any container started during this step by a previous paunch apply will be killed if it is removed from your JSON during testing. This is a feature that allows the set of enabled services to change when you update your overcloud, but it's worth bearing in mind when testing as described here.
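
A toy model of that behaviour, to show why a trimmed-down JSON file can take other containers with it. This is not paunch's code: containers_to_remove is a hypothetical helper, and I am assuming each running container carries a config_id label recording the --config-id it was started with.

```python
def containers_to_remove(running, desired, config_id):
    """Return containers for this config_id that are absent from the new JSON.

    running: mapping of container name -> labels dict
    desired: set of container names present in the JSON passed to apply
    """
    return sorted(
        name
        for name, labels in running.items()
        if labels.get("config_id") == config_id and name not in desired
    )


running = {
    "heat_engine": {"config_id": "tripleo_step4"},
    "heat_api": {"config_id": "tripleo_step4"},
    "mysql": {"config_id": "tripleo_step2"},
}
# Applying a file that only defines heat_engine for step 4:
doomed = containers_to_remove(running, {"heat_engine"}, "tripleo_step4")
```

In this model heat_api is selected for removal because it belongs to the same step but is missing from the filtered file, while mysql is untouched because it belongs to a different step.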


[root@overcloud-controller-0]# cd /var/lib/tripleo-config/
[root@overcloud-controller-0 tripleo-config]# jq '{"heat_engine": .heat_engine}' hashed-docker-container-startup-config-step_4.json | tee /tmp/heat_startup_config.json
{
  "heat_engine": {
    "healthcheck": {
      "test": "/openstack/healthcheck"
    },
    "image": "192.168.24.1:8787/tripleomaster/centos-binary-heat-engine:current-tripleo",
    "environment": [
      "KOLLA_CONFIG_STRATEGY=COPY_ALWAYS",
      "TRIPLEO_CONFIG_HASH=14617e6728f5f919b16c74f1e98d0264"
    ],
    "volumes": [
      "/etc/hosts:/etc/hosts:ro",
      "/etc/localtime:/etc/localtime:ro",
      "/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro",
      "/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro",
      "/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro",
      "/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro",
      "/dev/log:/dev/log",
      "/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro",
      "/etc/puppet:/etc/puppet:ro",
      "/var/log/containers/heat:/var/log/heat",
      "/var/lib/kolla/config_files/heat_engine.json:/var/lib/kolla/config_files/config.json:ro",
      "/var/lib/config-data/puppet-generated/heat/:/var/lib/kolla/config_files/src:ro"
    ],
    "net": "host",
    "privileged": false,
    "restart": "always"
  }
}
[root@overcloud-controller-0 tripleo-config]# paunch --debug apply --file /tmp/heat_startup_config.json --config-id tripleo_step4 --managed-by tripleo-Controller
stdout: dd60546daddd06753da445fd973e52411d0a9031c8758f4bebc6e094823a8b45

stderr: 
[root@overcloud-controller-0 tripleo-config]# docker ps | grep heat
dd60546daddd        192.168.24.1:8787/tripleomaster/centos-binary-heat-engine:current-tripleo          "kolla_start"            9 seconds ago       Up 9 seconds (health: starting)                       heat_engine

 

 

Containerized services, logging

There are a couple of ways to access the container logs:

  • On the host filesystem, the container logs are persisted under /var/log/containers/<service>
  • docker logs <container id or name>

It is also often useful to use docker inspect <container id or name> to verify the container configuration, e.g. the image in use and the mounted volumes.
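
Since docker inspect prints a JSON list, the interesting parts are easy to pull out with a short script. The sketch below is one way to do it; summarize_inspect is a hypothetical helper, and it relies only on the Name, Config.Image and Mounts (Source/Destination) fields of the inspect output.

```python
import json


def summarize_inspect(inspect_json):
    """Extract name, image and mounts from `docker inspect` JSON output."""
    summary = []
    for item in json.loads(inspect_json):
        mounts = [
            "%s:%s" % (mount.get("Source"), mount.get("Destination"))
            for mount in (item.get("Mounts") or [])
        ]
        summary.append({
            "name": item.get("Name", "").lstrip("/"),
            "image": item.get("Config", {}).get("Image"),
            "mounts": mounts,
        })
    return summary


# Example call on a node with docker installed:
# import subprocess
# output = subprocess.check_output(["docker", "inspect", "heat_engine"])
# print(summarize_inspect(output.decode()))
```

Comparing the reported image and mounts with the entries in the startup config JSON is a quick way to confirm that the running container matches what paunch was asked to start.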

 

Debugging containers directly

Sometimes logs are not enough to debug problems, and in this case you must interact with the container directly to diagnose the issue.

As long as the container is running and not stuck in a restart loop, you can attach a shell to it via docker exec:


[root@openstack-controller-0 ~]# docker exec -ti heat_engine /bin/bash
()[heat@openstack-controller-0 /]$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 ?        Ss     0:00 /usr/local/bin/dumb-init /bin/bash /usr/local/bin/kolla_start
      5 ?        Ss     1:50 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
     25 ?        S      3:05 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
     26 ?        S      3:06 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
     27 ?        S      3:06 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
     28 ?        S      3:05 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
   2936 ?        Ss     0:00 /bin/bash
   2946 ?        R+     0:00 ps ax

 

That's all for today. For more information please refer to tripleo-docs, or feel free to ask questions in #tripleo on Freenode!