Monday, 10 October 2016

TripleO composable/custom roles

This is a follow-up to my previous post outlining the new composable services interfaces, which covered the basics of the composable services model introduced in Newton.

The final piece of the composability model we've been developing this cycle is the ability to deploy user-defined custom roles, in addition to (or even instead of) the built-in TripleO roles (where a role is a group of servers, e.g. "Controller", which runs some combination of services).

What follows is an overview of this new functionality and its primary interfaces, along with some usage examples and a summary of planned future work.



Fully Composable/Custom Roles

As described in previous posts, TripleO has for a long time provided a fixed architecture with 5 roles (where "roles" means groups of nodes): Controller, Compute, BlockStorage, CephStorage and ObjectStorage.

This architecture has been sufficient to enable standardized deployments, but it's not very flexible.  With the addition of the composable-services model, moving services around between these roles becomes much easier, but many operators want to go further, and have full control of service placement on any arbitrary roles.

Now that the custom-roles feature has been implemented, this is possible, and operators can define arbitrary role types to enable fully composable deployments. When combined with composable services, this represents a huge step forward for TripleO flexibility! :)

Usage examples

To deploy with additional custom roles (or to remove/rename the default roles), a new -r option has been added to the python-tripleoclient “overcloud deploy” interface, so you simply need to copy the default roles_data.yaml, modify it to suit your requirements (for example by moving services between roles, or adding a new role), then do a deployment referencing the modified roles_data.yaml file:

cp /usr/share/openstack-tripleo-heat-templates/roles_data.yaml my_roles_data.yaml
<modify my_roles_data.yaml>
openstack overcloud deploy --templates -r my_roles_data.yaml


Alternatively you can copy the entire tripleo-heat-templates tree (or use a git checkout):

cp -r /usr/share/openstack-tripleo-heat-templates my-tripleo-heat-templates
<modify my-tripleo-heat-templates/roles_data.yaml>
openstack overcloud deploy --templates my-tripleo-heat-templates


Both approaches are essentially equivalent. The -r option simply overwrites the default roles_data.yaml during creation of the plan data (stored in swift on the undercloud), but it's slightly more convenient if you want to use the default packaged tripleo-heat-templates instead of constantly rebasing a copied tree.

So, let's say you wanted to deploy one additional node running only the OS::TripleO::Services::Ntp composable service: you'd copy roles_data.yaml and append a list entry like this:

- name: NtpRole
  CountDefault: 1
  ServicesDefault:
    - OS::TripleO::Services::Ntp



(Note that in practice you'll probably also want some of the common services deployed on all roles, such as OS::TripleO::Services::Kernel, OS::TripleO::Services::TripleoPackages, OS::TripleO::Services::TripleoFirewall and OS::TripleO::Services::VipHosts)
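
Combining that note with the entry above, a fuller role definition might look something like the sketch below. The service names are just the ones mentioned in this post; whether this set is sufficient for your environment is an assumption you should check against the packaged roles_data.yaml.

```yaml
# Sketch only: the NtpRole entry plus the common services named in the note above.
# Compare against the default roles_data.yaml before relying on this list.
- name: NtpRole
  CountDefault: 1
  ServicesDefault:
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::VipHosts
```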

 

Nice, so how does it work?


The main change made to enable custom roles is a pre-deployment templating step which runs Jinja2. We define a roles_data.yaml file (which can be overridden by the user), which contains a list of role names, and optionally some additional data related to default parameter values (such as the default services deployed on the role, and the default count in the group).

The roles_data.yaml definitions look like this:

- name: Controller
  CountDefault: 1
  ServicesDefault:
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::CephMon
    - OS::TripleO::Services::CinderApi
    - ...

The format is simply a YAML list of maps, with a mandatory “name” key in each map, and a number of optional FooDefault keys which set the parameter defaults for the role (as a convenience so the user won't have to specify them via an environment file during the overcloud deployment).
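
For comparison, here is roughly what overriding those defaults via an environment file might look like for the NtpRole example. This is a sketch under an assumption: the parameter names NtpRoleCount and NtpRoleServices follow a role-name-plus-suffix pattern (like the {{role}}Image parameter shown later in this post), so check the rendered overcloud.yaml for the exact names before using it.

```yaml
# Hypothetical environment file, e.g. passed with -e on "overcloud deploy".
# Parameter names are assumed to be <role name>Count and <role name>Services.
parameter_defaults:
  NtpRoleCount: 2
  NtpRoleServices:
    - OS::TripleO::Services::Ntp
```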

A custom mistral action is used to run Jinja2 when creating or updating a “deployment plan” (which is a combination of some heat templates stored in swift, and a mistral environment containing user parameters) – and this basically consumes the roles_data.yaml list of required roles, and outputs a rendered tree of Heat templates ready to deploy your overcloud.

(Figure: Custom Roles, overview)


There are two types of Jinja2 templates which are rendered differently, distinguished by the file extension/suffix:

foo.j2.yaml

This will pass in the contents of the roles_data.yaml list, and iterate over each role in the list. The resulting file in the plan swift container will be named foo.yaml.
Here's an example of the syntax used for j2 templating inside these files:

enabled_services:
  list_join:
    - ','
{% for role in roles %}
    - {get_attr: [{{role.name}}ServiceChain, role_data, service_names]}
{% endfor %}

This example is from overcloud.j2.yaml. It uses a Jinja2 loop to append the service_names for every role's *ServiceChain resource (these resources are also dynamically generated via a similar loop), and the resulting list is then processed on deployment via the heat list_join function.
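
To make that concrete, here's a sketch of what the rendered output of that loop would look like if roles_data.yaml contained just two roles, Controller and NtpRole (an illustrative choice; any blank lines left behind by the loop tags are omitted):

```yaml
# Illustrative rendering for two roles: Controller and NtpRole.
# Each loop iteration contributes one list item with the role name substituted.
enabled_services:
  list_join:
    - ','
    - {get_attr: [ControllerServiceChain, role_data, service_names]}
    - {get_attr: [NtpRoleServiceChain, role_data, service_names]}
```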

foo.role.j2.yaml

This will generate a file per role, where only the name of the role is passed in during the templating step, with the resulting files being called rolename-foo.yaml. (Note that if you have a role which requires a special template, it is possible to disable this file generation by adding the path to the j2_excludes.yaml file.)

Here's an example of the syntax used in these files (taken from the role.role.j2.yaml file, which is our new definition of the server for a generic role):

resources:
  {{role}}:
    type: OS::TripleO::Server
    metadata:
      os-collect-config:
        command: {get_param: ConfigCommand}
    properties:
      image: {get_param: {{role}}Image}

As you can see, this simply allows use of a {{role}} placeholder, which is then substituted with the role name when rendering each file (one file per role defined in the roles_data.yaml list).
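
For example, for a role named NtpRole (the illustrative role from earlier in this post), the fragment above would render to something like this sketch:

```yaml
# Illustrative rendering of the fragment above with {{role}} replaced by NtpRole.
resources:
  NtpRole:
    type: OS::TripleO::Server
    metadata:
      os-collect-config:
        command: {get_param: ConfigCommand}
    properties:
      image: {get_param: NtpRoleImage}
```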


Debugging/Development tips

When making changes to roles_data.yaml, and particularly when making changes to the *.j2.yaml files in tripleo-heat-templates, it's often helpful to view the rendered templates before any overcloud deployment is attempted.

This is possible via use of the “openstack overcloud plan create” interface (which doesn't yet support the -r option above, so you have to copy or git clone the tree), combined with swiftclient:

openstack overcloud plan create overcloud --templates my_tripleo_heat_templates
mkdir tmp_templates && pushd tmp_templates
swift download overcloud

This will download the full tree of rendered files from the swift container (named “overcloud” due to the name passed to plan create), so you can, for example, view the rendered overcloud.yaml that's generated by combining the overcloud.j2.yaml template with the roles_data.yaml file.

If you make a mistake in your *.j2.yaml file, the jinja2 error should be returned via the plan create command, but it can also be useful to tail -f /var/log/mistral/mistral-server.log for additional information during development (this shows the output logged from running jinja2 via the custom mistral action plugin).

Limitations/future work

These new interfaces allow for much greater deployment flexibility and choice, but there are a few remaining issues which will be addressed in future development cycles:
  1. All services managed by pacemaker are still tied to the Controller role. Thanks to the implementation of a more lightweight HA architecture during the Newton cycle, the list of services managed by pacemaker is considerably reduced, but there are still a number of services (primarily DB & RPC services) which are managed by pacemaker, and until the composable-ha blueprint is completed (hopefully during Ocata), these services cannot be moved to a non-Controller role.
  2. Custom isolated networks cannot be defined. Since arbitrary role types can now be defined, there may be a requirement to define arbitrary additional networks for network-isolation, but right now this is not possible.
  3. roles_data.yaml must be copied. As in the examples above, it's necessary to copy either roles_data.yaml or the entire tripleo-heat-templates tree, which means if the packaged roles_data.yaml changes (such as to add new services to the built-in roles), you must merge these changes in with your custom roles_data. In future we may add a convenience interface which makes it easier to, for example, add a new role without having to care about the default role definitions.
  4. No model for dependencies between services. Currently, ensuring the right combination of services is deployed on specific roles is left to the operator. There's no validation of incompatible or inter-dependent services, but this may be addressed in a future release.
