Configuring Experience Optimization for high availability

The Experience Optimization environment can be made highly available in several ways, each subject to a number of restrictions.

Single Availability Zone/Datacenter

Both the Management Service (xo-management) and the Query Service (xo-query) can individually be made highly available by load balancing two or more instances of the service. The only requirement is that all instances point to the same logical OpenSearch instance or cluster.

  • We discourage distributing OpenSearch clusters across multiple data centers or Availability Zones. See: https://www.elastic.co/blog/clustering_across_multiple_data_centers.
  • AWS is an exception: distributing a cluster is not a problem there, provided that all of the Availability Zones used are in the same region.
  • For an Active/Passive scenario, it is sufficient to fully replicate the relevant indexes of the primary OpenSearch cluster.
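
As an illustration of the "same logical OpenSearch instance or cluster" requirement, the following minimal sketch compares the cluster_uuid that the OpenSearch root endpoint reports for each URL. It assumes you have collected the OpenSearch URL configured for each xo-management and xo-query instance yourself; the URLs and function names are hypothetical, and authentication and TLS settings are left out.

```python
import json
from typing import Callable, Iterable
from urllib.request import urlopen


def fetch_cluster_info(base_url: str) -> dict:
    """GET the OpenSearch root endpoint, which reports cluster_name and cluster_uuid."""
    # Assumption: the endpoint is reachable without authentication.
    with urlopen(base_url.rstrip("/") + "/", timeout=5) as response:
        return json.load(response)


def shares_one_cluster(
    opensearch_urls: Iterable[str],
    fetch: Callable[[str], dict] = fetch_cluster_info,
) -> bool:
    """Return True when every URL resolves to the same cluster_uuid."""
    uuids = {fetch(url)["cluster_uuid"] for url in opensearch_urls}
    return len(uuids) == 1
```

Two different node URLs can still belong to the same cluster, which is why the sketch compares cluster identifiers rather than the URLs themselves.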

Active/Passive

You can sync an active OpenSearch cluster to a passive cluster in a second data center. Replication by means of snapshots is possible through the OpenSearch snapshot and restore mechanism.
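
The snapshot-based replication can be sketched as two REST calls: one that creates a snapshot on the active cluster and one that restores it on the passive cluster. The example below only builds the requests for the OpenSearch _snapshot endpoints; the repository name, snapshot name, index names and host names are hypothetical. It also assumes that a snapshot repository reachable from both clusters has already been registered.

```python
import json
from typing import Sequence
from urllib.request import Request


def _snapshot_call(method: str, url: str, indices: Sequence[str]) -> Request:
    # Limit the operation to the listed indexes and skip the global cluster state.
    body = {"indices": ",".join(indices), "include_global_state": False}
    return Request(
        url + "?wait_for_completion=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def snapshot_request(base_url: str, repository: str, snapshot: str,
                     indices: Sequence[str]) -> Request:
    """Build the request that creates a snapshot on the active cluster."""
    url = f"{base_url.rstrip('/')}/_snapshot/{repository}/{snapshot}"
    return _snapshot_call("PUT", url, indices)


def restore_request(base_url: str, repository: str, snapshot: str,
                    indices: Sequence[str]) -> Request:
    """Build the request that restores the snapshot on the passive cluster."""
    url = f"{base_url.rstrip('/')}/_snapshot/{repository}/{snapshot}/_restore"
    return _snapshot_call("POST", url, indices)
```

Each request can be sent with urllib.request.urlopen. Note that a restore fails if an index with the same name is still open on the passive cluster, so such an index needs to be closed or deleted first.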

Active/Active

Although a fully real-time, end-to-end active/active infrastructure is not possible, you can approximate one closely enough to meet most high-availability requirements. Because spreading an OpenSearch cluster over multiple data centers is highly discouraged, each data center needs its own OpenSearch cluster, and the relevant Experience Optimization indexes in those clusters must be kept synchronized in some way.

In a Tridion Sites stack, achieve an Active/"Sort of Active" high-availability scenario by using multiple Destinations to send Experience Optimization Content Fragments to two or more separate indexes.

It is not possible to perform CRUD operations on Promotions and Experiments, or on Trigger Type and Region configuration, against two or more OpenSearch indexes simultaneously. This means that:

  • Custom synchronization is needed to copy Experiments and Promotions from the OpenSearch index connected to the active xo-management service to the second index, which acts as a replica for these specific Experience Optimization types.
  • Trigger Types and Regions need to be configured twice (or more, depending on the number of clusters) through the command-line tools.
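
One possible shape for the custom synchronization of Experiments and Promotions is to read the documents from the primary index and write them to the second index through the OpenSearch _bulk API. The sketch below covers only the payload-building step. It assumes the documents are available as search hits with _id and _source fields; the index name and document contents shown are hypothetical, as this document does not describe how Experience Optimization stores these types.

```python
import json
from typing import Iterable, Mapping


def to_bulk_payload(target_index: str, hits: Iterable[Mapping]) -> str:
    """Convert search hits into newline-delimited JSON for POST /_bulk."""
    lines = []
    for hit in hits:
        # Action line: index the document under the same _id in the target index.
        lines.append(json.dumps({"index": {"_index": target_index, "_id": hit["_id"]}}))
        # Source line: the document body, copied unchanged.
        lines.append(json.dumps(hit["_source"]))
    # The bulk API requires every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)
```

The payload would be sent to the second cluster with the Content-Type header set to application/x-ndjson. Reusing the original _id makes repeated runs overwrite rather than duplicate documents, but this one-way copy does not propagate deletions.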