...

  • The replication script is written in Fabric (http://www.fabfile.org/).
  • Files can be replicated using rsync via SSH or by a storage-vendor-specific method.
  • MySQL databases are exported using the mysqldump CLI command and restored using the mysql CLI command.
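
The MySQL export and restore steps could be sketched in Python, the language Fabric scripts are written in. This is a minimal sketch and not the actual replication script: the function names, the dump path, and the choice of the `--all-databases` and `--single-transaction` options are assumptions for illustration, and credentials are assumed to come from a MySQL option file such as `~/.my.cnf`.

```python
import subprocess
from pathlib import Path


def build_dump_command(dump_file):
    """Return the mysqldump argv for a full export (hypothetical option choice)."""
    return [
        "mysqldump",
        "--all-databases",        # assumption: export every database
        "--single-transaction",   # consistent snapshot for InnoDB tables
        f"--result-file={dump_file}",
    ]


def export_databases(dump_file):
    """Run mysqldump and write the export to dump_file."""
    subprocess.run(build_dump_command(dump_file), check=True)


def restore_databases(dump_file):
    """Feed the dump to the mysql client on stdin, like `mysql < dump.sql`."""
    with Path(dump_file).open("rb") as dump:
        subprocess.run(["mysql"], stdin=dump, check=True)
```

Because a full dump is written on every run, this step is not incremental, unlike the file-based steps.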

...

The replication is triggered and run from a single host. This is usually the primary app server in the production environment, but it can also run from a dedicated host that is not part of either cluster.
For added resilience, the script and configuration are deployed to all Squirro nodes but only actively run on the leader node.

...

Stage 3: MySQL Backup

...

  • The cluster filesystem used by the Squirro cluster is replicated to the shared NFS mount using rsync (incremental).
  • Optional: All (or some) configuration files are synced to the NFS mount. This is recommended if the Production and BCP clusters are set up identically.

...

With all data stored on the NFS mount, the contents of the entire mount are replicated to the BCP datacenter.
This can be done using rsync via SSH or using a storage vendor's replication technology (e.g. NetApp SnapMirror).

The initial replication can transfer a large volume, but subsequent runs should be small because all methods except the MySQL export are incremental. The more frequently the replication runs, the less data each run needs to transfer.
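
For the rsync-via-SSH variant, the transfer could look roughly like the following. This is a sketch under assumptions: the function names, the host and path in the usage note, and the use of `--delete` (which mirrors deletions to the target) are illustrative choices, not taken from the actual script.

```python
import subprocess


def build_rsync_command(source_dir, target, use_ssh=False, dry_run=False):
    """Return an rsync argv that mirrors source_dir into target."""
    # --archive keeps permissions and timestamps; rsync then only
    # transfers files that changed, which makes repeated runs incremental.
    # --delete is an assumption: it removes files on the target that no
    # longer exist on the source.
    command = ["rsync", "--archive", "--delete"]
    if use_ssh:
        command += ["--compress", "--rsh=ssh"]
    if dry_run:
        command.append("--dry-run")  # report what would change, copy nothing
    # Trailing slash: copy the directory contents, not the directory itself.
    command += [source_dir.rstrip("/") + "/", target]
    return command


def replicate(source_dir, target, **options):
    """Run the rsync transfer and fail loudly on a non-zero exit code."""
    subprocess.run(build_rsync_command(source_dir, target, **options), check=True)
```

A call such as `replicate("/mnt/nfs", "backup@bcp-host:/mnt/nfs/", use_ssh=True, dry_run=True)` (hypothetical host and paths) would list the pending changes without copying anything, which is a cheap way to check the direction before a real run.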

Stage 6: Elasticsearch Snapshot Restore

From the BCP NFS mount, the latest Elasticsearch Snapshot is restored into the ES cluster using the official Snapshot module. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
ES will not serve traffic during the restore, so there will be a service interruption, but since the restore ...
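
The restore step could be sketched against the Elasticsearch snapshot REST API as follows. This is a minimal sketch, not the actual script: the function names, the base URL, and the repository name are hypothetical, the cluster is assumed to be reachable without authentication, and existing indices with the same names are assumed to have been closed or deleted beforehand.

```python
import json
import urllib.request


def latest_successful_snapshot(listing):
    """Pick the newest SUCCESS snapshot from a GET /_snapshot/<repo>/_all response."""
    candidates = [s for s in listing["snapshots"] if s["state"] == "SUCCESS"]
    if not candidates:
        raise ValueError("no successful snapshot found")
    return max(candidates, key=lambda s: s["start_time_in_millis"])["snapshot"]


def restore_url(base_url, repository, snapshot):
    """Build the URL of the restore endpoint for one snapshot."""
    return f"{base_url.rstrip('/')}/_snapshot/{repository}/{snapshot}/_restore"


def restore_latest(base_url, repository):
    """List the snapshots in the repository and restore the newest good one."""
    list_url = f"{base_url.rstrip('/')}/_snapshot/{repository}/_all"
    with urllib.request.urlopen(list_url) as response:
        listing = json.load(response)
    snapshot = latest_successful_snapshot(listing)
    request = urllib.request.Request(
        restore_url(base_url, repository, snapshot),
        data=b"{}",  # empty body: restore with default settings
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Filtering on the `SUCCESS` state avoids restoring a snapshot that was still running or had failed when the NFS mount was replicated.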

Stage 7: MySQL Restore

From the BCP NFS mount, the latest MySQL backup will be restored to the Squirro leader. The followers will replicate immediately to the same state.

...

The same mechanism is used to replicate from BCP to production.
The best-practice approach is to set up and test this scenario, but not to execute the script automatically (e.g. via cron).

Once BCP becomes active, the replication cron job on production is stopped and the script on BCP is enabled.

For maximum safety, we recommend separating the two scenarios, Production -> BCP and BCP -> Production, into dedicated folders on the NFS mount. This way, an accidental reversal of the direction cannot lead to unwanted and permanent data loss.
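
One way to enforce this separation in a script is to derive the working folder from the replication direction, so that each direction can only ever touch its own folder. The folder names and the function below are hypothetical and only illustrate the idea.

```python
from pathlib import PurePosixPath

# Hypothetical folder names, one per replication direction.
DIRECTION_FOLDERS = {
    ("production", "bcp"): "production_to_bcp",
    ("bcp", "production"): "bcp_to_production",
}


def replication_folder(nfs_root, source, target):
    """Return the dedicated NFS folder for one replication direction."""
    try:
        folder = DIRECTION_FOLDERS[(source, target)]
    except KeyError:
        raise ValueError(f"unsupported direction: {source} -> {target}") from None
    return PurePosixPath(nfs_root) / folder
```

With this layout, running the script in the wrong direction writes to and reads from a different folder, so it cannot overwrite the data of the other direction.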

Reduced number of nodes in BCP

...

Note that you should never run an even number of Squirro application or Elasticsearch nodes, since both systems benefit from the ability to build quorums to detect and handle network segmentation events.
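
The reasoning behind the odd-number rule can be shown with simple majority arithmetic. The sketch below assumes a plain majority quorum (more than half of the nodes); the function names are illustrative.

```python
def quorum_size(nodes):
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can be lost while a majority remains."""
    return nodes - quorum_size(nodes)


for nodes in range(1, 6):
    print(nodes, quorum_size(nodes), tolerated_failures(nodes))
```

Under this assumption, a four-node cluster tolerates one failed node, exactly like a three-node cluster, so the extra node adds no resilience. In addition, a clean split of an even-sized cluster into two equal halves leaves neither half with a majority.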

...