...
The guide assumes that you have SSH access to the Squirro servers and the password to access MySQL/MariaDB.
Identify the current master
For each broken slave, stop the Squirro cluster service:
Code Block
sudo systemctl stop sqclusterd
Ensure you have a running system with one cluster node.
At the master:
Code Block
mysql> RESET MASTER;
Query OK, 0 rows affected (0.14 sec)
mysql> FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |    12268 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
mysql> exit
$ mkdir /var/lib/squirro/cluster/mysql/<date>
$ mysqldump --host=127.0.0.1 --port=3306 --user=cluster --password=$(PASSWORD) --all-databases --master-data --single-transaction --result-file /var/lib/squirro/cluster/mysql/<date>/dump.db
$ mysql -u root -p
mysql> UNLOCK TABLES;
mysql> exit
$ cd /var/lib/squirro/cluster/mysql/<date>/
$ gzip dump.db
$ scp dump.db.gz $(SSH_USER)@$(IP_SLAVE1):/tmp/
$ scp dump.db.gz $(SSH_USER)@$(IP_SLAVE2):/tmp/
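The File and Position values reported by SHOW MASTER STATUS are needed again on each slave, and copying them by hand is easy to get wrong. The following is a minimal sketch, not part of the Squirro tooling, that turns the master's status into the matching CHANGE MASTER TO statement. The function name build_change_master is hypothetical, and the sketch assumes the status is supplied as tab-separated text, which is what the mysql client prints with --batch --skip-column-names.

```shell
#!/bin/sh
# Hypothetical helper (not part of Squirro): read one tab-separated
# "file<TAB>position<TAB>..." row from stdin and print the matching
# CHANGE MASTER TO statement for the slaves.
build_change_master() {
    awk -F'\t' -v q="'" '
        # Accept only rows whose second field is a numeric position.
        NF >= 2 && $2 ~ /^[0-9]+$/ {
            printf "CHANGE MASTER TO MASTER_LOG_FILE=%s%s%s, MASTER_LOG_POS=%s;\n", q, $1, q, $2
            exit
        }
    '
}

# Example use on the master (commented out so the block has no side effects):
#   mysql --batch --skip-column-names -u root -p -e 'SHOW MASTER STATUS' \
#       | build_change_master
```

With the values shown above, the helper would print the same CHANGE MASTER TO statement that is entered manually on each slave, so the output can be compared against what you type there.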
On each slave:
Code Block
$ mkdir /var/lib/squirro/cluster/mysql/restore-<date>
$ mv /tmp/dump.db.gz /var/lib/squirro/cluster/mysql/restore-<date>
$ cd /var/lib/squirro/cluster/mysql/restore-<date>
$ gunzip dump.db.gz
$ mysql -u root -p
mysql> STOP SLAVE;
mysql> exit
$ mysql --host=127.0.0.1 --port=3306 --user=cluster --password=$(PASSWORD) -e "source dump.db;"
$ mysql -u root -p
mysql> RESET SLAVE;
mysql> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='PASSWORD';
mysql> SHOW SLAVE STATUS \G
--> if Master_Host does not point to the current master, set it, e.g.:
mysql> CHANGE MASTER TO MASTER_HOST = "$(IP_MASTER)";
mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=12268;
--> ensure that the file and position match the values from SHOW MASTER STATUS; above
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS \G
--> ensure that Slave_IO_Running: Yes and Slave_SQL_Running: Yes
mysql> exit
$ sudo systemctl start sqclusterd
$ monit summary
$ squirro_status
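The final check on each slave, that both Slave_IO_Running and Slave_SQL_Running report Yes, can also be scripted when several slaves need to be verified. The following is a rough sketch under the assumption that the output of SHOW SLAVE STATUS \G is piped in as text. The function name replication_ok is hypothetical and not part of Squirro or MariaDB.

```shell
#!/bin/sh
# Hypothetical helper (not part of Squirro): read the vertical output of
# SHOW SLAVE STATUS \G from stdin and succeed (exit status 0) only if both
# replication threads report "Yes".
replication_ok() {
    awk -F': *' '
        # The trailing colon keeps Slave_SQL_Running_State from matching.
        /^[ \t]*Slave_IO_Running:/  { sub(/[ \t\r]+$/, "", $2); io  = $2 }
        /^[ \t]*Slave_SQL_Running:/ { sub(/[ \t\r]+$/, "", $2); sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example use on a slave (commented out so the block has no side effects):
#   mysql -u root -p -e 'SHOW SLAVE STATUS \G' | replication_ok \
#       && echo "replication running" \
#       || echo "replication NOT running"
```

A non-zero exit status means at least one of the two threads is not running, or that the status output was empty. In that case, inspect the Last_IO_Error and Last_SQL_Error fields of SHOW SLAVE STATUS before restarting sqclusterd.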