Determine Number of Present Nodes
In a Galera cluster at least 2 nodes needs to be online for MySQL to be functional. To check the number of preset nodes in the cluster, on any of the server nodes, run this MySQL query:
SHOW STATUS LIKE 'wsrep_cluster_size';
You should get a result like this:
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.001 sec)
This shows that there are 3 nodes present in the cluster.
Single Node
MySQL will continue to operate as a single node if the other nodes were shutdown gracefully. If they were not due to a network issue or a crash, then the single node will not accept queries and will return the following error:
ERROR 1047 (08S01): WSREP has not yet prepared node for application use
This error typically occurs whenever a node is isolated from the others (network problems, servers down, etc…) or the other nodes were shutdown ungracefully.
Bootstrap cluster from a single node
If you determine that no other nodes are present, and want to make this a primary node, you could run this query:
SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';
Once the other nodes come back online or regain network connectivity with this node, they will sync up changes from this primary.
In an event that MySQL is stopped on all servers and you need to bring the cluster back up, to identify which node is in the most advanced state, look at the /var/lib/mysql/grastate.dat
file on each server and look at the seqno
. If the seqno
looks like something like this, it is probably the most advanced node:
# GALERA saved state
version: 2.1
uuid: 26f7bc7c-cca7-11e8-932c-06d13cbd32e4
seqno: 7
safe_to_bootstrap: 1
Start this node as the primary in the cluster by running:
galera_new_cluster
Bringing the Cluster Back Online
If all nodes have a seqno
value of -1
in the /var/lib/mysql/grastate.dat
file than they all were disconnected from each other due to some networking problem or all of them crashed. In this case, find the node that is most likely to have the most recent data and use it to bootstrap the cluster from. To do this:
First make sure no mysql processes are running on any of the nodes
killall -KILL mysql mysqld_safe mysqld mysql-systemd
2. On the node you decided to be the new primary edit the /var/lib/mysql/grastate.dat
file and set the safe_to_bootstrap
value to 1 so that it’ll look like something like this:
# GALERA saved state
version: 2.1
uuid: 26f7bc7c-cca7-11e8-932c-06d13cbd32e4
seqno: -1
safe_to_bootstrap: 1
Then run this command:
galera_new_cluster
3. Check to make sure the previous step started MySQL by running the MySQL prompt and check the number of nodes present in the cluster (should be 1 at this step) by running this MySQL query:
SHOW STATUS LIKE 'wsrep_cluster_size';
The output should look like this:
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 1 |
+--------------------+-------+
1 row in set (0.001 sec)
4. Now add the remaining nodes to the cluster, one by one. For large databases, the time it takes to sync up the data can exceed the systemd startup timeout. So for the initial re-joining to the cluster, we will not use systemd to start MySQL to allow for enough time to sync up all the data. So on the 2nd node, run this command to join the cluster:
mysqld --user=mysql --wsrep_cluster_address=gcomm://123.123.123.123
replace 123.123.123.123 with the actual IP of the node that is now the primary and running
Depending on the size of the databases, this may take some time to run. Once this node is sync, you’ll see a message like this:
...
WSREP: Synchronized with group, ready for connections
...
Verify that the primary node now shows the cluster size of 2 by running MySQL query:
SHOW STATUS LIKE 'wsrep_cluster_size';
After the 2nd node is fully synced, stop the mysql process by running:
mysqladmin shutdown
And then start mysql with systemd
systemctl start mysql
Make sure MySQL is running and that the cluster size is 2.
5. One the 2nd node is up, do the same on the 3rd node. If the 3rd node is a Galera Arbitrator, then start garb service and make sure the cluster size is showing 3 nodes.