corosync-qdevice - QDevice daemon
corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]
corosync-qdevice is a daemon running on each node of a
cluster. It provides a configured number of votes to the quorum subsystem
based on a third-party arbitrator's decision. Its primary use is to allow a
cluster to sustain more node failures than standard quorum rules allow. It
is recommended for clusters with an even number of nodes and highly
recommended for two-node clusters.
- -d
- Forcefully turn on debug information without the need to change
corosync.conf. To bump the priority of syslog messages to info, use this
option twice.
- -f
- Do not daemonize, run in the foreground.
- -h
- Show short help text.
- -S
- Set advanced settings described in its own section below. This option
generally shouldn't be used because most of the settings are not safe to
change.
corosync-qdevice reads its configuration from the corosync.conf
file.
The main configuration is within the quorum.device sub-key.
Each model also has its own configuration within a similarly named
sub-key.
- model
- Specifies the model to be used. This parameter is required.
corosync-qdevice is modular and is able to support multiple
different models. The model basically defines what type of arbitrator is
used. Currently only net is supported.
- timeout
- Specifies how often corosync-qdevice should call the
votequorum_qdevice_poll function. It is also used by the net model
to adjust its heartbeat timeout. It is recommended that you don't change
this value. Default is 10000.
- sync_timeout
- Specifies how often corosync-qdevice should call the
votequorum_qdevice_poll function during a sync phase. It is recommended
that you don't change this value. Default is 30000.
- votes
- The number of votes provided to the cluster by qdevice. Default is
(number_of_nodes - 1) or generally sum(votes_per_node) - 1.
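As a worked illustration of the default above, the following Python sketch computes the qdevice vote count from a list of per-node votes. The function names are hypothetical, and the quorum formula (strictly more than half of all expected votes) is an assumption about how votes are counted, not something stated in this page.

```python
def default_qdevice_votes(votes_per_node):
    """Default qdevice votes as described above: sum(votes_per_node) - 1."""
    return sum(votes_per_node) - 1


def quorum_threshold(expected_votes):
    """Assumed majority rule: strictly more than half of the expected votes."""
    return expected_votes // 2 + 1


# Four nodes with one vote each (illustrative values only).
node_votes = [1, 1, 1, 1]
qdevice_votes = default_qdevice_votes(node_votes)
expected = sum(node_votes) + qdevice_votes
needed = quorum_threshold(expected)

print(qdevice_votes, expected, needed)  # prints: 3 7 4
```

Under these assumptions, a single surviving node (1 vote) that is granted the qdevice votes (3) would still reach the threshold of 4.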
The quorum.device.heuristics subkey holds the configuration of
the heuristics. Heuristics are a set of commands executed locally on startup,
on cluster membership change, on successful connection to corosync-qnetd and
optionally also at regular intervals. Commands are executed in parallel. When
all commands finish successfully (their exit code is zero) and on time, the
heuristics have passed; otherwise they have failed. The heuristics result is
sent to corosync-qnetd, where it is used in calculations to
determine which partition should be quorate.
- timeout
- Specifies the maximum time, in milliseconds, that corosync-qdevice
waits for the heuristics commands to finish. If a command doesn't finish
before the timeout, it is killed and the heuristics fail. This timeout is used
for heuristics executed at regular intervals. Default value is half of
quorum.device.timeout, so 5000.
- sync_timeout
- Similar to quorum.device.heuristics.timeout but used during membership
changes. Default value is half of the quorum.device.sync_timeout,
so 15000.
- interval
- Specifies the interval between two regular heuristics executions. Default value
is 3 * quorum.device.timeout, so 30000.
- mode
- Can be one of on, sync or off and specifies the mode of
operation of heuristics. Default is off, which means heuristics are
disabled. When sync is set, heuristics are executed only during
startup, on membership change and when the connection to corosync-qnetd is
established. To also run heuristics at regular intervals, set this
option to on.
- exec_NAME
- Defines an executable. NAME can be an arbitrary valid cmap key name
string and has no special meaning. The value of this variable must
contain a command to execute. The value is parsed (split) into arguments
similarly to how the Bourne shell would do it. Quoting is possible using
backslashes and double quotes.
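The pass/fail rule for heuristics can be sketched in Python as follows. This is only an illustration of the behavior described above, not the daemon's implementation: the function name run_heuristics is hypothetical, and shlex.split is used as a stand-in for the Bourne-shell-like argument splitting, so its quoting rules may differ from the real parser in details.

```python
import shlex
import subprocess
import time

# Default from the text: half of quorum.device.timeout (10000), in milliseconds.
DEVICE_TIMEOUT_MS = 10000
HEURISTICS_TIMEOUT_MS = DEVICE_TIMEOUT_MS // 2  # 5000


def run_heuristics(commands, timeout_ms=HEURISTICS_TIMEOUT_MS):
    """Return True only if every command exits with 0 before the deadline.

    `commands` maps exec_NAME keys to command strings (names are illustrative).
    """
    procs = []
    passed = True
    # Start all commands first so that they run in parallel.
    for command in commands.values():
        try:
            argv = shlex.split(command)  # shell-like splitting; no shell is run
            if not argv:
                raise ValueError("empty command")
            procs.append(subprocess.Popen(
                argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL))
        except (OSError, ValueError):
            passed = False  # command could not be parsed or started

    # In this sketch, one shared deadline applies to all started commands.
    deadline = time.monotonic() + timeout_ms / 1000.0
    for proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            if proc.wait(timeout=remaining) != 0:
                passed = False  # non-zero exit code fails the heuristics
        except subprocess.TimeoutExpired:
            proc.kill()  # too slow: kill it and fail the heuristics
            proc.wait()
            passed = False
    return passed
```

A call such as run_heuristics({"exec_test_txt_exists": "/usr/bin/test -f /tmp/test.txt"}) would then report whether the single configured check passed within the default 5000 ms.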
quorum.device.net subkey holds the configuration for
model net.
- tls
- Can be one of on, off or required and specifies whether
TLS should be used. on means a connection with TLS is attempted
first, but if the server doesn't advertise TLS support then non-TLS will
be used. off means TLS is not required and is not
even tried. This mode is the only one which doesn't need a properly
initialized NSS database. required means TLS is required and if the
server doesn't support TLS, qdevice will exit with an error message. Default
is on.
- host
- Specifies the IP address or host name of the qnetd server to be used. This
parameter is required.
- port
- Specifies TCP port of qnetd server. Default is 5403.
- algorithm
- Decision algorithm. Can be one of ffsplit or lms
(there are also test and 2nodelms, both of which
are mainly for developers and shouldn't be used for production clusters).
For a description of what each algorithm means and how the algorithms
differ, see their individual sections. Default value is
ffsplit.
- tie_breaker
- Can be one of lowest, highest or valid_node_id (a number).
It is used as a fallback if qdevice has to decide between two or
more equal partitions. lowest means the partition with the lowest
node id is chosen. highest means the partition with the highest node id
is chosen. valid_node_id means that the partition containing the node
with the given node id is chosen. Default is lowest.
- connect_timeout
- Timeout used when corosync-qdevice is trying to connect to the
corosync-qnetd host. Default is 0.8 *
quorum.device.timeout.
- force_ip_version
- Can be one of 0|4|6 and forces the software to use the given IP
version. 0 (default value) means IPv6 is preferred and IPv4 should
be used as a fallback.
- keep_active_partition_tie_breaker
- Can be one of on or off and specifies whether the keep active
partition tie breaker should be used. When this option is enabled and a tie
occurs, QNetd prefers the partition containing members of the previously active
(quorate) partition. This behavior is hard-coded in the LMS algorithm, so this
setting affects only the FFSplit algorithm. Default is on.
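The tie_breaker values described above can be illustrated with a short Python sketch. The function name pick_partition and the representation of a partition as a list of node ids are assumptions made for this example; the real decision is taken inside corosync-qnetd.

```python
def pick_partition(partitions, tie_breaker="lowest"):
    """Choose among equal partitions, each given as a list of node ids."""
    if tie_breaker == "lowest":
        # Partition that contains the lowest node id.
        return min(partitions, key=min)
    if tie_breaker == "highest":
        # Partition that contains the highest node id.
        return max(partitions, key=max)
    # Otherwise treat the value as a node id and find its partition.
    node_id = int(tie_breaker)
    for partition in partitions:
        if node_id in partition:
            return partition
    return None  # the given node is in none of the partitions


partitions = [[1, 2], [3, 4]]
print(pick_partition(partitions, "lowest"))   # [1, 2]
print(pick_partition(partitions, "highest"))  # [3, 4]
print(pick_partition(partitions, "2"))        # [1, 2]
```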
Logging configuration is within the logging directive.
corosync-qdevice parses and supports only the debug option. The
logger_subsys sub-directive can also be used if subsys is set
to QDEVICE.
For corosync-qdevice to work correctly, the nodelist
directive has to be used and properly configured. The net model also
requires that the totem.cluster_name option is set.
For model net to work using TLS, it's necessary to
create the NSS database, import the Qnetd CA certificate, and get/distribute a
valid client certificate.
If pcs is used (recommended), the following steps are not needed
because pcs performs them automatically.
corosync-qdevice-net-certutil is the tool to perform the
required actions semi-automatically. Please consult its help output
and its man page. For a first-time configuration it may make sense to start
with the -Q option.
If TLS is not required, just edit the corosync.conf file and set
quorum.device.net.tls to off.
Depending on the configuration of NSS (stored in the nss.config file,
usually in the /etc/crypto-policies/back-ends/ directory), disabled ciphers or
keys that are too short may be rejected. The proper solution is to regenerate
the NSS databases for both the corosync-qnetd and corosync-qdevice
daemons. As a quick workaround it's also possible to set the environment
variable NSS_IGNORE_SYSTEM_POLICY=1 before running the
corosync-qdevice daemon.
When NSS is updated it may also be necessary to upgrade the database to the
new format. There is no consensus on the recommended way, but the following
command seems to work just fine (if the qdevice sysconfdir is set to /etc):
# certutil -N -d /etc/corosync/qdevice/net/nssdb -f /etc/corosync/qdevice/net/nssdb/pwdfile.txt
Algorithms change how
corosync-qnetd provides votes to a given node/partition. Currently
two algorithms are supported.
- ffsplit
- This one makes sense only for clusters with an even number of nodes. It
provides exactly one vote to the partition with the highest number of
active nodes. If there are two exactly equal partitions, it provides its
vote to the partition with the higher score. The score is computed as
(number_of_connected_nodes +
number_of_connected_nodes_with_passed_heuristics -
number_of_connected_nodes_with_failed_heuristics). If the scores are equal,
the vote is provided to the partition with the most clients connected to the
qnetd server. If this number is also equal, then the tie_breaker is used.
It is able to transition its vote if the currently active partition
becomes partitioned and a non-active partition still has at least 50% of
the active nodes. Because of this, a vote is not provided if the qnetd
connection is not active.
To use this algorithm, the number of votes
per node must be 1 (default) and the qdevice number of votes must also be
1. This is achieved by setting the quorum.device.votes key in the
corosync.conf file to 1.
- lms
- Last-man-standing. If a node is the only one left in the cluster that
can see the qnetd server, then a vote is returned to it.
If more than one node can see the qnetd server but some nodes
can't see each other, then the cluster is divided up into 'partitions'
based on their ring_id and this algorithm returns a vote to the
partition with the highest heuristics score (computed the same way as for
the ffsplit algorithm). If there is more than one partition with
equal scores, the vote goes to the largest active partition or, if there is
more than one equally large partition, to the partition that contains the
tie_breaker node (lowest, highest, etc). For LMS to work, the number of qdevice
votes has to be left at the default (so just delete the quorum.device.votes
key from corosync.conf).
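The heuristics score used by both algorithms can be written out as a small Python sketch. The class and function names are hypothetical; only the formula itself comes from the description above, and the later decision steps (connected clients, tie_breaker) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class PartitionStats:
    connected: int          # number_of_connected_nodes
    heuristics_passed: int  # connected nodes whose heuristics passed
    heuristics_failed: int  # connected nodes whose heuristics failed


def heuristics_score(stats):
    """Score = connected + passed - failed, as given in the text."""
    return stats.connected + stats.heuristics_passed - stats.heuristics_failed


def higher_score(a, b):
    """Return the partition with the higher score, or None when the scores tie."""
    score_a, score_b = heuristics_score(a), heuristics_score(b)
    if score_a == score_b:
        return None  # a later step has to decide
    return a if score_a > score_b else b


# Two halves of a four-node cluster (illustrative numbers).
left = PartitionStats(connected=2, heuristics_passed=2, heuristics_failed=0)
right = PartitionStats(connected=2, heuristics_passed=1, heuristics_failed=1)
print(heuristics_score(left), heuristics_score(right))  # prints: 4 2
```

In this example the left half would be preferred, because both of its nodes passed their heuristics while one node in the right half failed.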
Define a qdevice with the net model connecting to qnetd running
on the qnetd.example.org host, using the ffsplit algorithm. Heuristics are set
to sync mode and execute two commands.
quorum {
    provider: corosync_votequorum
    device {
        votes: 1
        model: net
        net {
            tls: on
            host: qnetd.example.org
            algorithm: ffsplit
        }
        heuristics {
            mode: sync
            exec_ping: /bin/ping -q -c 1 "www.example.org"
            exec_test_txt_exists: /usr/bin/test -f /tmp/test.txt
        }
    }
}
corosync-qdevice-tool(8), corosync-qdevice-net-certutil(8), corosync-qnetd(8),
corosync.conf(5), votequorum_qdevice_poll(3)