This Admin’s Guide is written for people who are interested in setting up a RobustIRC network.
In case you have any questions that this document leaves unanswered, please file an issue on GitHub so that we can answer your question and improve the document.
1. Motivation
We started developing RobustIRC because we were sufficiently annoyed by netsplits, yet unsatisfied with the alternatives to IRC. Netsplits can have many causes; the most common unforeseen one is actual network-related trouble (packet loss, link unavailability). A foreseen but unavoidable one is software upgrades: with traditional IRC software, you are incentivized to upgrade the software (IRC server, operating system, kernel) on your IRC servers as rarely as possible, because every upgrade is an intrusive netsplit for the users of your IRC network.
We think this is unacceptable in today’s world. We want to use IRC with all its various clients and bots, but we also want to be able to upgrade any server when we feel like it, and we don’t want to endure netsplits whenever a single TCP connection is flaky.
2. High-level Overview
RobustIRC is a distributed system which exposes standard IRC (as per RFC2812) via the RobustSession protocol, which gracefully handles TCP connection problems.
Contrary to traditional IRC, where multiple servers are linked together to form an IRC network, RobustIRC provides one virtual IRC server that is distributed onto multiple machines.
RobustIRC uses the Raft consensus algorithm to be resilient against server failures. Each message sent by the clients is first persisted using Raft, then processed by each RobustIRC server.
2.1. Failure Modes
This chapter describes how RobustIRC behaves when faced with various failure scenarios. We assume that each RobustIRC node is running on a different machine, ideally a different physical machine, since otherwise a failure of the host machine leads not to a single-node failure but to a multi-node failure.
The number of nodes necessary to establish quorum depends on the number of nodes in the network. As an example, in a 3-node network, you need \(\left\lfloor \frac{n}{2} \right\rfloor+1 = 2\) nodes to reach quorum. In a 5-node network, you need \(\left\lfloor \frac{n}{2} \right\rfloor+1 = 3\) nodes to reach quorum.
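If you want to double-check the quorum size for your own network size, a quick shell one-liner (illustrative only) reproduces the formula above:
$ for n in 3 5 7; do echo "$n nodes -> quorum of $(( n / 2 + 1 ))"; done
3 nodes -> quorum of 2
5 nodes -> quorum of 3
7 nodes -> quorum of 4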
2.1.1. Node Failures
Node failure means the node does not reply to heartbeats anymore. This could happen, for example, because the server process gets killed, because the machine is powered off, because the machine drops off the network, or for many other reasons.
In case the node was a follower, clients which were receiving messages from that node (GetMessages request) need to connect to a different node before they can receive any new messages. How long it takes clients to realize that the node has failed depends on the specific failure mode. In the worst case, clients time out a connection after not receiving any data for 1 minute.
In case the node was the leader, the previous paragraph also applies. Additionally, clients will not be able to send new messages until a new leader is elected. Depending on the network configuration, this typically takes between 5 and 10 seconds, provided that there are enough nodes that a quorum can be established. If there are not, the network will remain frozen/read-only until enough nodes are available.
2.1.2. Network Partition
Network partition means that some link becomes unavailable, for example because a physical cable is removed, and the network is partitioned into multiple parts.
In case the link between the client and any given server within the network becomes unavailable, the client will just switch to a different server transparently. Direct contact to the current leader is not required, since followers will proxy requests for the client when necessary.
In case the link between servers becomes unavailable, the previous section (“Node Failures”) applies to the server which became unavailable. In case the RobustIRC network is structured in such a way that a single link unavailability affects multiple servers, the problem may be more severe: if the servers within each network partition cannot reach quorum, the network is frozen. As an example, in a network with 2 nodes each in 2 datacenters, the network must freeze when the connection between the datacenters becomes unavailable. However, if you had 5 nodes, you would need at least two such link failures at the same time to be unable to reach quorum.
3. Network Performance
RobustIRC’s performance will typically be sufficient for small networks (a couple hundred users). In case you are running a bigger network, this section explains what is relevant with regard to RobustIRC’s performance.
When talking about RobustIRC performance, these are the dimensions of interest:
- messages/s: Messages per second for the entire IRC network, where a message is an IRC protocol message, i.e. including nickname changes (not just PRIVMSG).
- sessions: The number of sessions. Each user who uses the bridge to connect to your RobustIRC network is one session. In traditional IRC networks, one session would be one TCP connection.
- channels: The number of channels is unlikely to be a scalability problem. However, the number of sessions participating in a channel, plus the messages/s within that channel, are an interesting dimension. Each message to a channel needs to be handed out to every participant in that channel. Therefore, performance goes down the more messages go into a channel, and it goes down faster the more sessions participate in each channel.
3.1. Ballpark numbers
If the IRC network you’re running is small, chances are you don’t have a good idea of how many messages/second your network is handling. There are a couple of ways to get estimates:
mibbit has a list of networks, and you can also see their non-secret channels. Looking at the channels of the biggest networks, there are typically about 300 users in each channel. Of course, there are outlier channels with 3000+ users, which typically host warez or offer some other kind of automated content.
freenode cites 320 GiB/month as an estimate for the traffic required to run a server in the freenode network. If you assume an average message size of 100 bytes (the maximum being 512 bytes), this translates to roughly \(\frac{320 \cdot 1024^3}{30 \cdot 24 \cdot 60 \cdot 60 \cdot 100} \approx 1325\) messages/s.
netsplit.de has a list of the biggest networks (excluding some that don’t want to be counted, like freenode). The user counts for the top 10 networks range from 10,000 to 50,000.
We also monitored IRCNet for a week and observed an average of about 2000 messages/s.
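As a quick cross-check of the freenode-derived figure above, the same arithmetic can be reproduced with shell integer arithmetic (assuming a 30-day month and 100 bytes per message):
$ echo $(( 320 * 1024**3 / (30 * 24 * 60 * 60) / 100 ))
1325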
3.2. Performance
TODO
3.3. Latency
TODO
4. Clients
4.1. Bridge
The RobustIRC bridge is a program which bridges (translates) between the RobustIRC protocol and standard IRC, as defined per RFC2812.
There are two places where the bridge can run, each with its own benefits and drawbacks:
- Bridge runs on IRC client machine (recommended): Single-server unavailability and network partitions will be handled transparently by the bridge. See Failure Modes for details. This is the recommended mode, but it requires users to install the bridge on their machine(s).
- Bridge runs on network servers: Typically, RobustIRC networks will provide a bridge. The recommended hostname is legacy-irc.<networkname>, e.g. legacy-irc.robustirc.net. The advantage is that users can directly connect to your network, but the bridges are single points of failure: if a bridge server goes down, the users connected to it will be disconnected from the RobustIRC network.
4.1.1. SOCKS5
When running a bridge on the same machine as your IRC client, you’d run it using:
robustirc-bridge -socks=localhost:1080
Then, configure localhost:1080 as the SOCKS5 proxy address to use for connecting to a network in your IRC client:
/proxy add bridge socks5 localhost 1080
/server add robustirc robustirc.net
/set irc.server.robustirc.proxy bridge
/connect robustirc
4.1.2. IRC proxy
In case your IRC client either does not support SOCKS5 at all or does not support per-network proxy configuration (e.g. irssi), you can use the bridge in IRC proxy mode. The downside is that you need to run one bridge instance per RobustIRC network you want to connect to.
After starting the bridge with robustirc-bridge -network=<network>, you can configure localhost:6667 as the IRC server in your client:
robustirc-bridge -network=robustirc.net
/network add robustirc
/server add -auto -network robustirc localhost 6667
/connect robustirc
Depending on your network connection, it might make sense to disable lag checking so that longer periods of network unavailability can be survived without forcing a disconnect (see section 5.9 of the irssi manual):
/set lag_check_time 0
5. Server Deployment
5.1. Number and Location
For running a RobustIRC network, you need at least 3 different servers. While technically you can run 3 RobustIRC processes on the same server, that doesn’t make a lot of sense: the point of RobustIRC is to be resilient to certain failures, and when you put all your RobustIRC processes into the same failure domain, you don’t achieve that.
The ideal configuration for RobustIRC is to have each server in an entirely separate failure domain, i.e. on a different machine, in a different rack, with different power, with different network connectivity, in a different physical datacenter. For traditional hosting this typically means choosing different hosting providers; with cloud providers, it means running in different availability zones.
That said, pay attention to the network latency between your failure domains. See Latency for how to determine the network latency and what it means.
With regards to the number of servers, a network of 3 servers continues to work when 1 server is unreachable. A network of 5 servers continues to work when 2 servers are unreachable, and so on. In general, a network of \(n\) servers continues to work when \(n - (\left\lfloor \frac{n}{2} \right\rfloor+1)\) servers are unreachable.
Therefore, always use an odd number of servers in your network. An even number does not increase reliability; it only increases message commit latency due to the larger quorum size.
5.2. Preparation
- SSL for servers: You need a valid SSL certificate for every server you want to use in your network. This can be a single wildcard certificate, or a certificate with subject alternative names.
Example output for a correctly installed SSL certificate:
$ echo | openssl s_client -connect alp.robustirc.net:60667 | grep 'Verify return code'
Verify return code: 0 (ok)
- DNS entry for servers: Each server must have a public DNS entry, i.e. an AAAA record and preferably also an A record. You will need to make each node aware of its own public DNS entry (e.g. “dock0.robustirc.net”) by specifying it in the -peer_addr flag when starting RobustIRC. This serves two purposes: it provides a unique identifier for Raft to identify the node, and at the same time describes where to connect to.
Example output for correctly set up DNS records:
$ host alp.robustirc.net
alp.robustirc.net has address 46.20.246.99
alp.robustirc.net has IPv6 address 2a02:2528:503:2::2
- DNS for the network: You need to create an SRV DNS record pointing to each host/port on which RobustIRC is running. This record (e.g. “robustirc.net”) will be used by clients to connect to your network (see the zone-file sketch below).
Example output for a correctly set up DNS record:
$ dig +short -t SRV _robustirc._tcp.robustirc.net
0 0 60667 dock0.robustirc.net.
0 0 60667 alp.robustirc.net.
0 0 60667 ridcully.robustirc.net.
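If you maintain your zone in a BIND-style zone file, the SRV records might look roughly like the following sketch (a hypothetical snippet using the names and port from the dig output above; adapt them to your own network):
; Hypothetical SRV records for the robustirc.net zone.
_robustirc._tcp.robustirc.net.    IN SRV 0 0 60667 dock0.robustirc.net.
_robustirc._tcp.robustirc.net.    IN SRV 0 0 60667 alp.robustirc.net.
_robustirc._tcp.robustirc.net.    IN SRV 0 0 60667 ridcully.robustirc.net.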
5.3. Status page
RobustIRC provides a status page that you can access with your web browser.
Simply connect to the host/port on which the server is listening (see the -peer_addr flag) and use “robustirc” as the user with -network_password as the password when asked to authenticate.
As an example, assume you’re running a node with -peer_addr=alp.robustirc.net:60667 and -network_password=topsecret. The URL for the status page is https://robustirc:[email protected]:60667/
The same port is used for the status page, communication between the nodes and communication with the clients.
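For a quick check from the command line, the same URL and credentials can be used with curl; this sketch uses the example values above and should print the status page HTML:
$ curl -u robustirc:topsecret https://alp.robustirc.net:60667/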
5.4. Bootstrapping
When bringing up your network for the first time, you need to run each node
with a special command line parameter: the first node you start needs the
-singlenode
flag, and all other nodes need the -join=<address>
flag, where
<address>
is the -peer_addr
value of the first node.
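As an illustrative sketch (hostnames are placeholders; the TLS, network name and password flags shown in the Docker section below are elided as “…”):
# First node only, for the initial bootstrap:
robustirc -singlenode -peer_addr=alp.robustirc.net:60667 …
# Every other node, with -join pointing at the first node's -peer_addr:
robustirc -join=alp.robustirc.net:60667 -peer_addr=dock0.robustirc.net:60667 …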
Bootstrapping is finished once the network has converged, meaning all Status pages display a node state of either “Follower” or “Leader” (as opposed to “Candidate” or “<nil>”).
Once bootstrapping is finished, be sure to remove the -singlenode and -join flags! Afterwards, see Updates for how to use robustirc-rollingrestart to restart each node once. Remember to use systemctl daemon-reload to make changes to service files effective if you’re using systemd. Neglecting to remove the flags could lead to data loss or split-brain scenarios (for -singlenode) or to unavailability after the list of peers changes (for -join).
Again, NEVER use the -singlenode flag after the initial bootstrapping.
5.5. Installing & Running
We strongly recommend using Docker since it makes running RobustIRC much easier.
5.5.1. Docker
You can use the official docker container “robustirc/robustirc” that we provide.
We run two of our servers on CoreOS, which provides quite a restricted environment, so we describe that setup in the hope that you can easily adapt it.
In the example systemd service file below, /media/persistent is the path on which we have mounted our persistent storage. We use it to load the TLS key/certificate and to store the RobustIRC state. Note that the TLS key/certificate need to be readable by uid 99, and the state directory needs to be writable by uid 99. This uid corresponds to the unprivileged nobody user used in RobustIRC’s Dockerfile.
Furthermore, the node listens on the public port 60667, which is reminiscent of the conventional IRC port 6667 but lies in the dynamic port range. The node’s public address is provided to RobustIRC via -peer_addr; this is necessary because Docker uses a private network within the container.
[Unit]
Description=RobustIRC
After=docker.service
Requires=docker.service

[Service]
# So that the robustirc-updater can trigger /quit to restart the node.
Restart=always
StartLimitInterval=0
# Always pull the latest version (bleeding edge).
ExecStartPre=/usr/bin/docker pull robustirc/robustirc:latest
ExecStart=/usr/bin/docker run \
    -v /media/persistent:/media/persistent:ro \
    -v /media/persistent/robustirc:/var/lib/robustirc \
    -p :60667:8443 \
    robustirc/robustirc:latest \
    -tls_cert_path=/media/persistent/ssl/combined.crt \
    -tls_key_path=/media/persistent/ssl/robustirc.net.startssl.key \
    -network_password=<secret> \
    -network_name=robustirc.net \
    -peer_addr=dock0.robustirc.net:60667

[Install]
WantedBy=multi-user.target
# So that a stop/start of docker will also start RobustIRC again.
WantedBy=docker.service
5.5.2. From source
After installing the Go compiler from your distribution’s packages, run:
$ export GOPATH=~/gocode
$ go get github.com/robustirc/robustirc/...
You’ll end up with all RobustIRC binaries installed in ~/gocode/bin/.
5.6. Updates
The Docker container we provide always has a “stable” tag pointing at the most recently released version that was tested for at least 7 days on the robustirc.net network. We recommend you follow the “stable” tag, but if you prefer, you can directly follow the “latest” tag instead.
In order to update to a newer version, all you need to do is run the newer RobustIRC binary. You will never need to manually migrate the on-disk data. This allows you to do automatic or semi-automatic updates: you could use an ExecStartPre directive to automatically pull the new Docker container as outlined in the Docker section above. If you choose not to use Docker, the equivalent action is to install the new RobustIRC binary on all nodes.
5.6.1. Rolling restart
To make the switch to the new RobustIRC binary easier, there is a tool called robustirc-rollingrestart. It quits each node, expecting the node to be restarted automatically and to pick up the target binary version. Network health is taken into account before quitting a node, so if the update is unsuccessful for whatever reason, you will lose at most one node and your network as a whole will keep working. Updates by robustirc-rollingrestart are unobtrusive; users cannot tell that the update even happened.
$ robustirc-rollingrestart -binary_path=$PWD/robustirc -network=robustirc.net -network_password=secret
21:38:51 Checking network health
21:38:51 Restarting "robustirc.net" nodes until their binary hash is b3d57b4153ee3dca93b3c8d8f787eccb
21:38:51 Killing node "ridcully.robustirc.net:60667"
21:38:51 Post https://ridcully.robustirc.net:60667/quit: EOF
21:38:52 Get https://ridcully.robustirc.net:60667/: dial tcp 78.46.97.235:60667: connection refused
[…]
21:39:17 Node "ridcully.robustirc.net:60667" has not yet applied all messages it saw before, waiting (got 282270, want ≥ 285055)
21:39:27 Node "ridcully.robustirc.net:60667" was upgraded and is healthy again
21:39:27 Skipping "alp.robustirc.net:60667" which is already running the requested version
21:39:27 Killing node "dock0.robustirc.net:60667"
21:39:27 Quitting "dock0.robustirc.net:60667": Post https://dock0.robustirc.net:60667/quit: EOF
[…]
21:40:34 Network is not healthy: Server "dock0.robustirc.net:60667" was last contacted by the leader at 0001-01-01 00:00:00 +0000 UTC, which is over a second ago
21:40:35 Network became healthy.
21:40:35 All done!
5.6.2. Canary mode
In order to make sure that a new version of RobustIRC processes the on-disk input messages the same way as the version you’re currently running, you can use canary mode. Canary mode connects to the network, downloads all input messages and their corresponding output, re-processes the messages locally and then generates a report with all the differences. It never joins the network, so it doesn’t modify any state.
We expect canary mode to only be used by developers, but include it in this document for completeness.
5.7. Adding nodes
To add a new node to the network, for example in preparation for decommissioning another server, start it with the -join flag set to the -peer_addr value of any healthy server.
As an example, if your network consists of these three healthy nodes:
- a node running with -peer_addr=alp.robustirc.net:60667
- a node running with -peer_addr=dock0.robustirc.net:60667
- a node running with -peer_addr=ridcully.robustirc.net:60667
…and you want to introduce a node running with -peer_addr=libri.robustirc.net:60667, you can start that node with -join=alp.robustirc.net:60667.
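A sketch of the corresponding invocation for the new node (the remaining flags, e.g. TLS certificate, network name and password, are elided as “…”):
robustirc -join=alp.robustirc.net:60667 -peer_addr=libri.robustirc.net:60667 …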
As explained in Bootstrapping, only specify -join once, then remove the flag again!
5.8. Removing nodes
To remove an existing node from the network, for example before decommissioning a server, use the robustirc-removepeer tool, e.g.:
robustirc-removepeer \
    -network=robustirc.net \
    -network_password=secret \
    -remove_peer=alp.robustirc.net:60667
The robustirc-removepeer tool has safety checks in place to make sure that the network can still reach quorum after removing the node you ask it to remove.
5.9. Upgrading to the Raft protocol version 3
Since RobustIRC was released, the hashicorp/raft package which RobustIRC is using has introduced a new protocol version.
To upgrade your network from Raft protocol version 1 (the default) to version 3 (the new version), use the following steps:
1. Take down one node and bring it back up again with the -raft_protocol_version=2 flag (see the sketch below). Repeat until all nodes are on version 2.
2. Repeat step 1 with -raft_protocol_version=3.
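The per-node restarts might look roughly like this sketch (one node shown; the remaining flags are elided as “…”):
# Round 1: restart each node with Raft protocol version 2.
robustirc -raft_protocol_version=2 -peer_addr=alp.robustirc.net:60667 …
# Round 2: once every node runs version 2, restart each node with version 3.
robustirc -raft_protocol_version=3 -peer_addr=alp.robustirc.net:60667 …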
5.10. Healing partitions
In case your network becomes partitioned, you have two options:
- You do nothing and just wait until the partition is over. This strategy is typically only suitable for partition root causes which are well understood and short-lived in nature, e.g. planned maintenance on a network switch.
- You first remove the node(s) affected by the partition from the network and then replace them with healthy nodes. See Removing nodes and Adding nodes, respectively.
Note that it is only safe to remove and add nodes as long as the network still has a Raft leader, i.e. as long as the majority of nodes are healthy. The robustirc-removepeer tool described in section Removing nodes automatically checks that for you.
In particular, DO NOT USE the -singlenode and -join flags to introduce new nodes to the network, or you might end up in a split-brain scenario.
5.11. Monitoring
If you want to provide a stable network, you should strive to keep all nodes of the network healthy. In technical terms, you want to keep your capacity at least at n+1, where n is the number of nodes you absolutely need. In a 3-node example, while the network still functions with only 2 healthy nodes, it is advisable to always keep all 3 nodes healthy; otherwise, a single hardware failure will make your network freeze.
In order to detect unhealthy nodes, we recommend using Prometheus. Please refer to the Prometheus documentation for details.
RobustIRC exports Prometheus metrics by default and we provide example config files:
- contrib/prometheus/prometheus.conf is an example Prometheus configuration file.
- contrib/prometheus/robustirc_prometheus.rules is an example Prometheus rules file.
- contrib/prometheus/alertmanager.conf is an example alertmanager configuration file.
For the robustirc.net network, we use Prometheus in combination with Pushover to get alerted about problems and we try to fix them swiftly.
As for graphs, we provide an example dashboard JSON file for Grafana.
6. Network configuration
Since the network configuration can influence the IRC output messages (e.g. whether the OPER command succeeds), it is persisted via Raft like any other RobustIRC message.
We use TOML as configuration language.
To edit the network configuration, use the robustirc-editconfig command-line tool, which will spawn an $EDITOR with the current config and update the config once you leave the editor. You’ll need to specify the -network flag and the -network_password flag:
$ robustirc-editconfig -network=robustirc.net -network_password=secret
6.1. Example (annotated)
See the next subsections for detailed descriptions of each of these parameters.
# Sessions without activity are expired after half an hour.
SessionExpiration = "30m0s"
# Messages are (eventually) throttled to 2 messages/s.
PostMessageCooloff = "500ms"

[IRC]

[[IRC.Operators]]
Name = "foo"
Password = "bar"

[[IRC.Services]]
Password = "mypass"
6.2. General parameters
- SessionExpiration: Sessions expire after no activity for this duration. Defaults to 30 minutes. Note that the bridge sends a PING message (which counts as activity) after 1 minute of inactivity. With the default value of 30 minutes, a network outage lasting less than 30 minutes can be recovered from.
- PostMessageCooloff: RobustIRC uses throttling that ramps up exponentially from 1ms to the specified duration: the enforced delays between consecutive messages are 1ms, 2ms, 4ms, and so on, until the delay reaches PostMessageCooloff (defaulting to 500ms). The throttling starts over at 1ms once there has been no activity for PostMessageCooloff. This approach was chosen because it does not throttle actual users too aggressively, but still becomes effective quickly when attackers start flooding. Set PostMessageCooloff to 0 to disable throttling entirely (not recommended!).
6.3. IRC parameters
6.3.1. Operators
- Name, Password: These two parameters specify the name and password that must be given in the OPER command to become an IRC operator (see the example below). Typically, the name correlates with the IRC nickname of the person who should be granted IRC operator privileges.
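With the annotated example configuration above (Name “foo”, Password “bar”), an operator would authenticate from their IRC client using the standard OPER command:
/oper foo bar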
6.3.2. Services
- Password: Specifies a password with which you can link IRC services (e.g. anope) to RobustIRC. Note that this is not full server-to-server support and will not be extended to become that. Only the bare minimum server-to-server protocol was implemented to get services working.
7. Raft Configuration
7.1. Timeouts
If you run your network really hot and notice that leadership is often lost, you need to increase these timeouts to allow for more slack. Otherwise, you can ignore this entire section.
Timeouts must fulfill this relation:
5ms < LeaderLeaseTimeout ≤ HeartbeatTimeout ≤ ElectionTimeout
default: 500ms (LeaderLeaseTimeout) ≤ 1000ms (HeartbeatTimeout) ≤ 1000ms (ElectionTimeout)
- LeaderLeaseTimeout: A leader steps down (i.e. does not consider itself the leader anymore) after it was unable to contact a quorum of nodes for LeaderLeaseTimeout.
- HeartbeatTimeout: Followers enter the candidate state once they have not heard from a leader within HeartbeatTimeout. The leader delays for a random value within [HeartbeatTimeout/10, HeartbeatTimeout/10 * 2] between each ping to its followers. Therefore, at least 5 (but possibly up to 10) heartbeats must be missed before HeartbeatTimeout is reached.
- ElectionTimeout: Candidates restart the voting process after ElectionTimeout.
- CommitTimeout: See hashicorp/raft issue #28 for details.
As a rule of thumb, figure out the latency of the slowest network link between your nodes, e.g. by using ping(8). Then, set HeartbeatTimeout to 10 times that latency so that brief network latency spikes are not a problem. Set all the other timeouts to the same value.
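A minimal sketch of that rule of thumb, assuming dock0.robustirc.net is the node behind the slowest link: measure the latency with ping, and if the average round-trip time is, say, 50ms, set HeartbeatTimeout to roughly 10 × 50ms = 500ms (and the other timeouts to the same value).
$ ping -c 10 dock0.robustirc.net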
7.2. MaxAppendEntries
TODO
7.3. SnapshotThreshold
TODO
7.4. TrailingLogs
TrailingLogs is the number of Raft log entries which are kept after taking a snapshot. If you have enough log entries to cover a brief node failure (e.g. a flaky network), Raft does not need to send an entire snapshot over the network, so recovery of the failed node may be quicker.
In case you configure this parameter too low, recovery after a node failure may consume more bandwidth and may take longer.
In case you configure this parameter too high, the disk usage of RobustIRC will be higher, as log compactions will occur less frequently.
As a rule of thumb: look at your network’s messages/second, multiply that by the duration of a typical outage (e.g. 5 minutes), and use the result as your TrailingLogs value.
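For example, taking the roughly 2000 messages/s observed on IRCNet in the Ballpark numbers section and a typical outage of 5 minutes, a reasonable TrailingLogs value would be on the order of:
$ echo $(( 2000 * 5 * 60 ))
600000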
8. Common Issues
8.1. Raft leader flapping
If leadership flaps, consult the Raft Configuration section to verify that your timeouts are set appropriately for the network situation at hand.
In case your timeouts are configured correctly, it’s worth checking whether the leadership flaps are correlated in time with periods of high CPU or disk utilization. In that case, if possible, try increasing the available resources. For example, allow the VM in which RobustIRC runs to use a second CPU core in case only one was allocated.