Unlike many replication engines, Tungsten Replicator can run multiple replication services concurrently. There is a central management interface that allows you to start new replication services without disturbing services that are already running. Each replication service also has its own management interface so that you can put the loaded service online, offline, etc. without disturbing other replication work. As Tungsten is written in Java, the management interfaces are based on JMX, a standard administrative interface for Java applications.
Here is a simple diagram that shows a Tungsten Replicator with two replication services named fra and nyc that replicate from separate DBMS masters in Frankfurt and NYC into a single slave in San Francisco. You can immediately see the power of replication services--a single Tungsten Replicator process can simultaneously replicate between several locations. Replication services are an important building block for the type of complex setups that Giuseppe Maxia discussed in his blog article on Replication Topologies.
Users who are handy with Java can write their own programs to manipulate the JMX interfaces directly. If not, there is the trepctl utility, which is supplied with the Tungsten Replicator and works off the command line.
If the Tungsten Replicator architecture reminds you of a Java application server, you are absolutely right. Java VMs have a relatively large resource footprint compared to ordinary C programs, so it is typically more efficient to put multiple applications in a single VM rather than running a lot of individual Java processes. Tungsten replication services follow the same design pattern, except that instead of serving web pages they replicate database transactions.
Let's now look a little more deeply at how Tungsten Replicator organizes replication services. Each replication service runs a single pipeline, which is Tungsten parlance for a configurable message flow. (For more on pipelines, read here.) When the service starts, it loads an instance of a Java class called OpenReplicatorManager that handles the state machine for replication (online, offline, etc.) and provides the management interfaces for the service. The OpenReplicatorManager instance in turn depends on a number of external resources from the file system and DBMS.
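To make the idea concrete, here is a minimal sketch in Python of one process hosting several services, each with its own small state machine. It is not Tungsten code: the class names Replicator and ServiceManager are hypothetical, and the two states shown are a simplification of the "online, offline, etc." states mentioned above.

```python
from enum import Enum


class State(Enum):
    """Simplified service states; the real state machine has more."""
    OFFLINE = "OFFLINE"
    ONLINE = "ONLINE"


class ServiceManager:
    """Hypothetical per-service manager holding that service's state."""

    def __init__(self, name):
        self.name = name
        self.state = State.OFFLINE

    def online(self):
        if self.state is State.ONLINE:
            raise RuntimeError(f"service {self.name} is already online")
        self.state = State.ONLINE

    def offline(self):
        if self.state is State.OFFLINE:
            raise RuntimeError(f"service {self.name} is already offline")
        self.state = State.OFFLINE


class Replicator:
    """Hypothetical central manager: one process, many services."""

    def __init__(self):
        self._services = {}

    def start_service(self, name):
        # Adding a service does not touch services that already exist.
        if name in self._services:
            raise ValueError(f"service {name} already exists")
        manager = ServiceManager(name)
        self._services[name] = manager
        return manager

    def status(self):
        return {name: m.state.value for name, m in self._services.items()}


replicator = Replicator()
fra = replicator.start_service("fra")
nyc = replicator.start_service("nyc")
fra.online()
nyc.online()
fra.offline()  # only fra changes state; nyc keeps replicating
print(replicator.status())  # {'fra': 'OFFLINE', 'nyc': 'ONLINE'}
```

The point of the sketch is the isolation: each service owns its state, so taking fra offline leaves nyc untouched.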
Here is another diagram showing how Tungsten Replicator organizes all of the various parts for services. Services need a configuration file for the pipeline, as well as various bits of disk space to store transaction logs and replication metadata. The big challenge is to ensure things do not accidentally collide.
This layout seems a bit complex at first but is reasonably simple once you get used to it. Let's start with service configuration using our fra service as an example.
Service configuration files are stored in the tungsten-replicator/conf directory. There are up to two files for each service. The static-fra.properties file defines all properties of the service, including the pipeline organization and settings like the replication role or master address that may change during operation. The dynamic-fra.properties file contains overrides to selected properties. For instance, if you switch the replication role from slave to master as part of a failover operation, the new role goes in the dynamic-fra.properties file. Tungsten Replicator reads the static file first, then applies the overrides when it starts the service.
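The static-then-dynamic lookup can be sketched in a few lines of Python. This is an illustration only: the property keys (example.role, example.master.address) are made-up placeholders rather than real Tungsten property names, and the parser handles only plain key=value lines, not the full Java properties format.

```python
def parse_properties(text):
    """Parse plain key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines without '=' in this simplified parser
        props[key.strip()] = value.strip()
    return props


def effective_properties(static_text, dynamic_text=""):
    """Read the static file first, then let dynamic values override it."""
    merged = parse_properties(static_text)
    merged.update(parse_properties(dynamic_text))
    return merged


# Stand-ins for static-fra.properties and dynamic-fra.properties.
STATIC_FRA = """
# hypothetical keys, for illustration only
example.role=slave
example.master.address=frankfurt-host
"""

DYNAMIC_FRA = """
example.role=master
"""

config = effective_properties(STATIC_FRA, DYNAMIC_FRA)
print(config["example.role"])            # master
print(config["example.master.address"])  # frankfurt-host
```

Because the dynamic file only holds the properties that changed, the static file stays as the record of how the service was originally defined.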
Next, we have Tungsten transaction logs, also known as the Transaction History Log (THL). This is a list of all transactions to be replicated along with metadata like global transaction IDs and shard IDs. THL files for each service are normally stored in the logs directory at the same level as the tungsten release directory itself. There is a separate directory for each service, for example logs/fra.
Next we have Tungsten relay logs. These are downloaded binlogs from a MySQL master DBMS from which the replication service creates the Tungsten transaction logs. Not every replication service uses these. They are required when the MySQL master is on another host, or the binlogs are on an NFS-mounted file system, which Tungsten does not parse very efficiently yet. Tungsten relay logs use the same pattern as the THL--everything is stored under relay-logs with a separate subdirectory for each service, for example relay-logs/fra.
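As a rough sketch of the file layout described so far, the Python function below derives the per-service paths from the release directory and the service name. The function name service_layout and the example path /opt/replicator/tungsten are hypothetical, and placing relay-logs next to logs is an assumption made for the example.

```python
from pathlib import PurePosixPath


def service_layout(release_dir, service):
    """Return the per-service file locations for one replication service."""
    release = PurePosixPath(release_dir)
    base = release.parent  # logs live beside the release directory
    conf = release / "tungsten-replicator" / "conf"
    return {
        "static_config": conf / f"static-{service}.properties",
        "dynamic_config": conf / f"dynamic-{service}.properties",
        "thl_dir": base / "logs" / service,
        # Assumption: relay-logs sits at the same level as logs.
        "relay_log_dir": base / "relay-logs" / service,
    }


fra = service_layout("/opt/replicator/tungsten", "fra")
nyc = service_layout("/opt/replicator/tungsten", "nyc")

for label, path in fra.items():
    print(f"{label}: {path}")

# Two services in the same replicator never share a file or directory.
print(set(fra.values()).isdisjoint(nyc.values()))  # True
```

Keying every path on the service name is what keeps services from colliding.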
Finally, there is metadata in the DBMS itself. Each replication service has a database that it uses to store restart points for replication (table trep_commit_seqno) as well as heartbeats and consistency checks (tables heartbeat and consistency, respectively). The name of this database is tungsten_
Setting up services is difficult to do manually, so Tungsten Replicator 2.0 has a program named configure-service that defines new replication services and removes old ones by deleting all traces including the database catalogs. You can find out all about installation and starting services by looking at the Tungsten Replicator 2.0 Installation and Configuration Guide, which is located here.
Services have been part of Tungsten Replicator for a while, but we have only recently begun to talk about them more widely as part of the release of Tungsten Replicator 2.0.0 in February 2011, especially as we start to do more work with multi-master topologies. One of the comments we get is that replication services make Tungsten seem complicated and therefore harder to use, especially compared with MySQL replication, which is relatively easy to set up. That's a fair criticism. Tungsten Replicator is really a very configurable toolkit for replication and does far more than MySQL replication or just about any other open source replicator for that matter. Like most toolkits, the trade-off for power is complexity.
We are therefore working on automating as much of the configuration as possible, so that you can set up even relatively complex topologies with just a couple of commands. You'll see more of this as we make additional replicator releases (version 2.0.1 will be out shortly) and push features fully into open source. Meanwhile, if you have comments on Tungsten 2.0.0 please feel free to post them back to us.