Last time when i used MySQL 5.6 Database Replication from Master to Slave kind of things was pretty awesome, straight-forward, reliable and work as it is. I was using it for my first priority on backup database, and using mysqldump from the slave db as second backup priority, even though still doesn’t solve the issue of high available server application because it required a down time to switch between master to slave. I can leave it this issue for just for now, but I’m facing another issue when the application requires customization on the replication type of each tables.
For example we have to two databases need to be replicate, database A and B, database A has several tables to do the one way replication to database B, database B also has several tables need to the the replication one way to database A, and both databases has the remaining tables that need to be replicate bi-directional way. Database A as central server that sit somewhere in the cloud and it has public ip and can be access from every where; database B sit in local shop some where, it has limited and unreliable internet connection, and it doesn’t have any public ip. Database A as a central database has a big capacity CPU and Memory, it handles most of the transaction data, but once there is an record update relate to table that belong to database B, it will push only for the particular record. In this case the replication require functionality to do the filtering record and also communication some how with http or tcp pulling type because the database can be inaccessible from outside.
One of the solution out there is by using the SymmetricDS (https://www.symmetricds.org) replication framework, is an open source software for database synchronization with support for multi master replication, filtered synchronization, and transformation. It runs on java with minimal version 7, support for multi variant databases like MySQL, MongoDB, Microsoft SQL Server, Oracle, SQLite, H2, etc. SymmetricDS is using JDBC to communicate with external database.
The symmetricDS treat our application as a node, each node has their own presentation of their database, each node has unique id called External Id and it group by how it will replicate the data via http pull or http push. And this software will create their system tables which defined by put prefix sym_ inside our database, is make our database became messy, one thing that i don’t like about SymmetricDS.
The way of SymmetricDS works is by allowing the user to select which user’s table need to be replicate, and define how the table replication works. The SymmetricDS will scan the selected tables and trigger the replication when there is a data changed inside, the replication data it self has a previous and new value of the particular record and store as CSV Format. Until now, I don’t fully understand the detail about how the SymmetricDS validate and scan each of the table, but I have found it put all the change history , create a hash value and store into their system table.
After SymmetricDS capture the changed record, it will define the destination node based on the route table that we (user) defined at the beginning, in this case from database A to B, so we can called route a_2_b with source node A and target node B. SymmetricDS extract the data into CSV format and transform the record data when is need, and create as a batch data file before it transfer into the destination node by http transport. SymmetricDS is using Restful framework to do the communication with other node, so we don’t worry about the security any more for the data transportation by putting the ssl on the top of the communication, and after the batch message is received by destination node, it will load into their database.
The good thing about SymmetricDS because it run with Java, so it can run for any OS like Linux or Windows, it also has a small size of application sit in the memory. One of the example implementation by using this kind of technology is Point Of Sale / Cashier Machine that require to synchronize the data into the central database server.