Chapter 6. Replication in MySQL

Table of Contents

6.1. Introduction to Replication
6.2. Replication Implementation Overview
6.3. Row-Based Replication
6.4. Replication Implementation Details
6.4.1. Replication Master Thread States
6.4.2. Replication Slave I/O Thread States
6.4.3. Replication Slave SQL Thread States
6.4.4. Replication Relay and Status Files
6.5. How to Set Up Replication
6.6. Replication Compatibility Between MySQL Versions
6.7. Upgrading a Replication Setup
6.7.1. Upgrading Replication to 5.0
6.8. Replication Features and Known Problems
6.9. Replication Startup Options
6.10. How Servers Evaluate Replication Rules
6.11. Replication FAQ
6.12. Comparison of Statement-Based Versus Row-Based Replication
6.12.1. Troubleshooting Replication
6.12.2. How to Report Replication Bugs or Problems
6.12.3. Auto-Increment in Multi-Master Replication

This chapter describes the various replication features provided by MySQL. It introduces replication concepts, shows how to set up replication servers, and serves as a reference to the available replication options. It also provides a list of frequently asked questions (with answers), and troubleshooting advice for solving replication problems.

For a description of the syntax of replication-related SQL statements, see Section 13.6, “Replication Statements”.

We suggest that you visit our Web site at http://www.mysql.com frequently as well as check for revisions to this chapter. Replication is constantly being improved, and we update the manual regularly with the most current information.

6.1. Introduction to Replication

MySQL features support for one-way, asynchronous replication, in which one server acts as the master, while one or more other servers act as slaves. (This is in contrast to the synchronous replication which is a characteristic of MySQL Cluster — see Chapter 17, MySQL Cluster.) The master server writes updates to its binary log files, and maintains an index of the files to keep track of log rotation. These logs serve as records of updates to be sent to any slave servers. When a slave connects to the master, it informs the master of the position up to which the slave read in the logs at the last successful update. The slave receives any updates that have taken place since that time, and then blocks and waits for the master to notify it of new updates.

A slave server can itself serve as a master if you want to set up chained replication servers.

Note that when you are using replication, all updates to the tables that are replicated should be performed on the master server. Otherwise, you must always be careful to avoid conflicts between updates that users make to tables on the master and updates that they make to tables on the slave. When performing updates on the slave's side, you should also keep in mind that these might work with statement-based replication, but not with row-based replication. Consider the following scenario, where a record is inserted on the slave, followed by a statement on the master's side that should empty the table:

slave> INSERT INTO tbl VALUES (1);
master> DELETE FROM tbl;

The master doesn't know about the INSERT operation on the slave server, but with statement-based replication, tbl will still be empty on both master and slave as soon as the slave catches up with the master, because the slave simply echoes the master's DELETE statement. With row-based replication, however, the master will write to its binlog the rows that were deleted in the master's table, and because it doesn't know about the slave's operations, this will not include the record inserted on the slave. As a consequence, replication will break.
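The difference can be illustrated with a toy model. The following Python sketch is not how the server works internally: tables are modeled as plain sets of row values, and the function names (replay, apply_statement_delete_all, apply_row_deletes) are hypothetical. It only shows why replaying the statement empties the slave's table while replaying the master's deleted rows leaves the slave's extra row behind.

```python
# Toy model only: a "table" is a set of row values, not a real MySQL table.

def apply_statement_delete_all(table):
    """Statement-based replay of DELETE FROM tbl: remove whatever rows exist locally."""
    table.clear()


def apply_row_deletes(table, deleted_rows):
    """Row-based replay: remove only the rows the master recorded as deleted."""
    for row in deleted_rows:
        table.discard(row)


def replay(binlog_format, master_rows, slave_extra_rows):
    """Return (master_table, slave_table) after the scenario has been replayed."""
    master = set(master_rows)
    slave = set(master_rows) | set(slave_extra_rows)  # slave> INSERT INTO tbl ...
    deleted_on_master = set(master)                   # rows the master really removes
    master.clear()                                    # master> DELETE FROM tbl;
    if binlog_format == "statement":
        apply_statement_delete_all(slave)
    else:
        apply_row_deletes(slave, deleted_on_master)
    return master, slave


print(replay("statement", [10, 20], [1]))  # (set(), set())
print(replay("row", [10, 20], [1]))        # (set(), {1})
```

In the row-based case the slave still holds the row inserted locally, so the two copies no longer match.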

For information about row-based replication (RBR), see Section 6.3, “Row-Based Replication”.

One-way replication has benefits for robustness, speed, and system administration:

  • Robustness is increased with a master/slave setup. In the event of problems with the master, you can switch to the slave as a backup.

  • Better response time for clients can be achieved by splitting the load for processing client queries between the master and slave servers. SELECT queries may be sent to the slave to reduce the query processing load of the master. Statements that modify data should still be sent to the master so that the master and slave do not get out of sync. This load-balancing strategy is effective if non-updating queries dominate, but that is the normal case.

  • Another benefit of using replication is that you can perform backups using a slave server without disturbing the master. The master continues to process updates while the backup is being made. See Section 5.9.1, “Database Backups”.
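The load-splitting idea described in the preceding list can be sketched in application code. The following Python function is a minimal illustration, not part of MySQL: the name route_statement is hypothetical, and the rule is deliberately naive (it looks only at the first keyword). A SELECT that takes locks or calls routines that modify data would still need to go to the master, which this sketch does not detect.

```python
def route_statement(sql):
    """Return 'slave' for plain SELECT statements and 'master' for everything else."""
    words = sql.split(None, 1)
    first_word = words[0].upper() if words else ""
    return "slave" if first_word == "SELECT" else "master"


print(route_statement("SELECT * FROM tbl"))           # slave
print(route_statement("INSERT INTO tbl VALUES (1)"))  # master
```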

6.2. Replication Implementation Overview

MySQL replication is based on the master server keeping track of all changes to your databases (updates, deletes, and so on) in the binary logs. Therefore, to use replication, you must enable binary logging on the master server. See Section 5.11.3, “The Binary Log”.

Each slave server receives from the master the saved updates that the master has recorded in its binary log, so that the slave can execute the same updates on its copy of the data.

It is extremely important to realize that the binary log is simply a record starting from the fixed point in time at which you enable binary logging. Any slaves that you set up need copies of the databases on your master as they existed at the moment you enabled binary logging on the master. If you start your slaves with databases that are not in the same state as those on the master when the binary log was started, your slaves are quite likely to fail.

One way to copy the master's data to the slave is to use the LOAD DATA FROM MASTER statement. Be aware that LOAD DATA FROM MASTER currently works only if all the tables on the master use the MyISAM storage engine. In addition, this statement acquires a global read lock, so no updates on the master are possible while the tables are being transferred to the slave. When we implement lock-free hot table backup, this global read lock will no longer be necessary.

Due to these limitations, we recommend that at this point you use LOAD DATA FROM MASTER only if the dataset on the master is relatively small, or if a prolonged read lock on the master is acceptable. Although the actual speed of LOAD DATA FROM MASTER may vary from system to system, a good rule of thumb for how long it takes is 1 second per 1MB of data. This is a rough estimate, but you should find it fairly accurate if both master and slave are equivalent to 700MHz Pentium CPUs in performance and are connected through a 100Mbps network.

After the slave has been set up with a copy of the master's data, it connects to the master and waits for updates to process. If the master fails, or the slave loses connectivity with your master, the slave keeps trying to connect periodically until it is able to resume listening for updates. The retry interval is controlled by the --master-connect-retry option. The default is 60 seconds.

Each slave keeps track of where it left off. The master server has no knowledge of how many slaves there are or of which ones are up to date at any given time.

6.3. Row-Based Replication

With MySQL's classic statement-based replication, there may be issues with replicating stored routines or triggers. You can avoid these issues by using MySQL's row-based replication (RBR) instead. For a detailed list of issues, see Section 20.4, “Binary Logging of Stored Routines and Triggers”.

Row-based replication is available as of MySQL 5.1.5.

If you build MySQL from source, you must pass the --with-row-based-replication option to configure in order to enable row-based replication.

Even with row-based replication enabled, MySQL will still default to using statement-based replication. If you want to use row-based replication instead, you have to start the MySQL server with this option:

--binlog-format=row

This enables row-based replication server-wide, and automatically turns on innodb_locks_unsafe_for_binlog as it is safe in this case.

If you want to switch back to statement-based replication, restart the server either without the --binlog-format=row option (statement-based replication is the default) or with --binlog-format=statement specified explicitly.

Here are two reasons why you would want to set replication logging on a per-connection basis:

  • A thread that does a lot of small changes to the database might want to use row-based logging, while a thread that does a lot of heavy-duty searching might want to use statement-based logging.

  • Some statements require a lot of execution on the master, but create a small result set. It might therefore be beneficial to replicate them row-based.

Row-based replication causes most changes to be written to the binary log using the row-based format. Some changes, however, will still be written into the binary log as statements:

  • ANALYZE

  • REPAIR

  • OPTIMIZE

There is another option you can use with row-based replication: binlog-row-event-max-size. Rows are stored into the binlog in chunks not exceeding this value, which needs to be a multiple of 256. If you don't specify that option, the default value will be 1024.
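The constraint on binlog-row-event-max-size can be checked before a value is written to a configuration file. The following Python sketch uses a hypothetical helper name (row_event_max_size). Rejecting an invalid value with an error is an assumption made for illustration, as is requiring the value to be positive; the server itself may handle such values differently.

```python
DEFAULT_ROW_EVENT_MAX_SIZE = 1024  # default stated in the text


def row_event_max_size(value=None):
    """Return the effective chunk size, or raise if it is not a multiple of 256."""
    if value is None:
        return DEFAULT_ROW_EVENT_MAX_SIZE
    # Assumption: treat zero, negative, and non-multiple values as errors.
    if value <= 0 or value % 256 != 0:
        raise ValueError("binlog-row-event-max-size must be a positive multiple of 256")
    return value


print(row_event_max_size())      # 1024
print(row_event_max_size(4096))  # 4096
```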

6.4. Replication Implementation Details

MySQL replication capabilities are implemented using three threads (one on the master server and two on the slave). When a START SLAVE is issued, the slave creates an I/O thread, which connects to the master and asks it to send the statements recorded in its binary logs. The master creates a thread to send the binary log contents to the slave. This thread can be identified as the Binlog Dump thread in the output of SHOW PROCESSLIST on the master. The slave I/O thread reads what the master Binlog Dump thread sends and copies this data to local files, known as relay logs, in the slave's data directory. The third thread is the SQL thread, which the slave creates in order to read the relay logs and to execute the updates they contain.

In the preceding description, there are three threads per slave. A master that has multiple slaves creates one thread for each slave that is currently connected; each slave has its own I/O and SQL threads.

Reading of statements and executing them are thus separated into two independent tasks. The task of reading statements is not slowed down if statement execution is slow. For example, if the slave server has not been running for a while, its I/O thread can quickly fetch all the binary log contents from the master when the slave starts, even if the SQL thread lags far behind. If the slave stops before the SQL thread has executed all the fetched statements, the I/O thread has at least fetched everything so that a safe copy of the statements is stored locally in the slave's relay logs, ready for execution the next time that the slave starts. This allows the binary logs to be purged on the master, because it no longer needs to wait for the slave to fetch their contents.
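The separation of fetching from executing can be pictured with a small model. The following Python sketch is an illustration under simplifying assumptions: the class ToySlave and its methods are hypothetical, events are plain strings, and positions are list indexes rather than byte offsets in a binary log file. It shows that the relay log can hold every fetched event even when only a few have been executed.

```python
from collections import deque


class ToySlave:
    def __init__(self):
        self.relay_log = deque()  # events fetched but not yet executed
        self.applied = []         # events the SQL thread has executed

    def io_thread_fetch(self, master_binlog, start):
        """Copy every event from index `start` onward into the relay log."""
        self.relay_log.extend(master_binlog[start:])
        return len(master_binlog)  # next index to request from the master

    def sql_thread_apply(self, max_events):
        """Execute at most `max_events` events from the relay log."""
        for _ in range(min(max_events, len(self.relay_log))):
            self.applied.append(self.relay_log.popleft())


binlog = ["ev1", "ev2", "ev3", "ev4"]
slave = ToySlave()
next_pos = slave.io_thread_fetch(binlog, 0)  # fetching is a fast copy
slave.sql_thread_apply(1)                    # execution lags behind
print(next_pos, list(slave.relay_log), slave.applied)
# 4 ['ev2', 'ev3', 'ev4'] ['ev1']
```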

The SHOW PROCESSLIST statement provides information that tells you what is happening on the master and on the slave regarding replication.

The following example illustrates how the three threads show up in SHOW PROCESSLIST.

On the master server, the output from SHOW PROCESSLIST looks like this:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 2
   User: root
   Host: localhost:32931
     db: NULL
Command: Binlog Dump
   Time: 94
  State: Has sent all binlog to slave; waiting for binlog to
         be updated
   Info: NULL

Here, thread 2 is a replication thread for a connected slave. The information indicates that all outstanding updates have been sent to the slave and that the master is waiting for more updates to occur.

On the slave server, the output from SHOW PROCESSLIST looks like this:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 10
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
     Id: 11
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Has read all relay log; waiting for the slave I/O
         thread to update it
   Info: NULL

This information indicates that thread 10 is the I/O thread that is communicating with the master server, and thread 11 is the SQL thread that is processing the updates stored in the relay logs. At the time that the SHOW PROCESSLIST was run, both threads were idle, waiting for further updates.

Note that the value in the Time column can show how late the slave is compared to the master. See Section 6.11, “Replication FAQ”.

6.4.1. Replication Master Thread States

The following list shows the most common states you may see in the State column for the master's Binlog Dump thread. If you don't see any Binlog Dump threads on a master server, this means that replication is not running — that is, that no slaves are currently connected.

  • Sending binlog event to slave

    Binary logs consist of events, where an event is usually an update plus some other information. The thread has read an event from the binary log and is now sending it to the slave.

  • Finished reading one binlog; switching to next binlog

    The thread has finished reading a binary log file and is opening the next one to send to the slave.

  • Has sent all binlog to slave; waiting for binlog to be updated

    The thread has read all outstanding updates from the binary logs and sent them to the slave. The thread is now idle, waiting for new events to appear in the binary log resulting from new updates occurring on the master.

  • Waiting to finalize termination

    A very brief state that occurs as the thread is stopping.

6.4.2. Replication Slave I/O Thread States

The following list shows the most common states you see in the State column for a slave server I/O thread. This state also appears in the Slave_IO_State column displayed by SHOW SLAVE STATUS. This means that you can get a good view of what is happening merely by using this statement.

  • Connecting to master

    The thread is attempting to connect to the master.

  • Checking master version

    A state that occurs very briefly, immediately after the connection to the master is established.

  • Registering slave on master

    A state that occurs very briefly, immediately after the connection to the master is established.

  • Requesting binlog dump

    A state that occurs very briefly, immediately after the connection to the master is established. The thread sends to the master a request for the contents of its binary logs, starting from the requested binary log filename and position.

  • Waiting to reconnect after a failed binlog dump request

    If the binary log dump request failed (due to disconnection), the thread goes into this state while it sleeps, then tries to reconnect periodically. The interval between retries can be specified using the --master-connect-retry option.

  • Reconnecting after a failed binlog dump request

    The thread is trying to reconnect to the master.

  • Waiting for master to send event

    The thread has connected to the master and is waiting for binary log events to arrive. This can last for a long time if the master is idle. If the wait lasts for slave_read_timeout seconds, a timeout occurs. At that point, the thread considers the connection to be broken and makes an attempt to reconnect.

  • Queueing master event to the relay log

    The thread has read an event and is copying it to the relay log so that the SQL thread can process it.

  • Waiting to reconnect after a failed master event read

    An error occurred while reading (due to disconnection). The thread is sleeping for master-connect-retry seconds before attempting to reconnect.

  • Reconnecting after a failed master event read

    The thread is trying to reconnect to the master. When connection is established again, the state becomes Waiting for master to send event.

  • Waiting for the slave SQL thread to free enough relay log space

    You are using a non-zero relay_log_space_limit value, and the relay logs have grown until their combined size exceeds this value. The I/O thread is waiting until the SQL thread frees enough space by processing relay log contents so that it can delete some relay log files.

  • Waiting for slave mutex on exit

    A state that occurs briefly as the thread is stopping.

6.4.3. Replication Slave SQL Thread States

The following list shows the most common states you may see in the State column for a slave server SQL thread:

  • Reading event from the relay log

    The thread has read an event from the relay log so that the event can be processed.

  • Has read all relay log; waiting for the slave I/O thread to update it

    The thread has processed all events in the relay log files, and is now waiting for the I/O thread to write new events to the relay log.

  • Waiting for slave mutex on exit

    A very brief state that occurs as the thread is stopping.

The State column for the SQL thread may also show the text of a statement. This indicates that the thread has read an event from the relay log, extracted the statement from it, and is executing it.

6.4.4. Replication Relay and Status Files

By default, relay logs are named using filenames of the form host_name-relay-bin.nnnnnn, where host_name is the name of the slave server host and nnnnnn is a sequence number. Successive relay log files are created using successive sequence numbers, beginning with 000001. The slave tracks relay logs currently in use in an index file. The default relay log index filename is host_name-relay-bin.index. By default, these files are created in the slave's data directory. The default filenames may be overridden with the --relay-log and --relay-log-index server options. See Section 6.9, “Replication Startup Options”.

Relay logs have the same format as binary logs, and can be read using mysqlbinlog. A relay log is automatically deleted by the SQL thread as soon as it has executed all its events and no longer needs it. There is no explicit mechanism for deleting relay logs because the SQL thread takes care of doing so. However, FLUSH LOGS rotates relay logs, which influences when the SQL thread deletes them.

A new relay log is created under the following conditions:

  • A new relay log is created each time the I/O thread starts.

  • When the logs are flushed; for example, with FLUSH LOGS or mysqladmin flush-logs.

  • When the size of the current relay log file becomes too large. The meaning of “too large” is determined as follows:

    • max_relay_log_size, if max_relay_log_size > 0

    • max_binlog_size, if max_relay_log_size = 0
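The size rule in the preceding list can be written as a short function. The following Python sketch uses hypothetical helper names (relay_log_rotation_limit, needs_rotation). Treating "too large" as strictly greater than the limit is an assumption made for illustration.

```python
def relay_log_rotation_limit(max_relay_log_size, max_binlog_size):
    """Return the size limit that applies to the current relay log file."""
    if max_relay_log_size > 0:
        return max_relay_log_size
    return max_binlog_size  # used when max_relay_log_size is 0


def needs_rotation(current_size, max_relay_log_size, max_binlog_size):
    """Assumption: a file is 'too large' once it exceeds the limit."""
    return current_size > relay_log_rotation_limit(max_relay_log_size, max_binlog_size)


print(relay_log_rotation_limit(0, 1073741824))        # 1073741824
print(relay_log_rotation_limit(4194304, 1073741824))  # 4194304
```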

A slave replication server creates two additional small files in the data directory. These status files are named master.info and relay-log.info by default. They contain information like that shown in the output of the SHOW SLAVE STATUS statement (see Section 13.6.2, “SQL Statements for Controlling Slave Servers”, for a description of this statement). As disk files, they survive a slave server's shutdown. The next time the slave starts up, it reads these files to determine how far it has proceeded in reading binary logs from the master and in processing its own relay logs.

The master.info file is updated by the I/O thread. The correspondence between the lines in the file and the columns displayed by SHOW SLAVE STATUS is as follows:

Line  Description
1     Number of lines in the file
2     Master_Log_File
3     Read_Master_Log_Pos
4     Master_Host
5     Master_User
6     Password (not shown by SHOW SLAVE STATUS)
7     Master_Port
8     Connect_Retry
9     Master_SSL_Allowed
10    Master_SSL_CA_File
11    Master_SSL_CA_Path
12    Master_SSL_Cert
13    Master_SSL_Cipher
14    Master_SSL_Key
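The line-to-column correspondence can be used to read the file from a script. The following Python sketch assumes one value per line, in the order listed; the function name parse_master_info and all sample values (host, user, password, log name) are made up for illustration. It is intended for inspection only, not for editing the file.

```python
# Column names for lines 2 through 14, in file order.
MASTER_INFO_FIELDS = [
    "Master_Log_File",
    "Read_Master_Log_Pos",
    "Master_Host",
    "Master_User",
    "Password",
    "Master_Port",
    "Connect_Retry",
    "Master_SSL_Allowed",
    "Master_SSL_CA_File",
    "Master_SSL_CA_Path",
    "Master_SSL_Cert",
    "Master_SSL_Cipher",
    "Master_SSL_Key",
]


def parse_master_info(text):
    """Map the lines of a master.info file to SHOW SLAVE STATUS column names."""
    lines = text.split("\n")
    declared = int(lines[0])   # line 1: number of lines in the file
    values = lines[1:declared]  # lines 2..declared
    return dict(zip(MASTER_INFO_FIELDS, values))


# Made-up sample content; the SSL lines are left empty here.
sample = "\n".join(["14", "mysql-bin.003", "73", "master.example.com", "repl",
                    "slavepass", "3306", "60", "0", "", "", "", "", ""]) + "\n"
info = parse_master_info(sample)
print(info["Master_Log_File"], info["Read_Master_Log_Pos"], info["Master_Host"])
# mysql-bin.003 73 master.example.com
```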

The relay-log.info file is updated by the SQL thread. The correspondence between the lines in the file and the columns displayed by SHOW SLAVE STATUS is shown here:

Line  Description
1     Relay_Log_File
2     Relay_Log_Pos
3     Relay_Master_Log_File
4     Exec_Master_Log_Pos

When you back up the slave's data, you should back up these two small files as well, along with the relay log files. They are needed to resume replication after you restore the slave's data. If you lose the relay logs but still have the relay-log.info file, you can check it to determine how far the SQL thread has executed in the master binary logs. Then you can use CHANGE MASTER TO with the MASTER_LOG_FILE and MASTER_LOG_POS options to tell the slave to re-read the binary logs from that point. Of course, this requires that the binary logs still exist on the master server.
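The recovery step just described can be sketched as follows. This Python example assumes that relay-log.info holds one value per line in the order Relay_Log_File, Relay_Log_Pos, Relay_Master_Log_File, Exec_Master_Log_Pos; the function names and the sample file content are hypothetical. It only builds the statement text; it does not connect to a server.

```python
def parse_relay_log_info(text):
    """Read the four lines of a relay-log.info file into a dictionary."""
    relay_file, relay_pos, master_file, master_pos = text.split("\n")[:4]
    return {
        "Relay_Log_File": relay_file,
        "Relay_Log_Pos": int(relay_pos),
        "Relay_Master_Log_File": master_file,
        "Exec_Master_Log_Pos": int(master_pos),
    }


def resume_statement(info):
    """Build a CHANGE MASTER TO statement that restarts from the executed position.

    Note: the file name is inserted without escaping, so this sketch is only
    suitable for ordinary log names that contain no quote characters.
    """
    return "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;" % (
        info["Relay_Master_Log_File"], info["Exec_Master_Log_Pos"])


# Made-up sample content.
sample_relay_info = "./slave-relay-bin.000005\n235\nmysql-bin.003\n73\n"
print(resume_statement(parse_relay_log_info(sample_relay_info)))
# CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.003', MASTER_LOG_POS=73;
```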

If your slave is subject to replicating LOAD DATA INFILE statements, you should also back up any SQL_LOAD-* files that exist in the directory that the slave uses for this purpose. The slave needs these files to resume replication of any interrupted LOAD DATA INFILE operations. The directory location is specified using the --slave-load-tmpdir option. Its default value, if not specified, is the value of the tmpdir variable.

6.5. How to Set Up Replication

Here is a brief description of how to set up complete replication of your current MySQL server. It assumes that you want to replicate all databases on the master and have not previously configured replication. You need to shut down your master server briefly to complete the steps outlined here.

This procedure is written in terms of setting up a single slave, but you can use it to set up multiple slaves.

Although this method is the most straightforward way to set up a slave, it is not the only one. For example, if you have a snapshot of the master's data, and the master has its server ID set and binary logging enabled, you can set up a slave without shutting down the master or even blocking updates to it. For more details, please see Section 6.11, “Replication FAQ”.

If you want to administer a MySQL replication setup, we suggest that you read this entire chapter through and try all statements mentioned in Section 13.6.1, “SQL Statements for Controlling Master Servers”, and Section 13.6.2, “SQL Statements for Controlling Slave Servers”. You should also familiarize yourself with replication startup options described in Section 6.9, “Replication Startup Options”.

Note: this procedure and some of the replication SQL statements shown in later sections require the SUPER privilege.

  1. Make sure that the versions of MySQL installed on the master and slave are compatible according to the table shown in Section 6.6, “Replication Compatibility Between MySQL Versions”. Ideally, you should use the most recent version of MySQL on both master and slave.

    Please do not report bugs until you have verified that the problem is present in the latest MySQL release.

  2. Set up an account on the master server that the slave server can use to connect. This account must be given the REPLICATION SLAVE privilege. If the account is used only for replication (which is recommended), you don't need to grant any additional privileges. (For information about setting up user accounts and privileges, see Section 5.8, “MySQL User Account Management”.)

    Suppose that your domain is mydomain.com and you want to create an account with a username of repl such that slave servers can use the account to access the master server from any host in your domain using a password of slavepass. To create the account, use this GRANT statement:

    mysql> GRANT REPLICATION SLAVE ON *.*
        -> TO 'repl'@'%.mydomain.com' IDENTIFIED BY 'slavepass';
    

    If you plan to use the LOAD TABLE FROM MASTER or LOAD DATA FROM MASTER statements from the slave host, you need to grant this account additional privileges:

    • Grant the account the SUPER and RELOAD global privileges.

    • Grant the SELECT privilege for all tables that you want to load. Any master tables from which the account cannot SELECT are ignored by LOAD DATA FROM MASTER.

  3. Flush all the tables and block write statements by executing a FLUSH TABLES WITH READ LOCK statement:

    mysql> FLUSH TABLES WITH READ LOCK;
    

    For InnoDB tables, note the following: FLUSH TABLES WITH READ LOCK also blocks COMMIT operations. When you have acquired a global read lock, you can start a filesystem snapshot of your InnoDB tables. Internally (inside the InnoDB storage engine) the snapshot won't be consistent (because the InnoDB caches are not flushed), but this is not a cause for concern, because InnoDB resolves this at startup and delivers a consistent result. This means that InnoDB can perform crash recovery when started on this snapshot, without corruption. However, there is no way to stop the MySQL server while ensuring a consistent snapshot of your InnoDB tables.

    Leave the client from which you issued the FLUSH TABLES statement running so that the read lock remains in effect. (If you exit the client, the lock is released.) Then take a snapshot of the data on your master server.

    The easiest way to create a snapshot is to use an archiving program to make a binary backup of the databases in your master's data directory. For example, use tar on Unix, or PowerArchiver, WinRAR, WinZip, or any similar software on Windows. To use tar to create an archive that includes all databases, change location into the master server's data directory, then execute this command:

    shell> tar -cvf /tmp/mysql-snapshot.tar .
    

    If you want the archive to include only a database called this_db, use this command instead:

    shell> tar -cvf /tmp/mysql-snapshot.tar ./this_db
    

    Then copy the archive file to the /tmp directory on the slave server host. On that machine, change location into the slave's data directory, and unpack the archive file using this command:

    shell> tar -xvf /tmp/mysql-snapshot.tar
    

    You may not want to replicate the mysql database if the slave server has a different set of user accounts from those that exist on the master. In this case, you should exclude it from the archive. You also need not include any log files in the archive, or the master.info or relay-log.info files.

    While the read lock placed by FLUSH TABLES WITH READ LOCK is in effect, read the value of the current binary log name and offset on the master:

    mysql> SHOW MASTER STATUS;
    +---------------+----------+--------------+------------------+
    | File          | Position | Binlog_Do_DB | Binlog_Ignore_DB |
    +---------------+----------+--------------+------------------+
    | mysql-bin.003 | 73       | test         | manual,mysql     |
    +---------------+----------+--------------+------------------+
    

    The File column shows the name of the log, while Position shows the offset. In this example, the binary log value is mysql-bin.003 and the offset is 73. Record the values. You need to use them later when you are setting up the slave. They represent the replication coordinates at which the slave should begin processing new updates from the master.

    After you have taken the snapshot and recorded the log name and offset, you can re-enable write activity on the master:

    mysql> UNLOCK TABLES;
    

    If you are using InnoDB tables, ideally you should use the InnoDB Hot Backup tool, which takes a consistent snapshot without acquiring any locks on the master server, and records the log name and offset corresponding to the snapshot to be later used on the slave. Hot Backup is an additional non-free (commercial) tool that is not included in the standard MySQL distribution. See the InnoDB Hot Backup home page at http://www.innodb.com/manual.php for detailed information.

    Without the Hot Backup tool, the quickest way to take a binary snapshot of InnoDB tables is to shut down the master server and copy the InnoDB data files, log files, and table format files (.frm files). To record the current log file name and offset, you should issue the following statements before you shut down the server:

    mysql> FLUSH TABLES WITH READ LOCK;
    mysql> SHOW MASTER STATUS;
    

    Then record the log name and the offset from the output of SHOW MASTER STATUS as was shown earlier. After recording the log name and the offset, shut down the server without unlocking the tables to make sure that the server goes down with the snapshot corresponding to the current log file and offset:

    shell> mysqladmin -u root shutdown
    

    An alternative that works for both MyISAM and InnoDB tables is to take an SQL dump of the master instead of a binary copy as described in the preceding discussion. For this, you can use mysqldump --master-data on your master and later load the SQL dump file into your slave. However, this is slower than doing a binary copy.

    If the master has been previously running without --log-bin enabled, the log name and position values displayed by SHOW MASTER STATUS or mysqldump --master-data are empty. In that case, the values that you need to use later when specifying the slave's log file and position are the empty string ('') and 4.

  4. Make sure that the [mysqld] section of the my.cnf file on the master host includes a log-bin option. The section should also have a server-id=master_id option, where master_id must be a positive integer value from 1 to 2^32 – 1. For example:

    [mysqld]
    log-bin=mysql-bin
    server-id=1
    

    If those options are not present, add them and restart the server.

  5. Stop the server that is to be used as a slave server and add the following to its my.cnf file:

    [mysqld]
    server-id=slave_id
    

    The slave_id value, like the master_id value, must be a positive integer value from 1 to 2^32 – 1. In addition, it is very important that the ID of the slave be different from the ID of the master. For example:

    [mysqld]
    server-id=2
    

    If you are setting up multiple slaves, each one must have a unique server-id value that differs from that of the master and from each of the other slaves. Think of server-id values as something similar to IP addresses: These IDs uniquely identify each server instance in the community of replication partners.

    If you do not specify a server-id value, it is set to 1 if you have not defined master-host; otherwise it is set to 2. Note that in the case of server-id omission, a master refuses connections from all slaves, and a slave refuses to connect to a master. Thus, omitting server-id is good only for backup with a binary log.

  6. If you made a binary backup of the master server's data, copy it to the slave server's data directory before starting the slave. Make sure that the privileges on the files and directories are correct. The user that the MySQL server runs as must be able to read and write the files, just as on the master.

    If you made a backup using mysqldump, start the slave first (see next step).

  7. Start the slave server. If it has been replicating previously, start the slave server with the --skip-slave-start option so that it doesn't immediately try to connect to its master. You also may want to start the slave server with the --log-warnings option (enabled by default), to get more messages in the error log about problems (for example, network or connection problems). Aborted connections are not logged to the error log unless the value is greater than 1.

  8. If you made a backup of the master server's data using mysqldump, load the dump file into the slave server:

    shell> mysql -u root -p < dump_file.sql
    
  9. Execute the following statement on the slave, replacing the option values with the actual values relevant to your system:

    mysql> CHANGE MASTER TO
        ->     MASTER_HOST='master_host_name',
        ->     MASTER_USER='replication_user_name',
        ->     MASTER_PASSWORD='replication_password',
        ->     MASTER_LOG_FILE='recorded_log_file_name',
        ->     MASTER_LOG_POS=recorded_log_position;
    

    The following table shows the maximum length for the string options:

    Option             Maximum length
    MASTER_HOST        60
    MASTER_USER        16
    MASTER_PASSWORD    32
    MASTER_LOG_FILE    255

  10. Start the slave threads:

    mysql> START SLAVE;
    

After you have performed this procedure, the slave should connect to the master and catch up on any updates that have occurred since the snapshot was taken.

If you have forgotten to set the server-id value for the master, slaves are not able to connect to it.

If you have forgotten to set the server-id value for the slave, you get the following error in the slave's error log:

Warning: You should set server-id to a non-0 value if master_host is set;
we will force server id to 2, but this MySQL server will not act as a slave.

You also find error messages in the slave's error log if it is not able to replicate for any other reason.

Once a slave is replicating, you can find in its data directory one file named master.info and another named relay-log.info. The slave uses these two files to keep track of how much of the master's binary log it has processed. Do not remove or edit these files unless you know exactly what you are doing and fully understand the implications. Even in that case, it is preferred that you use the CHANGE MASTER TO statement.

Note: The content of master.info overrides some of the options specified on the command line or in my.cnf. See Section 6.9, “Replication Startup Options”, for more details.

Once you have a snapshot, you can use it to set up other slaves by following the slave portion of the procedure just described. You do not need to take another snapshot of the master; you can use the same one for each slave.

Note: For the greatest possible durability and consistency in a replication setup using InnoDB with transactions you should use innodb_flush_log_at_trx_commit=1 and sync-binlog=1 in the master my.cnf file.

6.6. Replication Compatibility Between MySQL Versions

The binary log format as implemented in MySQL 5.1 is considerably different than that used in previous versions, especially with regard to handling of character sets, LOAD DATA INFILE, and time zones.

Note: You cannot replicate from a master that uses a newer binary log format to a slave that uses an older format (for example, from MySQL 5.0 to MySQL 4.1). This has significant consequences for upgrading servers in a replication setup, as described in Section 6.7, “Upgrading a Replication Setup”.

We recommend using the most recent MySQL version available because replication capabilities are continually being improved. We also recommend using the same version for both the master and the slave. We recommend upgrading masters and slaves running alpha or beta versions to new (production) versions. In many cases, replication from a newer master to an older slave will fail. In general, slaves running MySQL 5.1.x can be used with older masters (even those running MySQL 3.23, 4.0, or 4.1), but not the reverse.

The preceding information pertains to replication compatibility at the protocol level. However, there can be other constraints, such as SQL-level compatibility issues. For example, a 5.1 master cannot replicate to a 5.0 slave if the replicated statements use SQL features available in 5.1 but not in 5.0. These and other issues are discussed in Section 6.8, “Replication Features and Known Problems”.

6.7. Upgrading a Replication Setup

When you upgrade servers that participate in a replication setup, the procedure for upgrading depends on the current server versions and the version to which you are upgrading.

6.7.1. Upgrading Replication to 5.0

This section applies to upgrading replication from MySQL 3.23, 4.0, or 4.1 to 5.1. A 4.0 server should be 4.0.3 or newer.

When you upgrade a master to 5.1 from an earlier MySQL release series, you should first ensure that all the slaves of this master are using the same 5.1.x release. If this is not the case, you should first upgrade the slaves. To upgrade each slave, shut it down, upgrade it to the appropriate 5.1.x version, restart it, and restart replication. The 5.1 slave is able to read the old relay logs written prior to the upgrade and to execute the statements they contain. Relay logs created by the slave after the upgrade are in 5.1 format.

After the slaves have been upgraded, shut down the master, upgrade it to the same 5.1.x release as the slaves, and restart it. The 5.1 master is able to read the old binary logs written prior to the upgrade and to send them to the 5.1 slaves. The slaves recognize the old format and handle it properly. Binary logs created by the master following the upgrade are in 5.1 format. These too are recognized by the 5.1 slaves.

In other words, there are no measures to take when upgrading to 5.1, except that the slaves must be 5.1 before you can upgrade the master to 5.1. Note that downgrading from 5.1 to older versions does not work so simply: you must ensure that any 5.1 binary logs or relay logs have been fully processed, so that you can remove them before proceeding with the downgrade.

Note that you cannot downgrade a replication setup to a previous version once you have switched from statement-based to row-based replication and the first row-based statement has been written to the binary log. See Section 6.3, “Row-Based Replication”.

6.8. Replication Features and Known Problems

In general, replication compatibility at the SQL level requires that any features used be supported by both the master and the slave servers. If you use a feature on a master server that is available only as of a given version of MySQL, you cannot replicate to a slave that is older than that version. Such incompatibilities are likely to occur between series, so that, for example, you cannot replicate from MySQL 5.1 to 5.0. However, these incompatibilities also can occur for within-series replication. For example, the SLEEP() function is available in MySQL 5.0.12 and up. If you use this function on the master server, you cannot replicate to a slave server that is older than MySQL 5.0.12.

If you are planning to use replication between 5.1 and a previous version of MySQL, you should consult the edition of the MySQL Reference Manual corresponding to the earlier release series for information regarding the replication characteristics of that series.

The following list provides details about what is supported and what is not. Additional InnoDB-specific information about replication is given in Section 15.2.6.5, “InnoDB and MySQL Replication”.

With MySQL's classic statement-based replication, there may be issues with replicating stored routines or triggers. You can avoid these issues by using MySQL's row-based replication (RBR) instead. For a detailed list of issues, see Section 20.4, “Binary Logging of Stored Routines and Triggers”. For a description of row-based replication, see Section 6.3, “Row-Based Replication”.

  • Replication is done correctly with AUTO_INCREMENT, LAST_INSERT_ID(), and TIMESTAMP values.

  • The USER(), UUID(), and LOAD_FILE() functions are replicated without changes and thus do not work reliably on the slave.

  • The following restriction applies to statement-based replication only, not to row-based replication. The functions that handle user-level locks (GET_LOCK(), RELEASE_LOCK(), IS_FREE_LOCK(), and IS_USED_LOCK()) are replicated without the slave knowing the concurrency context on the master. Therefore, these functions should not be used to insert into a master's table, because the content on the slave would differ. (For example, do not issue a statement such as INSERT INTO mytable VALUES(GET_LOCK(...)).)

  • The FOREIGN_KEY_CHECKS, SQL_MODE, UNIQUE_CHECKS, and SQL_AUTO_IS_NULL variables are all replicated in MySQL 5.1. The TABLE_TYPE variable, also known as STORAGE_ENGINE, is not yet replicated, which is a good thing for replication between different storage engines.

  • Replication works even if the master and slave have different global character set variables, and even if the master and slave have different global timezone variables.

  • The following applies to replication between MySQL servers using different character sets:

    1. You must always use the same global character set and collation (--default-character-set, --default-collation) on the master and the slave. Otherwise, you may get duplicate-key errors on the slave, because a key that is regarded as unique in the master's character set may not be unique in the slave's character set.

    2. If the master is older than MySQL 4.1.3, then the character set of the session should never be made different from its global value (in other words, do not use SET NAMES, SET CHARACTER SET, and so on) because this character set change is not known to the slave. If both the master and the slave are 4.1.3 or newer, the session can freely set local values for character set variables (such as NAMES, CHARACTER SET, COLLATION_CLIENT, and COLLATION_SERVER) as these settings are written to the binary log and so known to the slave. However, the session is prevented from changing the global value of these; as stated previously, the master and slave must always have identical global character set values.

    3. If on the master you have databases with different character sets from the global collation_server value, you should design your CREATE TABLE statements so that they do not implicitly rely on the databases' default character sets (Bug #2326); a good workaround is to state the character set and collation explicitly in CREATE TABLE.

  • The same system time zone should be set for both master and slave. Otherwise, some statements (for example, statements that use the NOW() or FROM_UNIXTIME() functions) are not replicated properly. You can set the time zone in which the MySQL server runs by using the --timezone=timezone_name option of the mysqld_safe script or by setting the TZ environment variable. Both master and slave should also have the same default connection time zone setting; that is, the --default-time-zone parameter should have the same value for both master and slave.

  • CONVERT_TZ(...,...,@@global.time_zone) is not properly replicated. CONVERT_TZ(...,...,@@session.time_zone) is properly replicated only if the master and slave are 5.0.4 or newer.

  • Session variables are not replicated properly when used in statements that update tables. For example, SET MAX_JOIN_SIZE=1000; INSERT INTO mytable VALUES(@@MAX_JOIN_SIZE); does not insert the same data on the master and on the slave. This does not apply to the common SET TIME_ZONE=...; INSERT INTO mytable VALUES(CONVERT_TZ(...,...,@@time_zone)).

  • It is possible to replicate transactional tables on the master using non-transactional tables on the slave. For example, you can replicate an InnoDB master table as a MyISAM slave table. However, if you do this, there are problems if the slave is stopped in the middle of a BEGIN/COMMIT block, because the slave restarts at the beginning of the BEGIN block. This issue is on our TODO and will be fixed in the near future.

  • Update statements that refer to user variables (that is, variables of the form @var_name) are replicated correctly in MySQL 5.1; however this is not true for versions prior to 4.1. Note that user variable names are case insensitive starting in MySQL 5.1; you should take this into account when setting up replication between 5.1 and older versions.

  • Slaves can connect to masters using SSL.

  • There is a global system variable slave_transaction_retries: If the replication slave SQL thread fails to execute a transaction because of an InnoDB deadlock, or because the transaction's execution time exceeded InnoDB's innodb_lock_wait_timeout or NDBCluster's TransactionDeadlockDetectionTimeout or TransactionInactiveTimeout, it automatically retries slave_transaction_retries times before stopping with an error. The default value is 10. Starting from MySQL 5.0.4, the total count of retries can be seen in the output of SHOW STATUS; see Section 5.3.4, “Server Status Variables”.

  • If a DATA DIRECTORY or INDEX DIRECTORY clause is used in a CREATE TABLE statement on the master server, the clause is also used on the slave. This can cause problems if no corresponding directory exists in the slave host filesystem or exists but is not accessible to the slave server. MySQL 5.1 supports an sql_mode option called NO_DIR_IN_CREATE. If the slave server is run with its SQL mode set to include this option, it ignores these clauses in replicating the CREATE TABLE statement. The result is that MyISAM data and index files are created in the table's database directory.

  • The following restriction applies to statement-based replication only, not to row-based replication: It is possible for the data on the master and slave to become different if a query is designed in such a way that the data modification is non-deterministic; that is, left to the will of the query optimizer. (This is in general not a good practice, even outside of replication.) For a detailed explanation of this issue, see Section A.8.1, “Open Issues in MySQL”.

  • FLUSH LOGS, FLUSH MASTER, FLUSH SLAVE, and FLUSH TABLES WITH READ LOCK are not logged, because any of these could cause problems if replicated to a slave. (For a syntax example, see Section 13.5.5.2, “FLUSH Syntax”.) FLUSH TABLES, ANALYZE TABLE, OPTIMIZE TABLE, and REPAIR TABLE statements are written to the binary log, and thus replicated to slaves, unless you specify NO_WRITE_TO_BINLOG or its alias LOCAL. This is not normally a problem because these statements do not modify table data. However, this can cause difficulties under certain circumstances. If you replicate the privilege tables in the mysql database and update those tables directly without using GRANT, you must issue a FLUSH PRIVILEGES on the slaves to put the new privileges into effect. In addition, if you use FLUSH TABLES when renaming a MyISAM table that is part of a MERGE table, you must issue FLUSH TABLES manually on the slaves.

  • MySQL only supports one master and many slaves. In the future we plan to add a voting algorithm for changing the master automatically in the event of problems with the current master. We also plan to introduce agent processes to help perform load balancing by sending SELECT queries to different slaves.

  • When a server shuts down and restarts, its MEMORY tables become empty. The master replicates this effect as follows: The first time that the master uses each MEMORY table after startup, it notifies the slaves that the table needs to be emptied by writing a DELETE FROM statement for that table to the binary log. See Section 15.4, “The MEMORY (HEAP) Storage Engine”, for more information.

  • Temporary tables are replicated except in the case where you shut down the slave server (not just the slave threads) and you have replicated temporary tables that are used in updates that have not yet been executed on the slave. If you shut down the slave server, the temporary tables needed by those updates are no longer available when the slave is restarted. To avoid this problem, do not shut down the slave while it has temporary tables open. Instead, use the following procedure:

    1. Issue a STOP SLAVE statement.

    2. Use SHOW STATUS to check the value of the Slave_open_temp_tables variable.

    3. If the value is 0, issue a mysqladmin shutdown command to shut down the slave.

    4. If the value is not 0, restart the slave threads with START SLAVE.

    5. Repeat the procedure later, until the value of Slave_open_temp_tables is 0 after STOP SLAVE.

    We plan to fix this problem in the near future.
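The shutdown procedure above lends itself to a small script. The following Python sketch is illustrative only: the three callables it accepts (run_statement, read_open_temp_tables, and shut_down) are hypothetical hooks that you would implement yourself with a client library and mysqladmin. They are not part of MySQL, and the retry count and waiting time are arbitrary assumptions.

```python
import time


def shut_down_slave_safely(run_statement, read_open_temp_tables, shut_down,
                           max_attempts=10, wait_seconds=5.0, sleep=time.sleep):
    """Stop the slave server only when it holds no replicated temporary tables.

    All three callables are hypothetical hooks supplied by the caller:
      run_statement(sql)       -- executes a statement on the slave
      read_open_temp_tables()  -- returns Slave_open_temp_tables as an int
      shut_down()              -- for example, runs "mysqladmin shutdown"
    Returns True if the server was shut down, False if every attempt failed.
    """
    for attempt in range(max_attempts):
        run_statement("STOP SLAVE")                 # step 1
        if read_open_temp_tables() == 0:            # step 2
            shut_down()                             # step 3
            return True
        run_statement("START SLAVE")                # step 4
        if attempt + 1 < max_attempts:
            sleep(wait_seconds)                     # step 5: try again later
    return False
```

If the function returns False, the slave threads are running again and the server has not been shut down, so it is safe to call the function again later.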

  • It is safe to connect servers in a circular master/slave relationship with the --log-slave-updates option specified. Note, however, that many statements do not work correctly in this kind of setup unless your client code is written to take care of the potential problems that can occur from updates that occur in different sequence on different servers.

    This means that you can create a setup such as this:

    A -> B -> C -> A
    

    Server IDs are encoded in binary log events, so server A knows when an event that it reads was originally created by itself and does not execute the event (unless server A was started with the --replicate-same-server-id option, which is meaningful only in rare cases). Thus, there are no infinite loops. This type of circular setup works only if you perform no conflicting updates between the tables. In other words, if you insert data in both A and C, you should never insert a row in A that may have a key that conflicts with a row inserted in C. You should also not update the same rows on two servers if the order in which the updates are applied is significant.
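The way server IDs prevent infinite loops in a circular setup can be modeled in a few lines. The following Python sketch is a simplified illustration, not the server's implementation: the Event class and both function names are invented for this example, and server IDs 1, 2, and 3 stand for A, B, and C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    origin_server_id: int   # ID of the server that first executed the statement
    statement: str


def should_execute(event, own_server_id, replicate_same_server_id=False):
    """Return True if a server should apply an event read from its relay log."""
    if event.origin_server_id == own_server_id:
        # The event has travelled around the ring; applying it again would loop.
        return replicate_same_server_id
    return True


def propagate(event, ring):
    """Pass an event around a ring of server IDs, starting after its origin.

    Returns the IDs of the servers that execute the event. The origin is last
    in the walk and skips the event, so propagation stops there.
    """
    start = ring.index(event.origin_server_id)
    executed = []
    for step in range(1, len(ring) + 1):
        server_id = ring[(start + step) % len(ring)]
        if not should_execute(event, server_id):
            break
        executed.append(server_id)
    return executed
```

For example, propagate(Event(1, "INSERT ..."), [1, 2, 3]) visits servers 2 and 3 and then stops at server 1, which recognizes its own ID in the event.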

  • If a statement on the slave produces an error, the slave SQL thread terminates, and the slave writes a message to its error log. You should then connect to the slave manually, fix the problem (for example, a non-existent table), and then run START SLAVE.

  • It is safe to shut down a master server and restart it later. If a slave loses its connection to the master, the slave tries to reconnect immediately. If that fails, the slave retries periodically. (The default is to retry every 60 seconds. This may be changed with the --master-connect-retry option.) The slave also is able to deal with network connectivity outages. However, the slave notices a network outage only after receiving no data from the master for slave_net_timeout seconds. If your outages are short, you may want to decrease slave_net_timeout. See Section 5.3.3, “Server System Variables”.

  • Shutting down the slave (cleanly) is also safe, as it keeps track of where it left off. Unclean shutdowns might produce problems, especially if the disk cache was not flushed to disk before the system went down. Your system fault tolerance is greatly increased if you have a good uninterruptible power supply. Unclean shutdowns of the master may cause inconsistencies between the content of tables and the binary log on the master; this can be avoided by using InnoDB tables and the --innodb-safe-binlog option on the master. See Section 5.11.3, “The Binary Log”. (Note: --innodb-safe-binlog is not needed in MySQL 5.1, having been made obsolete by the introduction of XA transaction support.)

  • Due to the non-transactional nature of MyISAM tables, it is possible to have a statement that only partially updates a table and returns an error code. This can happen, for example, on a multiple-row insert that has one row violating a key constraint, or if a long update statement is killed after updating some of the rows. If that happens on the master, the slave thread exits and waits for the database administrator to decide what to do about it unless the error code is legitimate and the statement execution results in the same error code. If this error code validation behavior is not desirable, some or all errors can be masked out (ignored) with the --slave-skip-errors option.

  • If you update transactional tables from non-transactional tables inside a BEGIN/COMMIT sequence, updates to the binary log may be out of sync if the non-transactional table is updated before the transaction commits. This is because the transaction is written to the binary log only when it is committed.

  • In situations where transactions mix updates to transactional and non-transactional tables, the order of statements in the binary log is correct, and all needed statements are written to the binary log even in case of a ROLLBACK. However, when a second connection updates the non-transactional table before the first connection's transaction is complete, statements can be logged out of order, because the second connection's update is written immediately after it is performed, regardless of the state of the transaction being performed by the first connection.

6.9. Replication Startup Options

On both the master and the slave, you must use the server-id option to establish a unique replication ID for each server. You should pick a unique positive integer in the range from 1 to 2^32 – 1 for each master and slave. Example: server-id=3
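When you manage several servers, it can help to check a planned set of server-id values before you edit any option file. The following Python sketch shows one way to do this. The function name and the host names used with it are hypothetical; only the range and the uniqueness requirement come from the text above.

```python
MAX_SERVER_ID = 2**32 - 1  # 4294967295


def check_server_ids(server_ids):
    """Return a list of problems found in a mapping of host name -> server-id."""
    problems = []
    seen = {}
    for host, server_id in sorted(server_ids.items()):
        if not 1 <= server_id <= MAX_SERVER_ID:
            problems.append(
                f"{host}: server-id {server_id} is outside 1..{MAX_SERVER_ID}")
        elif server_id in seen:
            problems.append(
                f"{host}: server-id {server_id} is already used by {seen[server_id]}")
        else:
            seen[server_id] = host
    return problems
```

An empty list means that every value is in range and no two servers share an ID.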

The options that you can use on the master server for controlling binary logging are described in Section 5.11.3, “The Binary Log”.

The following table describes the options you can use on MySQL 5.1 slave replication servers. You can specify these options either on the command line or in an option file.

Some slave server replication options are handled in a special way, in the sense that they are ignored if a master.info file exists when the slave starts and contains values for the options. The following options are handled this way:

  • --master-host

  • --master-user

  • --master-password

  • --master-port

  • --master-connect-retry

  • --master-ssl

  • --master-ssl-ca

  • --master-ssl-capath

  • --master-ssl-cert

  • --master-ssl-cipher

  • --master-ssl-key

The master.info file format in 5.1 includes values corresponding to the SSL options. In addition, the file format includes as its first line the number of lines in the file. If you upgrade an older server to a newer version, the new server upgrades the master.info file to the new format automatically when it starts. However, if you downgrade a newer server to an older version, you should remove the first line manually before starting the older server for the first time.

If no master.info file exists when the slave server starts, it uses the values for those options that are specified in option files or on the command line. This occurs when you start the server as a replication slave for the very first time, or when you have run RESET SLAVE and then have shut down and restarted the slave.

If the master.info file exists when the slave server starts, the server ignores those options. Instead, it uses the values found in the master.info file.

If you restart the slave server with different values of the startup options that correspond to values in the master.info file, the different values have no effect, because the server continues to use the master.info file. To use different values, you must either restart after removing the master.info file or (preferably) use the CHANGE MASTER TO statement to reset the values while the slave is running.

Suppose that you specify this option in your my.cnf file:

[mysqld]
master-host=some_host

The first time you start the server as a replication slave, it reads and uses that option from the my.cnf file. The server then records the value in the master.info file. The next time you start the server, it reads the master host value from the master.info file only and ignores the value in the option file. If you modify the my.cnf file to specify a different master host of some_other_host, the change still has no effect. You should use CHANGE MASTER TO instead.

Because the server gives an existing master.info file precedence over the startup options just described, you might prefer not to use startup options for these values at all, and instead specify them by using the CHANGE MASTER TO statement. See Section 13.6.2.1, “CHANGE MASTER TO Syntax”.
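The precedence rule for these options can be summarized as a small model. The following Python sketch is a simplification written for this discussion, not the server's code: the function name is invented, and the master.info file is represented as a plain dictionary (or None when the file does not exist).

```python
MASTER_INFO_OPTIONS = (
    "master-host", "master-user", "master-password", "master-port",
    "master-connect-retry", "master-ssl", "master-ssl-ca", "master-ssl-capath",
    "master-ssl-cert", "master-ssl-cipher", "master-ssl-key",
)


def effective_master_settings(startup_options, master_info=None):
    """Model which values a slave uses for the options listed above.

    startup_options -- dict of values from option files or the command line
    master_info     -- dict of values read from master.info, or None if the
                       file does not exist
    """
    if master_info is not None:
        # An existing master.info file wins; startup options are ignored.
        return {name: master_info[name]
                for name in MASTER_INFO_OPTIONS if name in master_info}
    return {name: startup_options[name]
            for name in MASTER_INFO_OPTIONS if name in startup_options}
```

With no master.info file, {"master-host": "some_host"} is used as given. Once the file records some_host, changing the option file to some_other_host has no effect in this model, which mirrors the behavior described above.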

This example shows a more extensive use of startup options to configure a slave server:

[mysqld]
server-id=2
master-host=db-master.mycompany.com
master-port=3306
master-user=pertinax
master-password=freitag
master-connect-retry=60
report-host=db-slave.mycompany.com

The following list describes startup options for controlling replication. Many of these options can be reset while the server is running by using the CHANGE MASTER TO statement. Others, such as the --replicate-* options, can be set only when the slave server starts. We plan to fix this.

  • --log-slave-updates

    Normally, updates received from a master server by a slave are not logged to its binary log. This option tells the slave to log the updates performed by its SQL thread to the slave's own binary log. For this option to have any effect, the slave must also be started with the --log-bin option to enable binary logging. --log-slave-updates is used when you want to chain replication servers. For example, you might want a setup like this:

    A -> B -> C
    

    That is, A serves as the master for the slave B, and B serves as the master for the slave C. For this to work, B must be both a master and a slave. You must start both A and B with --log-bin to enable binary logging, and B with the --log-slave-updates option.

  • --log-warnings

    Makes the slave print more messages to the error log about what it is doing. For example, it warns you that it succeeded in reconnecting after a network/connection failure, and informs you as to how each slave thread started. This option is enabled by default; to disable it, use --skip-log-warnings. Aborted connections are not logged to the error log unless the value of --log-warnings is greater than 1.

    Note that the effects of this option are not limited to replication. It produces warnings across a spectrum of server activities.

  • --master-connect-retry=seconds

    The number of seconds the slave thread sleeps before retrying to connect to the master in case the master goes down or the connection is lost. The value in the master.info file takes precedence if it can be read. If not set, the default is 60.

  • --master-host=host

    The hostname or IP address of the master replication server. If this option is not given, the slave thread does not start. The value in master.info takes precedence if it can be read.

  • --master-info-file=file_name

    The name to use for the file in which the slave records information about the master. The default name is master.info in the data directory.

  • --master-password=password

    The password of the account that the slave thread uses for authentication when connecting to the master. The value in the master.info file takes precedence if it can be read. If not set, an empty password is assumed.

  • --master-port=port_number

    The TCP/IP port the master is listening on. The value in the master.info file takes precedence if it can be read. If not set, the compiled-in setting is assumed. If you have not tinkered with configure options, this should be 3306.

  • --master-retry-count=count

    The number of times the slave tries to connect to the master before giving up.

  • --master-ssl, --master-ssl-ca=file_name, --master-ssl-capath=directory_name, --master-ssl-cert=file_name, --master-ssl-cipher=cipher_list, --master-ssl-key=file_name

    These options are used for setting up a secure replication connection to the master server using SSL. Their meanings are the same as the corresponding --ssl, --ssl-ca, --ssl-capath, --ssl-cert, --ssl-cipher, --ssl-key options described in Section 5.8.7.6, “SSL Command-Line Options”. The values in the master.info file take precedence if they can be read.

  • --master-user=username

    The username of the account that the slave thread uses for authentication when connecting to the master. This account must have the REPLICATION SLAVE privilege. The value in the master.info file, if it can be read, takes precedence. If the master user is not set, user test is assumed.

  • --max-relay-log-size=size

    The size at which the server rotates relay log files automatically. See Section 5.3.3, “Server System Variables”.

  • --read-only

    This option causes the slave not to allow any updates except from slave threads or from users having the SUPER privilege. This can be useful to ensure that a slave server accepts no updates from clients. This option does not apply to TEMPORARY tables.

  • --relay-log=file_name

    The name for the relay log. The default name is host_name-relay-bin.nnnnnn, where host_name is the name of the slave server host and nnnnnn indicates that relay logs are created in numbered sequence. You can specify the option to create hostname-independent relay log names, or if your relay logs tend to be big (and you don't want to decrease max_relay_log_size) and you need to put them in some area different from the data directory, or if you want to increase speed by balancing load between disks.

  • --relay-log-index=file_name

    The location and name that should be used for the relay log index file. The default name is host_name-relay-bin.index, where host_name is the name of the slave server.

  • --relay-log-info-file=file_name

    The name to use for the file in which the slave records information about the relay logs. The default name is relay-log.info in the data directory.

  • --relay-log-purge={0|1}

    Disables or enables automatic purging of relay logs as soon as they are not needed any more. The default value is 1 (enabled). This is a global variable that can be changed dynamically with SET GLOBAL relay_log_purge.

  • --relay-log-space-limit=size

    Places an upper limit on the total size of all relay logs on the slave (a value of 0 means “unlimited”). This is useful for a slave server host that has limited disk space. When the limit is reached, the I/O thread stops reading binary log events from the master server until the SQL thread has caught up and deleted some unused relay logs. Note that this limit is not absolute: There are cases where the SQL thread needs more events before it can delete relay logs. In that case, the I/O thread exceeds the limit until it becomes possible for the SQL thread to delete some relay logs. (Not doing so would cause a deadlock.) You should not set --relay-log-space-limit to less than twice the value of --max-relay-log-size (or --max-binlog-size if --max-relay-log-size is 0). In that case, there is a chance that the I/O thread waits for free space because --relay-log-space-limit is exceeded, but the SQL thread has no relay log to purge and is unable to satisfy the I/O thread. This forces the I/O thread to temporarily ignore --relay-log-space-limit.
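The sizing recommendation for --relay-log-space-limit can be checked before the slave is started. The following Python sketch encodes only the rule stated above (at least twice --max-relay-log-size, or twice --max-binlog-size when --max-relay-log-size is 0). The function name is hypothetical and all sizes are assumed to be in bytes.

```python
def relay_log_space_limit_warning(relay_log_space_limit, max_relay_log_size,
                                  max_binlog_size):
    """Return a warning string if the limit is set too low, else None.

    All sizes are in bytes. A relay_log_space_limit of 0 means "unlimited".
    When max_relay_log_size is 0, max_binlog_size applies to relay logs too.
    """
    if relay_log_space_limit == 0:
        return None
    rotation_size = max_relay_log_size or max_binlog_size
    minimum = 2 * rotation_size
    if relay_log_space_limit < minimum:
        return (f"relay-log-space-limit={relay_log_space_limit} is below "
                f"the recommended minimum of {minimum}")
    return None
```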

  • --replicate-do-db=db_name

    Tells the slave to restrict replication to statements where the default database (that is, the one selected by USE) is db_name. To specify more than one database, use this option multiple times, once for each database. Note that this does not replicate cross-database statements such as UPDATE some_db.some_table SET foo='bar' while having selected a different database or no database. If you need cross-database updates to work, use --replicate-wild-do-table=db_name.%. See Section 6.10, “How Servers Evaluate Replication Rules”.

    An example of what does not work as you might expect: If the slave is started with --replicate-do-db=sales and you issue the following statements on the master, the UPDATE statement is not replicated:

    USE prices;
    UPDATE sales.january SET amount=amount+1000;
    

    If you need cross-database updates to work, use --replicate-wild-do-table=db_name.% instead.

    The main reason for this “just check the default database” behavior is that it is difficult from the statement alone to know whether or not it should be replicated (for example, if you are using multiple-table DELETE statements or multiple-table UPDATE statements that act across multiple databases). It is also faster to check only the default database rather than all databases if there is no need.
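The "just check the default database" behavior can be expressed as a one-line test. The following Python sketch models only the --replicate-do-db check in isolation and ignores every other rule described in Section 6.10. The function name is invented for this example.

```python
def replicate_do_db_allows(default_db, do_dbs):
    """Simplified model of the --replicate-do-db check for one statement.

    default_db -- database selected by USE when the statement ran, or None
    do_dbs     -- set of names given with --replicate-do-db
    Only the default database is examined; table qualifiers in the statement
    text are not.
    """
    if not do_dbs:
        return True          # no --replicate-do-db rules: nothing is filtered
    return default_db in do_dbs
```

For the example above, replicate_do_db_allows("prices", {"sales"}) is False, so the UPDATE of sales.january is skipped even though it modifies a table in the sales database.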

  • --replicate-do-table=db_name.tbl_name

    Tells the slave thread to restrict replication to the specified table. To specify more than one table, use this option multiple times, once for each table. This works for cross-database updates, in contrast to --replicate-do-db. See Section 6.10, “How Servers Evaluate Replication Rules”.

  • --replicate-ignore-db=db_name

    Tells the slave to not replicate any statement where the default database (that is, the one selected by USE) is db_name. To specify more than one database to ignore, use this option multiple times, once for each database. You should not use this option if you are using cross-database updates and you do not want these updates to be replicated. See Section 6.10, “How Servers Evaluate Replication Rules”.

    An example of what does not work as you might expect: If the slave is started with --replicate-ignore-db=sales and you issue the following statements on the master, the UPDATE statement is replicated, because the default database is prices rather than sales:

    USE prices;
    UPDATE sales.january SET amount=amount+1000;
    

    If you need cross-database updates to work, use --replicate-wild-ignore-table=db_name.% instead.

  • --replicate-ignore-table=db_name.tbl_name

    Tells the slave thread to not replicate any statement that updates the specified table (even if any other tables might be updated by the same statement). To specify more than one table to ignore, use this option multiple times, once for each table. This works for cross-database updates, in contrast to --replicate-ignore-db. See Section 6.10, “How Servers Evaluate Replication Rules”.

  • --replicate-wild-do-table=db_name.tbl_name

    Tells the slave thread to restrict replication to statements where any of the updated tables match the specified database and table name patterns. Patterns can contain the ‘%’ and ‘_’ wildcard characters, which have the same meaning as for the LIKE pattern-matching operator. To specify more than one table, use this option multiple times, once for each table. This works for cross-database updates. See Section 6.10, “How Servers Evaluate Replication Rules”.

    Example: --replicate-wild-do-table=foo%.bar% replicates only updates that use a table where the database name starts with foo and the table name starts with bar.

    If the table name pattern is %, it matches any table name and the option also applies to database-level statements (CREATE DATABASE, DROP DATABASE, and ALTER DATABASE). For example, if you use --replicate-wild-do-table=foo%.%, database-level statements are replicated if the database name matches the pattern foo%.

    To include literal wildcard characters in the database or table name patterns, escape them with a backslash. For example, to replicate all tables of a database that is named my_own%db, but not replicate tables from the my1ownAABCdb database, you should escape the ‘_’ and ‘%’ characters like this: --replicate-wild-do-table=my\_own\%db. If you're using the option on the command line, you might need to double the backslashes or quote the option value, depending on your command interpreter. For example, with the bash shell, you would need to type --replicate-wild-do-table=my\\_own\\%db.

  • --replicate-wild-ignore-table=db_name.tbl_name

    Tells the slave thread to not replicate a statement where any table matches the given wildcard pattern. To specify more than one table to ignore, use this option multiple times, once for each table. This works for cross-database updates. See Section 6.10, “How Servers Evaluate Replication Rules”.

    Example: --replicate-wild-ignore-table=foo%.bar% does not replicate updates that use a table where the database name starts with foo and the table name starts with bar.

    For information about how matching works, see the description of the --replicate-wild-do-table option. The rules for including literal wildcard characters in the option value are the same as for --replicate-wild-do-table as well.

  • --replicate-rewrite-db=from_name->to_name

    Tells the slave to translate the default database (that is, the one selected by USE) to to_name if it was from_name on the master. Only statements involving tables are affected (not statements such as CREATE DATABASE, DROP DATABASE, and ALTER DATABASE), and only if from_name was the default database on the master. This does not work for cross-database updates. Note that the database name translation is done before --replicate-* rules are tested.

    If you use this option on the command line and the ‘>’ character is special to your command interpreter, quote the option value. For example:

    shell> mysqld --replicate-rewrite-db="olddb->newdb"
    
  • --replicate-same-server-id

    To be used on slave servers. Usually you should use the default setting of 0, to prevent infinite loops in circular replication. If set to 1, this slave does not skip events having its own server ID; normally this is useful only in rare configurations. Cannot be set to 1 if --log-slave-updates is used. Note that by default the slave I/O thread does not even write binary log events to the relay log if they have the slave's server ID (this optimization helps save disk usage). So if you want to use --replicate-same-server-id, be sure to start the slave with this option before you make the slave read its own events which you want the slave SQL thread to execute.

  • --report-host=slave_name

    The hostname or IP number of the slave to be reported to the master during slave registration. This value appears in the output of SHOW SLAVE HOSTS on the master server. Leave the value unset if you do not want the slave to register itself with the master. Note that it is not sufficient for the master to simply read the IP number of the slave from the TCP/IP socket after the slave connects. Due to NAT and other routing issues, that IP may not be valid for connecting to the slave from the master or other hosts.

  • --report-port=slave_port

    The TCP/IP port number for connecting to the slave, to be reported to the master during slave registration. Set it only if the slave is listening on a non-default port or if you have a special tunnel from the master or other clients to the slave. If you are not sure, leave this option unset.

  • --skip-slave-start

    Tells the slave server not to start the slave threads when the server starts. To start the threads later, use a START SLAVE statement.

  • --slave_compressed_protocol={0|1}

    If this option is set to 1, use compression for the slave/master protocol if both the slave and the master support it.

  • --slave-load-tmpdir=file_name

    The name of the directory where the slave creates temporary files. This option is by default equal to the value of the tmpdir system variable. When the slave SQL thread replicates a LOAD DATA INFILE statement, it extracts the to-be-loaded file from the relay log into temporary files, then loads these into the table. If the file loaded on the master was huge, the temporary files on the slave are huge, too. Therefore, it might be advisable to use this option to tell the slave to put temporary files in a directory located in some filesystem that has a lot of available space. In that case, you may also use the --relay-log option to place the relay logs in that filesystem, because the relay logs are huge as well. --slave-load-tmpdir should point to a disk-based filesystem, not a memory-based one: The slave needs the temporary files used to replicate LOAD DATA INFILE to survive a machine's restart. The directory also should not be one that is cleared by the operating system during the system startup process.

  • --slave-net-timeout=seconds

    The number of seconds to wait for more data from the master before aborting the read, considering the connection broken, and trying to reconnect. The first retry occurs immediately after the timeout. The interval between retries is controlled by the --master-connect-retry option.

  • --slave-skip-errors=[err_code1,err_code2,... | all]

    Normally, replication stops when an error occurs, which gives you the opportunity to resolve the inconsistency in the data manually. This option tells the slave SQL thread to continue replication when a statement returns any of the errors listed in the option value.

    Do not use this option unless you fully understand why you are getting errors. If there are no bugs in your replication setup and client programs, and no bugs in MySQL itself, an error that stops replication should never occur. Indiscriminate use of this option results in slaves becoming hopelessly out of sync with the master, with you having no idea why this has occurred.

    For error codes, you should use the numbers provided by the error message in your slave error log and in the output of SHOW SLAVE STATUS. The server error codes are listed in Appendix B, Error Codes and Messages.

    You can also (but should not) use the very non-recommended value of all, which ignores all error messages and keeps going regardless of what happens. Needless to say, if you use it, we make no guarantees regarding the integrity of your data. Please do not complain (or file bug reports) in this case if the slave's data is not anywhere close to what it is on the master. You have been warned.

    Examples:

    --slave-skip-errors=1062,1053
    --slave-skip-errors=all
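
The effect of a --slave-skip-errors value can be modeled with a short Python sketch. It only mirrors the behavior described above (a comma-separated list of error codes, or the word all); the function names parse_skip_errors and continues_after are hypothetical and are not part of MySQL.

```python
def parse_skip_errors(option_value):
    """Parse a --slave-skip-errors value into 'all' or a set of int codes."""
    value = option_value.strip()
    if value == "all":
        return "all"
    # Comma-separated numeric error codes, for example "1062,1053".
    return {int(code) for code in value.split(",") if code.strip()}


def continues_after(option_value, error_code):
    """Return True if replication would continue after error_code."""
    skip = parse_skip_errors(option_value)
    return skip == "all" or error_code in skip


print(continues_after("1062,1053", 1062))  # listed code: replication continues
print(continues_after("1062,1053", 1146))  # unlisted code: replication stops
```

The sketch makes the risk of all visible: with that value, continues_after() returns True for every error code.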
    

6.10. How Servers Evaluate Replication Rules

The slave server evaluates the --replicate-* rules as follows to determine whether to execute or ignore a statement:

  1. Are there any --replicate-do-db or --replicate-ignore-db rules?

    • Yes: Test them as for --binlog-do-db and --binlog-ignore-db (see Section 5.11.3, “The Binary Log”). What is the result of the test?

      • Ignore the statement: Ignore it and exit.

      • Permit the statement: Do not execute the statement immediately. Defer the decision; proceed to the next step.

    • No: Proceed to the next step.

  2. Are we currently executing a stored function?

    • Yes: Execute the query and exit.

    • No: Proceed to the next step.

  3. Are there any --replicate-*-table rules?

    • No: Execute the query and exit.

    • Yes: Proceed to the next step and begin evaluating the table rules in the order shown (first the non-wild rules, and then the wild rules). Only tables that are to be updated are compared to the rules (INSERT INTO sales SELECT * FROM prices: only sales is compared to the rules). If several tables are to be updated (multiple-table statement), the first matching table (matching “do” or “ignore”) wins. That is, the first table is compared to the rules. Then, if no decision could be made, the second table is compared to the rules, and so on.

  4. Are there any --replicate-do-table rules?

    • Yes: Does the table match any of them?

      • Yes: Execute the query and exit.

      • No: Proceed to the next step.

    • No: Proceed to the next step.

  5. Are there any --replicate-ignore-table rules?

    • Yes: Does the table match any of them?

      • Yes: Ignore the query and exit.

      • No: Proceed to the next step.

    • No: Proceed to the next step.

  6. Are there any --replicate-wild-do-table rules?

    • Yes: Does the table match any of them?

      • Yes: Execute the query and exit.

      • No: Proceed to the next step.

    • No: Proceed to the next step.

  7. Are there any --replicate-wild-ignore-table rules?

    • Yes: Does the table match any of them?

      • Yes: Ignore the query and exit.

      • No: Proceed to the next step.

    • No: Proceed to the next step.

  8. No --replicate-*-table rule was matched. Is there another table to test against these rules?

    • Yes: Loop.

    • No: We have now tested all tables to be updated and could not match any rule. Are there --replicate-do-table or --replicate-wild-do-table rules?

      • Yes: There were “do” rules but no match. Ignore the query and exit.

      • No: Execute the query and exit.
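
The table-rule part of this procedure (steps 3 through 8) can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the server's implementation: the names like_to_regex, wild_match, and should_execute are hypothetical, matching is assumed to be case-sensitive, and steps 1 and 2 (the database rules and the stored-function check) are assumed to have been passed already.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE-style pattern ('%', '_', backslash escapes) to a regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped literal character
            i += 2
            continue
        if ch == "%":
            out.append(".*")   # any sequence of characters
        elif ch == "_":
            out.append(".")    # exactly one character
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out), re.DOTALL)


def wild_match(rule, db, table):
    """Match a 'db_pattern.tbl_pattern' rule against a database and table name."""
    db_pat, _, tbl_pat = rule.partition(".")
    return (like_to_regex(db_pat).fullmatch(db) is not None
            and like_to_regex(tbl_pat).fullmatch(table) is not None)


def should_execute(updated_tables, do_table=(), ignore_table=(),
                   wild_do_table=(), wild_ignore_table=()):
    """Return True to execute the statement, False to ignore it.

    updated_tables is a list of (db, table) pairs, in statement order,
    containing only the tables that the statement updates.
    """
    # Step 3: with no table rules at all, the statement is executed.
    if not (do_table or ignore_table or wild_do_table or wild_ignore_table):
        return True
    for db, table in updated_tables:
        name = f"{db}.{table}"
        if name in do_table:                                          # step 4
            return True
        if name in ignore_table:                                      # step 5
            return False
        if any(wild_match(r, db, table) for r in wild_do_table):      # step 6
            return True
        if any(wild_match(r, db, table) for r in wild_ignore_table):  # step 7
            return False
    # Step 8: no table matched. "Do" rules without a match mean ignore.
    return not (do_table or wild_do_table)
```

For example, should_execute([("foo1", "bar2")], wild_ignore_table=["foo%.bar%"]) returns False, whereas the same call with the table ("foo1", "baz") returns True because no rule matches and there are no “do” rules.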

6.11. Replication FAQ

Q: How do I configure a slave if the master is running and I do not want to stop it?

A: There are several options. If you have taken a backup of the master at some point and recorded the binary log name and offset (from the output of SHOW MASTER STATUS) corresponding to the snapshot, use the following procedure:

  1. Make sure that the slave is assigned a unique server ID.

  2. Execute the following statement on the slave, filling in appropriate values for each option:

    mysql> CHANGE MASTER TO
        ->     MASTER_HOST='master_host_name',
        ->     MASTER_USER='master_user_name',
        ->     MASTER_PASSWORD='master_pass',
        ->     MASTER_LOG_FILE='recorded_log_file_name',
        ->     MASTER_LOG_POS=recorded_log_position;
    
  3. Execute START SLAVE on the slave.

If you do not have a backup of the master server, here is a quick procedure for creating one. All steps should be performed on the master host.

  1. Issue this statement:

    mysql> FLUSH TABLES WITH READ LOCK;
    
  2. With the lock still in place, execute this command (or a variation of it):

    shell> tar zcf /tmp/backup.tar.gz /var/lib/mysql
    
  3. Issue this statement and make sure to record the output, which you need later:

    mysql> SHOW MASTER STATUS;
    
  4. Release the lock:

    mysql> UNLOCK TABLES;
    

An alternative is to make an SQL dump of the master instead of a binary copy as in the preceding procedure. To do this, you can use mysqldump --master-data on your master and later load the SQL dump into your slave. However, this is slower than making a binary copy.

No matter which of the two methods you use, afterward follow the instructions for the case when you have a snapshot and have recorded the log name and offset. You can use the same snapshot to set up several slaves. Once you have the snapshot of the master, you can wait to set up a slave as long as the binary logs of the master are left intact. The two practical limitations on the length of time you can wait are the amount of disk space available to retain binary logs on the master and the length of time it takes the slave to catch up.

You can also use LOAD DATA FROM MASTER. This is a convenient statement that transfers a snapshot to the slave and adjusts the log name and offset all at once. In the future, LOAD DATA FROM MASTER will be the recommended way to set up a slave. Be warned, however, that it works only for MyISAM tables and it may hold a read lock for a long time. It is not yet implemented as efficiently as we would like. If you have large tables, the preferred method at this time is still to make a binary snapshot on the master server after executing FLUSH TABLES WITH READ LOCK.

Q: Does the slave need to be connected to the master all the time?

A: No, it does not. The slave can go down or stay disconnected for hours or even days, then reconnect and catch up on updates. For example, you can set up a master/slave relationship over a dial-up link where the link is up only sporadically and for short periods of time. The implication of this is that, at any given time, the slave is not guaranteed to be in sync with the master unless you take some special measures. In the future, we will have the option to block the master until at least one slave is in sync.

Q: How do I know how late a slave is compared to the master? In other words, how do I know the date of the last query replicated by the slave?

A: You can read the Seconds_Behind_Master column in SHOW SLAVE STATUS. See Section 6.4, “Replication Implementation Details”.

When the slave SQL thread executes an event read from the master, it modifies its own time to the event timestamp (this is why TIMESTAMP is well replicated). In the Time column in the output of SHOW PROCESSLIST, the number of seconds displayed for the slave SQL thread is the number of seconds between the timestamp of the last replicated event and the real time of the slave machine. You can use this to determine the date of the last replicated event. Note that if your slave has been disconnected from the master for one hour, and then reconnects, you may immediately see Time values like 3600 for the slave SQL thread in SHOW PROCESSLIST. This would be because the slave is executing statements that are one hour old.

Q: How do I force the master to block updates until the slave catches up?

A: Use the following procedure:

  1. On the master, execute these statements:

    mysql> FLUSH TABLES WITH READ LOCK;
    mysql> SHOW MASTER STATUS;
    

    Record the log name and the offset from the output of the SHOW statement. These are the replication coordinates.

  2. On the slave, issue the following statement, where the arguments to the MASTER_POS_WAIT() function are the replication coordinate values obtained in the previous step:

    mysql> SELECT MASTER_POS_WAIT('log_name', log_offset);
    

    The SELECT statement blocks until the slave reaches the specified log file and offset. At that point, the slave is in sync with the master and the statement returns.

  3. On the master, issue the following statement to allow the master to begin processing updates again:

    mysql> UNLOCK TABLES;
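
The three steps above can be scripted. The following Python sketch assumes two already-open DB-API style cursors, one connected to the master and one to the slave, and a driver that uses %s placeholders; the function name sync_slave_with_master is hypothetical. It relies on File and Position being the first two columns of the SHOW MASTER STATUS output.

```python
def sync_slave_with_master(master_cur, slave_cur):
    """Block updates on the master until the slave has caught up.

    Both arguments are DB-API style cursors (an assumption of this sketch).
    The read lock is held on the master cursor's connection throughout.
    """
    master_cur.execute("FLUSH TABLES WITH READ LOCK")
    try:
        master_cur.execute("SHOW MASTER STATUS")
        row = master_cur.fetchone()
        if row is None:
            raise RuntimeError("master returned no binary log coordinates")
        log_name, log_offset = row[0], row[1]  # File and Position columns
        # Blocks on the slave until it reaches the recorded coordinates.
        slave_cur.execute("SELECT MASTER_POS_WAIT(%s, %s)",
                          (log_name, log_offset))
        result = slave_cur.fetchone()[0]
    finally:
        # Always release the lock, even if waiting on the slave failed.
        master_cur.execute("UNLOCK TABLES")
    return log_name, log_offset, result
```

Releasing the lock in a finally clause keeps the master from staying read-locked if the slave connection fails while waiting.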
    

Q: What issues should I be aware of when setting up two-way replication?

A: MySQL replication currently does not support any locking protocol between master and slave to guarantee the atomicity of a distributed (cross-server) update. In other words, it is possible for client A to make an update to co-master 1, and in the meantime, before it propagates to co-master 2, client B could make an update to co-master 2 that makes the update of client A work differently than it did on co-master 1. Thus, when the update of client A makes it to co-master 2, it produces tables that are different than what you have on co-master 1, even after all the updates from co-master 2 have also propagated. This means that you should not chain two servers together in a two-way replication relationship unless you are sure that your updates can safely happen in any order, or unless you take care of mis-ordered updates somehow in the client code.

You must also realize that two-way replication actually does not improve performance very much (if at all), as far as updates are concerned. Both servers need to do the same number of updates each, as you would have one server do. The only difference is that there is a little less lock contention, because the updates originating on another server are serialized in one slave thread. Even this benefit might be offset by network delays.

Q: How can I use replication to improve performance of my system?

A: You should set up one server as the master and direct all writes to it. Then configure as many slaves as you have the budget and rackspace for, and distribute the reads among the master and the slaves. You can also start the slaves with the --skip-innodb, --skip-bdb, --low-priority-updates, and --delay-key-write=ALL options to get speed improvements on the slave end. In this case, the slave uses non-transactional MyISAM tables instead of InnoDB and BDB tables to get more speed.

Q: What should I do to prepare client code in my own applications to use performance-enhancing replication?

A: If the part of your code that is responsible for database access has been properly abstracted/modularized, converting it to run with a replicated setup should be very smooth and easy. Just change the implementation of your database access to send all writes to the master, and to send reads to either the master or a slave. If your code does not have this level of abstraction, setting up a replicated system gives you the opportunity and motivation to clean it up. You should start by creating a wrapper library or module with the following functions:

  • safe_writer_connect()

  • safe_reader_connect()

  • safe_reader_statement()

  • safe_writer_statement()

safe_ in each function name means that the function takes care of handling all error conditions. You can use different names for the functions. The important thing is to have a unified interface for connecting for reads, connecting for writes, doing a read, and doing a write.

You should then convert your client code to use the wrapper library. This may be a painful and scary process at first, but it pays off in the long run. All applications that use the approach just described are able to take advantage of a master/slave configuration, even one involving multiple slaves. The code is much easier to maintain, and adding troubleshooting options is trivial. You just need to modify one or two functions; for example, to log how long each statement took, or which statement among your many thousands gave you an error.
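
A wrapper of this kind might look like the following Python sketch. It is one possible shape, not a prescribed design: the class name ReplicatedDB is hypothetical, and connect is a caller-supplied function (an assumption of this sketch) that takes a host name and returns a DB-API style connection. Error handling is limited to trying another server when a reader connection fails.

```python
import random


class ReplicatedDB:
    """Route writes to the master and reads to the master or a slave."""

    def __init__(self, connect, master_host, slave_hosts=()):
        self._connect = connect
        self._master = master_host
        self._readers = [master_host, *slave_hosts]

    def safe_writer_connect(self):
        # All writes go to the master.
        return self._connect(self._master)

    def safe_reader_connect(self):
        # Try the candidate servers in random order; move on if one fails.
        last_error = None
        for host in random.sample(self._readers, len(self._readers)):
            try:
                return self._connect(host)
            except Exception as exc:
                last_error = exc
        raise ConnectionError("no reader available") from last_error

    def _run(self, conn, sql, params):
        cur = conn.cursor()
        try:
            cur.execute(sql, params)
            # Statements that return rows have a description; others do not.
            return cur.fetchall() if cur.description else cur.rowcount
        finally:
            cur.close()

    def safe_reader_statement(self, sql, params=()):
        conn = self.safe_reader_connect()
        try:
            return self._run(conn, sql, params)
        finally:
            conn.close()

    def safe_writer_statement(self, sql, params=()):
        conn = self.safe_writer_connect()
        try:
            result = self._run(conn, sql, params)
            conn.commit()
            return result
        finally:
            conn.close()
```

Because every statement passes through _run(), that single method is the natural place to add timing or error logging of the kind mentioned above.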

If you have written a lot of code, you may want to automate the conversion task by using the replace utility that comes with standard MySQL distributions, or write your own conversion script. Ideally, your code uses consistent programming style conventions. If not, then you are probably better off rewriting it anyway, or at least going through and manually regularizing it to use a consistent style.

Q: When and how much can MySQL replication improve the performance of my system?

A: MySQL replication is most beneficial for a system with frequent reads and infrequent writes. In theory, by using a single-master/multiple-slave setup, you can scale the system by adding more slaves until you either run out of network bandwidth, or your update load grows to the point that the master cannot handle it.

In order to determine how many slaves you can get before the added benefits begin to level out, and how much you can improve performance of your site, you need to know your query patterns, and to determine empirically by benchmarking the relationship between the throughput for reads (reads per second, or max_reads) and for writes (max_writes) on a typical master and a typical slave. The example here shows a rather simplified calculation of what you can get with replication for a hypothetical system.

Let's say that system load consists of 10% writes and 90% reads, and we have determined by benchmarking that max_reads is 1200 – 2 × max_writes. In other words, the system can do 1,200 reads per second with no writes, the average write is twice as slow as the average read, and the relationship is linear. Let us suppose that the master and each slave have the same capacity, and that we have one master and N slaves. Then we have for each server (master or slave):

reads = 1200 – 2 × writes

reads = 9 × writes / (N + 1) (reads are split, but writes go to all servers)

9 × writes / (N + 1) + 2 × writes = 1200

writes = 1200 / (2 + 9/(N+1))

The last equation indicates the maximum number of writes for N slaves, given a maximum possible read rate of 1,200 per second and a ratio of nine reads per write.

This analysis yields the following conclusions:

  • If N = 0 (which means we have no replication), our system can handle about 1200/11 = 109 writes per second.

  • If N = 1, we get up to 184 writes per second.

  • If N = 8, we get up to 400 writes per second.

  • If N = 17, we get up to 480 writes per second.

  • Eventually, as N approaches infinity (and our budget negative infinity), we can get very close to 600 writes per second, increasing system throughput about 5.5 times. However, with only eight slaves, we increase it nearly four times.
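
The figures in this list can be reproduced with a few lines of Python. The function name max_writes and its parameter names are chosen for this sketch; the default values are those of the hypothetical system described above.

```python
def max_writes(n_slaves, read_capacity=1200.0, write_cost=2.0,
               reads_per_write=9.0):
    """Maximum writes per second for one master and n_slaves slaves.

    Implements writes = 1200 / (2 + 9/(N+1)) with the constants exposed
    as parameters so that other workloads can be tried.
    """
    return read_capacity / (write_cost + reads_per_write / (n_slaves + 1))


for n in (0, 1, 8, 17):
    print(n, int(max_writes(n)))
```

Changing reads_per_write shows how strongly the benefit depends on the read/write ratio: the more writes in the mix, the sooner adding slaves stops helping.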

Note that these computations assume infinite network bandwidth and neglect several other factors that could turn out to be significant on your system. In many cases, you may not be able to perform a computation similar to the one just shown that accurately predicts what will happen on your system if you add N replication slaves. However, answering the following questions should help you decide if and by how much replication will improve the performance of your system:

  • What is the read/write ratio on your system?

  • How much more write load can one server handle if you reduce the reads?

  • For how many slaves do you have bandwidth available on your network?

Q: How can I use replication to provide redundancy/high availability?

A: With the currently available features, you would have to set up a master and a slave (or several slaves), and to write a script that monitors the master to check whether it is up. Then instruct your applications and the slaves to change master in case of failure. Some suggestions:

  • To tell a slave to change its master, use the CHANGE MASTER TO statement.

  • A good way to keep your applications informed as to the location of the master is by having a dynamic DNS entry for the master. With bind you can use nsupdate to dynamically update your DNS.

  • You should run your slaves with the --log-bin option and without --log-slave-updates. In this way, the slave is ready to become a master as soon as you issue STOP SLAVE and RESET MASTER on it, and CHANGE MASTER TO on the other slaves. For example, assume that you have the following setup:

           WC
            \
             v
     WC----> M
           / | \
          /  |  \
         v   v   v
        S1   S2  S3
    

    M means the master, S the slaves, WC the clients issuing database writes and reads; clients that issue only database reads are not represented, because they need not switch. S1, S2, and S3 are slaves running with --log-bin and without --log-slave-updates. Because updates received by a slave from the master are not logged in the binary log unless --log-slave-updates is specified, the binary log on each slave is empty. If for some reason M becomes unavailable, you can pick one of the slaves to become the new master. For example, if you pick S1, all WC should be redirected to S1, and S2 and S3 should then replicate from S1.

    Make sure that all slaves have processed any statements in their relay log. On each slave, issue STOP SLAVE IO_THREAD, then check the output of SHOW PROCESSLIST until you see Has read all relay log. When this is true for all slaves, they can be reconfigured to the new setup. On the slave S1 being promoted to become the master, issue STOP SLAVE and RESET MASTER.

    On the other slaves S2 and S3, use STOP SLAVE and CHANGE MASTER TO MASTER_HOST='S1' (where 'S1' represents the real hostname of S1). To CHANGE MASTER, add all information about how to connect to S1 from S2 or S3 (user, password, port). In CHANGE MASTER, there is no need to specify the name of S1's binary log or binary log position to read from: We know it is the first binary log and position 4, which are the defaults for CHANGE MASTER. Finally, use START SLAVE on S2 and S3.

    Then instruct all WC to direct their statements to S1. From that point on, all update statements sent by WC to S1 are written to the binary log of S1, which then contains every update statement sent to S1 since M died.

    The result is this configuration:

           WC
          /
          |
     WC   |  M(unavailable)
      \   |
       \  |
        v v
         S1<--S2  S3
          ^       |
          +-------+
    

    When M is up again, you must issue on it the same CHANGE MASTER as that issued on S2 and S3, so that M becomes a slave of S1 and picks up all the WC writes that it missed while it was down. To make M a master again (because it is the most powerful machine, for example), use the preceding procedure as if S1 was unavailable and M was to be the new master. During this procedure, do not forget to run RESET MASTER on M before making S1, S2, and S3 slaves of M. Otherwise, they may pick up old WC writes from before the point at which M became unavailable.
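
The promotion procedure described above can be written down as an ordered list of statements per server. The Python sketch below only builds that list; it does not connect to any server. The function name promotion_plan is hypothetical, the port default of 3306 is an assumption, and the user and password values are placed in the statement without escaping, so they must not contain quote characters.

```python
def promotion_plan(new_master, other_slaves, user, password, port=3306):
    """Return (host, statement) pairs, in order, for promoting new_master."""
    plan = []
    # Stop fetching new events so each slave can finish its relay log.
    for host in [new_master, *other_slaves]:
        plan.append((host, "STOP SLAVE IO_THREAD"))
        # Repeat until the SQL thread state reads "Has read all relay log".
        plan.append((host, "SHOW PROCESSLIST"))
    # Turn the chosen slave into a master with a fresh binary log.
    plan.append((new_master, "STOP SLAVE"))
    plan.append((new_master, "RESET MASTER"))
    # No log file or position is given: the defaults point at the start
    # of the new master's first binary log.
    change = (
        f"CHANGE MASTER TO MASTER_HOST='{new_master}', MASTER_PORT={port}, "
        f"MASTER_USER='{user}', MASTER_PASSWORD='{password}'"
    )
    for host in other_slaves:
        plan.append((host, "STOP SLAVE"))
        plan.append((host, change))
        plan.append((host, "START SLAVE"))
    return plan


for host, statement in promotion_plan("S1", ["S2", "S3"], "repl", "secret"):
    print(f"{host}: {statement}")
```

Redirecting the WC clients to the new master is outside the scope of the sketch and must happen after the plan has been carried out.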

Q: How do I tell which format I'm currently running (row-based or statement-based)?

A: By issuing this statement:

mysql> SHOW VARIABLES LIKE "%binlog_format%";

Q: How do I tell the slave to use row-based replication?

A: The slave automatically knows which format it should use.

6.12. Comparison of Statement-Based Versus Row-Based Replication

Advantages of statement-based replication are:

  • Proven technology (existed in MySQL since 3.23).

  • Smaller log files (when using updates or deletes that affect many rows, much smaller log files). Because log files are smaller, they take up less storage space and are faster to back up.

  • Log files contain all statements that made any changes, which allows them to be used to audit the database.

  • Log files can be used for point-in-time recovery, not just for replication purposes. See Section 5.9.3, “Point-in-Time Recovery”.

  • Slave may be a newer version of MySQL with a different row structure.

Disadvantages of statement-based replication are:

  • Not all UPDATE statements can be replicated: Any non-deterministic behavior, for example when using random functions in an SQL statement, is hard to replicate when using statement-based replication. When using a non-deterministic user-defined function (UDF), it is not possible to replicate the result using statement-based replication, while row-based replication will just replicate the value returned by the UDF.

  • Statements that use a UDF (user defined function) which is non-deterministic (value depends on other things than the given parameters) cannot be replicated properly.

  • Statements that use one of the following functions cannot be replicated properly:

    • LOAD_FILE()

    • UUID()

    • USER()

    • FOUND_ROWS()

    All other functions are replicated correctly (including RAND(), NOW(), LOAD DATA INFILE, and so forth).

  • INSERT … SELECT requires more row-level locks than with row-level replication.

  • UPDATE statements that require a table scan (that is, do not use indexes in the WHERE clause) have to lock more rows than with row-level replication.

  • For InnoDB: An INSERT statement that uses auto_increment will block other non-conflicting INSERT statements.

  • Slower to apply data on slave for complex queries.

  • Stored functions (not stored procedures) will execute with the same NOW() value as the calling statement. (This may be regarded both as a bad and a good thing.)

  • Deterministic UDFs (user-defined functions) must be applied on the slaves.

  • If something goes wrong on the slave, the difference between master and slave grows with time.

  • Tables have to be (almost) identical on master and slave.

Advantages of row-level replication are:

  • Everything can be replicated; safest form of replication. Note that currently, DDL (data definition language) statements such as CREATE TABLE are replicated using statement-based replication, while DML (data manipulation language) statements, as well as GRANT and REVOKE statements, are replicated using row-based replication. For statements like CREATE … SELECT, a CREATE statement is generated from the table definition and replicated statement-based, while the row insertions are replicated row-based.

  • Same technology as most other database management systems (easier to explain to database administrators used to other systems).

  • In many cases, faster to apply data on the slave on tables with primary keys.

  • Fewer locks needed (thus higher concurrency) on the master for:

    • INSERT … SELECT

    • INSERT statements with auto_increment

    • UPDATE or DELETE statements with WHERE clauses that don't use keys or don't change most of the examined rows.

  • Fewer locks on the slave for any INSERT, UPDATE, or DELETE statement.

  • It's possible to add multiple threads to apply data on the slave in the future (works better on SMP machines).

Disadvantages of row-level replication are:

  • Bigger log files (much bigger in some cases).

  • Binary log will contain data for large statements that were rolled back.

  • When using row-based replication to replicate a statement (for example, an UPDATE or DELETE statement), each changed row has to be written to the binary log. In contrast, when using statement-based replication only the statement is written to the binary log. If the statement changes a lot of rows, row-based replication may write significantly more data to the binary log. In these cases the binary log will be locked for longer times to write the data, which may cause concurrency problems.

  • Deterministic UDFs that generate big BLOBs will be notably slower to replicate.

  • One can't examine the logs to see what statements were executed.

  • One can't see on the slave what statements were received from the master and executed.

6.12.1. Troubleshooting Replication

If you have followed the instructions, and your replication setup is not working, first check the following:

  • Check the error log for messages. Many users have lost time by not doing this early enough after encountering problems.

  • Is the master logging to the binary log? Check with SHOW MASTER STATUS. If it is, Position is non-zero. If not, verify that you are running the master with the log-bin and server-id options.

  • Is the slave running? Use SHOW SLAVE STATUS to check whether the Slave_IO_Running and Slave_SQL_Running values are both Yes. If not, verify the options that were used when starting the slave server.

  • If the slave is running, did it establish a connection to the master? Use SHOW PROCESSLIST, find the I/O and SQL threads, and check the State column for each. See Section 6.4, “Replication Implementation Details”. If the I/O thread state says Connecting to master, verify the privileges for the replication user on the master, the master hostname, your DNS setup, whether the master is actually running, and whether it is reachable from the slave.

  • If the slave was running previously but has stopped, the reason usually is that some statement that succeeded on the master failed on the slave. This should never happen if you have taken a proper snapshot of the master, and never modified the data on the slave outside of the slave thread. If it does, it is a bug or you have encountered one of the known replication limitations described in Section 6.8, “Replication Features and Known Problems”. If it is a bug, see Section 6.12.2, “How to Report Replication Bugs or Problems”, for instructions on how to report it.

  • If a statement that succeeded on the master refuses to run on the slave, and it is not feasible to do a full database resynchronization (that is, to delete the slave's database and copy a new snapshot from the master), try the following:

    1. Determine whether the slave's table is different from the master's. Try to understand how this happened. Then make the slave's table identical to the master's and run START SLAVE.

    2. If the preceding step does not work or does not apply, try to understand whether it would be safe to make the update manually (if needed) and then ignore the next statement from the master.

    3. If you decide that you can skip the next statement from the master, issue the following statements:

      mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = n;
      mysql> START SLAVE;
      

      The value of n should be 1 if the next statement from the master does not use AUTO_INCREMENT or LAST_INSERT_ID(). Otherwise, the value should be 2. The reason for using a value of 2 for statements that use AUTO_INCREMENT or LAST_INSERT_ID() is that they take two events in the binary log of the master.

    4. If you are sure that the slave started out perfectly synchronized with the master, and that no one has updated the tables involved outside of the slave thread, then presumably the discrepancy is the result of a bug. If you are running the most recent version, please report the problem. If you are running an older version of MySQL, try upgrading to the latest production release.
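
Parts of the checklist above can be expressed as a small decision procedure. The following Python sketch is one possible reading of it, not a tool shipped with MySQL: diagnose() and skip_counter() are hypothetical helper names, and the dictionary is assumed to hold one row of SHOW SLAVE STATUS output keyed by column name (Slave_IO_Running, Slave_SQL_Running, Last_Error).

```python
def diagnose(status: dict) -> str:
    """Suggest the next check from one row of SHOW SLAVE STATUS output."""
    io_running = status.get("Slave_IO_Running") == "Yes"
    sql_running = status.get("Slave_SQL_Running") == "Yes"
    if io_running and sql_running:
        return "both threads running: check thread states with SHOW PROCESSLIST"
    if not io_running and not sql_running:
        return "slave not running: verify the options used to start the slave"
    if not io_running:
        return "I/O thread stopped: check the error log and the connection to the master"
    # Only the SQL thread has stopped, so a replicated statement probably failed.
    return "SQL thread stopped: " + status.get("Last_Error", "see the error log")


def skip_counter(uses_auto_increment_or_last_insert_id: bool) -> int:
    """Value of n for SET GLOBAL SQL_SLAVE_SKIP_COUNTER = n, per the rule above."""
    return 2 if uses_auto_increment_or_last_insert_id else 1


print(diagnose({"Slave_IO_Running": "Yes", "Slave_SQL_Running": "No",
                "Last_Error": "Duplicate entry"}))
print(skip_counter(True))
```

Skipping a statement remains a manual decision. The sketch only encodes the rule for choosing n once you have decided that skipping is safe.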

6.12.2. How to Report Replication Bugs or Problems

When you have determined that there is no user error involved, and replication still either does not work at all or is unstable, it is time to send us a bug report. We need to obtain as much information as possible from you to be able to track down the bug. Please spend some time and effort in preparing a good bug report.

If you have a repeatable test case that demonstrates the bug, please enter it into our bugs database at http://bugs.mysql.com/. If you have a “phantom” problem (one that you cannot duplicate at will), then use the following procedure:

  1. Verify that no user error is involved. For example, if you update the slave outside of the slave thread, the data goes out of sync, and you can have unique key violations on updates. In this case, the slave thread stops and waits for you to clean up the tables manually to bring them in sync. This is not a replication problem. It is a problem with outside interference causing replication to fail.

  2. Run the slave with the --log-slave-updates and --log-bin options. These options cause the slave to log the updates that it receives from the master into its own binary logs.

  3. Save all evidence before resetting the replication state. If we have no information or only sketchy information, it becomes difficult or impossible for us to track down the problem. The evidence you should collect is:

    • All binary logs from the master

    • All binary logs from the slave

    • The output of SHOW MASTER STATUS from the master at the time you discovered the problem

    • The output of SHOW SLAVE STATUS from the slave at the time you discovered the problem

    • Error logs from the master and the slave

  4. Use mysqlbinlog to examine the binary logs. For example, the following command should help you find the problem query:

    shell> mysqlbinlog -j pos_from_slave_status \
               /path/to/log_from_slave_status | head
    

Once you have collected the evidence for the problem, try to isolate it as a separate test case first. Then enter the problem into our bugs database at http://bugs.mysql.com/ with as much information as possible.
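
If you script the collection of evidence, the mysqlbinlog invocation from step 4 can be assembled programmatically. The following Python sketch only builds and prints the command line. The function name is hypothetical, and the position value 4 is a placeholder rather than a recommendation; substitute the position and log file path reported by SHOW SLAVE STATUS.

```python
import shlex


def mysqlbinlog_command(start_position: int, log_path: str) -> list:
    """Build the argument list for: mysqlbinlog -j <position> <log file>."""
    return ["mysqlbinlog", "-j", str(start_position), log_path]


# Placeholder values; use those reported by SHOW SLAVE STATUS.
cmd = mysqlbinlog_command(4, "/path/to/log_from_slave_status")
print(" ".join(shlex.quote(part) for part in cmd))
```

The argument list can be passed to subprocess.run() as is, with the first few lines of its output taking the place of the head filter in the shell example.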

Q: How do I tell which format I'm currently running (row-based or statement-based)?

A: By issuing this statement:

mysql> SHOW VARIABLES LIKE "%binlog_format%";

Q: How do I tell the slave to use row-based replication?

A: The slave automatically knows which format it should use.

6.12.3. Auto-Increment in Multi-Master Replication

When multiple servers are configured as replication masters, special steps must be taken to prevent key collisions when using AUTO_INCREMENT columns. Otherwise, multiple masters may attempt to use the same AUTO_INCREMENT value when inserting rows.

The two server variables auto_increment_increment and auto_increment_offset help to accommodate multi-master replication with AUTO_INCREMENT columns. Each of these variables has a default (and minimum) value of 1, and a maximum value of 65,535.

By setting non-conflicting values for these variables, servers in a multi-master configuration will not use conflicting AUTO_INCREMENT values when inserting new rows into the same table.

These two variables affect AUTO_INCREMENT column behavior as follows:

  • auto_increment_increment controls the interval by which the column value is incremented. For example:

    mysql> SHOW VARIABLES LIKE 'auto_inc%';
    +--------------------------+-------+
    | Variable_name            | Value |
    +--------------------------+-------+
    | auto_increment_increment | 1     |
    | auto_increment_offset    | 1     |
    +--------------------------+-------+
    2 rows in set (0.00 sec)
    
    mysql> CREATE TABLE autoinc1 (col INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
    Query OK, 0 rows affected (0.04 sec)
    
    mysql> SET @@auto_increment_increment=10;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SHOW VARIABLES LIKE 'auto_inc%';
    +--------------------------+-------+
    | Variable_name            | Value |
    +--------------------------+-------+
    | auto_increment_increment | 10    |
    | auto_increment_offset    | 1     |
    +--------------------------+-------+
    2 rows in set (0.01 sec)
    
    mysql> INSERT INTO autoinc1 VALUES (NULL), (NULL), (NULL), (NULL);
    Query OK, 4 rows affected (0.00 sec)
    Records: 4  Duplicates: 0  Warnings: 0
    
    mysql> SELECT col FROM autoinc1;
    +-----+
    | col |
    +-----+
    |   1 |
    |  11 |
    |  21 |
    |  31 |
    +-----+
    4 rows in set (0.00 sec)
    

    (Note how SHOW VARIABLES is used here to obtain the current values for these variables.)

  • auto_increment_offset determines the starting point for the AUTO_INCREMENT column value. Give each master a different offset that is no greater than auto_increment_increment. The increment therefore limits how many masters you can have in your replication setup (for example, an increment of 10 supports up to ten servers, using offsets 1 through 10).

    Consider the following, assuming that these commands are executed during the same session as the previous example:

    mysql> SET @@auto_increment_offset=5;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SHOW VARIABLES LIKE 'auto_inc%';
    +--------------------------+-------+
    | Variable_name            | Value |
    +--------------------------+-------+
    | auto_increment_increment | 10    |
    | auto_increment_offset    | 5     |
    +--------------------------+-------+
    2 rows in set (0.00 sec)
    
    mysql> CREATE TABLE autoinc2 (col INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
    Query OK, 0 rows affected (0.06 sec)
    
    mysql> INSERT INTO autoinc2 VALUES (NULL), (NULL), (NULL), (NULL);
    Query OK, 4 rows affected (0.00 sec)
    Records: 4  Duplicates: 0  Warnings: 0
    
    mysql> SELECT col FROM autoinc2;
    +-----+
    | col |
    +-----+
    |   5 |
    |  15 |
    |  25 |
    |  35 |
    +-----+
    4 rows in set (0.02 sec)
    

For additional information see Section 5.3.3, “Server System Variables”.
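
To see why non-conflicting settings prevent collisions, it can help to model the values that each server generates. The following Python sketch is a simplified model rather than the server's actual implementation: it assumes an empty table, a value of auto_increment_offset no greater than auto_increment_increment, and the hypothetical function names shown.

```python
from itertools import islice


def auto_increment_values(offset: int, increment: int):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    if not 1 <= offset <= increment:
        raise ValueError("this model assumes 1 <= offset <= increment")
    value = offset
    while True:
        yield value
        value += increment


def first_values(offset: int, increment: int, count: int) -> list:
    """Return the first `count` values generated for an empty table."""
    return list(islice(auto_increment_values(offset, increment), count))


# The two sessions shown earlier in this section.
print(first_values(1, 10, 4))   # [1, 11, 21, 31]
print(first_values(5, 10, 4))   # [5, 15, 25, 35]

# Two masters sharing an increment of 2, with offsets 1 and 2.
master_a = set(first_values(1, 2, 1000))
master_b = set(first_values(2, 2, 1000))
print(master_a.isdisjoint(master_b))   # True
```

In this model, two servers that share the same increment but use different offsets never generate the same value, which is the property that multi-master setups rely on.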