Sunday, November 11, 2007

PostgreSQL checkpoints

PostgreSQL checkpoints.

In documentation ( http://www.postgresql.org/docs/ ) PostgreSQL checkpoints are described as follows:
checkpoint_segments
Maximum distance between automatic WAL checkpoints, in log file segments (each segment is normally 16 megabytes). The default is three segments. This parameter can only be set in the postgresql.conf file or on the server command line.
checkpoint_timeout
Maximum time between automatic WAL checkpoints, in seconds. The default is five minutes (5min). This parameter can only be set in the postgresql.conf file or on the server command line.
checkpoint_warning
Write a message to the server log if checkpoints caused by the filling of checkpoint segment files happen closer together than this many seconds (which suggests that checkpoint_segments ought to be raised). The default is 30 seconds (30s). Zero disables the warning. This parameter can only be set in the postgresql.conf file or on the server command line.
I shall describe it is more expanded:

When the transaction log no longer has available space, the installation of a so-called control checkpoint takes place. This serves as an order for the system to remove all the unprocessed material to a disk, so the log can become available for use once again. Additionally, the control checkpoint may be installed not upon occasion, but after a certain period of time, typically standing at 5 min. When there is massive logging to the database, the transaction log can become overloaded too rapidly. This, in turn, will lead to a substantial deceleration of the aforementioned material removal to a disk.

Establishing the installation frequency of control checkpoints:

The installation of control checkpoints must take place every couple of minutes. If the installation takes place too frequently (for instance, every minute), the productivity of the system would notably decrease. To establish the current frequency, at which the system installs control checkpoints, you can use the log's analysis of timestamps. First, however, make sure that you've allowed access to the log. Check follow options in the file postgresql.conf:

log_timestamp = true

After that changes the file of a configuration will be automatically re-read, and you can observe in a logfile of server PostgreSQL following records:
2007-02-11 21:17:32 LOG: recycled transaction log file 0000000000000000
2007-02-11 21:17:33 LOG: recycled transaction log file 0000000000000001
2007-02-11 21:17:33 LOG: recycled transaction log file 0000000000000002
Estimate the period between two control points. It will allow you to define their frequency.
In the example resulted above you can see, that control points were made each 40 seconds that is too frequent operation ,
and that slows down productivity of system as a whole. By the way, be not surprised, having found out records with the same
time mark (see above). Frequently the same control point enters the name in the log several times.

Reduction of frequency of installation of Control points

Reduction of frequency of Control points leads to increase in number of files of anticipatory caching, created in data/pg_xlog. Each file has the size of 16 mbyte, that as a result can affect considerably the general free space on a disk. Installation by default means minimization of number of such files. For reduction of frequency of control points you should change following parameter:

checkpoint_segments = 3
Its initial value is equal 3.
Gradually increase this value until the interval between control
points not begins to be equaled to several minutes.
The next record in the logfile , which you can observe, can be like this:
LOG: XLogWrite: new log file created - consider increasing WAL_FILES

It means, that the parameter wal_files requiers increase in a file
postgresql.conf.