Configuring EDB Pgpool-II v4
The configuration files are created in the /etc/sysconfig/edb/pgpool<x.y>
directory, where <x.y> is the EDB Pgpool-II release version. By default, .sample
is appended to the configuration file name. remove .sample
from the configuration file after copying the file to create your custom configuration.
Note
The configuration options for Pgpool-II are extensive. Consider the options listed here as a starting point. For more information about configuring and using Pgpool-II, consult the project website.
Commonly used parameters
The table lists pgpool.conf
parameters to use when implementing connection pooling:
Parameter name | Description |
---|---|
listen_addresses | Hostname or IP address used by EDB Pgpool-II to listen for connections. The default is localhost . Change to '*' for all addresses. |
port | Port for EDB Pgpool-II connections. The default is 9999 . |
pcp_port | Port for PCP connections. The default is 9898 . |
backend_hostname0 | Hostname or IP address for backend 0. You can specify '' if the backend and EDB Pgpool-II are running on the same host. |
backend_port0 | Port number for backend 0. |
backend_weight0 | Weight for backend 0 (only in load balancing mode). Specify 1 for each backend if you want to balance the load equally or decimal values (.9 , .1 , etc.) to weigh the load toward specific backends. |
backend_data_directory0 | Data directory for backend 0. |
enable_pool_hba | Set to on to use pool_hba.conf for client authentication. |
num_init_children | Number of pools. Default is 32 . |
max_pool | Number of connections per pool. Default is 4 . |
connection_cache | Set to on to enable connection pooling. |
pool_conn_dbname | Database name to which EDB Pgpool-II connects. By default, EDB Pgpool-II connects with Postgres. As of v4.3, the deprecated pool_conn_dbname parameter is removed. |
sr_check_user | User name to perform streaming replication check. Required as of EDB Pgpool-II v4.3. |
sr_check_password | Password of the sr_check_user user to perform the streaming replication checks. Required as of EDB Pgpool-II v4.3. |
The following table lists pgpool.conf
parameters to use when implementing replication and load balancing:
Parameter name | Description |
---|---|
Allow_sql_comments | If on, ignore SQL comments. Changes to this parameter require reloading the pgpool.conf file. |
load_balance_mode | Set to on to activate load balancing mode. If load_balance_mode is on and replicate_select is off, SELECT statements are sent to one backend. The parameter backend_weight<N>.z determines the proportion of SELECT statements each backend receives. |
ignore_leading_white_space | Ignore leading white spaces of each query. Certain APIs such as DBI/DBD::Pg for Perl add white space that you can't control. Default is on . |
Configuring connection pooling
EDB Pgpool-II provides a set of child processes that maintain cached connections to one or more database servers. When a client connects, EDB Pgpool-II attempts to reuse a connection from its pool, thus avoiding the overhead of opening and closing client connections.
You can reuse a connection in the pool only if the target database and the connection user match a prior connection that is currently in the pool. The pgpool.conf
file specifies the connection pooling configuration options (such as the number of child processes and the maximum number of cached connections per child).
To configure connection pooling with one database server:
Configure the
pg_hba.conf
file on thePgpool-II
host to permit connections between the clients and the server.Copy the
pgpool.conf.sample
file topgpool.conf
, modify the file, set theconnection_cache
parameter toon
, and specify connection properties for your database server.The following example shows how to connect with the EDB Postgres Advanced Server:
The following example shows how to connect with the PostgreSQL Server:
Note
In the
pgpool.conf
file, connection parameters have an appended digit that specifies a cluster node identifier. Database node0
specifies values for the primary node.Optionally, configure EDB Pgpool-II client authentication.
Optionally, configure the PCP administrative interface.
Start EDB Pgpool-II:
<x.y> is the EDB Pgpool release version.
Run the following platform-specific command to connect to Pgpool43:
On EDB Postgres Advanced Server for CentOS 7:
On EDB Postgres Advanced Server for Debian:
On PostgreSQL Server for CentOS 7:
On PostgreSQL Server for Debian:
Configuring load balancing
EDB supports replication scenarios that use EDB Pgpool-II load balancing with PostgreSQL streaming replication or Slony replication. The supported replication methods ensure that database updates made by client applications apply to multiple backend servers. For detailed information about the benefits of each replication method and configuration instructions, see the project documentation for each utility.
When load balancing is enabled, EDB Pgpool-II distributes some types of SELECT
statements to backend servers, allowing multiple database servers and hosts to share the processing load of SELECT
statements issued by client applications.
When configuring EDB Pgpool-II load balancing, the initial database environments in all backend servers must be identical:
- Tables must have the same name, definition, and row content.
- Schemas must exist in each backend application database.
- Roles and privileges on each backend server must be configured to ensure the result set of SQL statements are identical on all servers.
If you use password
authentication, assign the same password to an associated user name on each database server. Use the same user name/password pair to connect EDB Pgpool-II to each backend connection.
In a replication scenario, each backend is uniquely identified by the hostname (or IP address) and the port number on which the database server instance is listening for connections. Make sure that the pool_hba.conf
and pg_hba.conf
files allow a connection between that server and the host on which EDB Pgpool-II is running.
The following example shows how to implement EDB Pgpool-II load balancing with two servers (the primary and replica nodes) in a streaming replication scenario. Configuring EDB Pgpool-II load balancing for a Slony replication scenario is similar. See the Slony documentation for information about configuring Slony replication.
Configuring the primary node of the replication scenario
Open an SSH session with the primary node of the replication scenario, and modify the pg_hba.conf
file (located in the /var/lib/edb/as14/data
directory). Add connection information for the replication user. (In the example that follows, edbrepuser resides on a standby node with an IP address of 107.178.217.178
):
The connection information must specify the address of the replication scenario's standby node and your preferred authentication method.
Modify the postgresql.conf
file (located in /var/lib/edb/as14/data
), adding the following replication parameter and values to the end of the file:
Save the configuration file, and restart the server:
To restart on RHEL/Rocky Linux/AlmaLinux 8 platforms:
To restart on the Debian platform:
Use the sudo su -
command to assume the identity of the enterprisedb database superuser:
Then, start a psql session, connecting to the edb database:
At the psql command line, create a user with the replication
attribute:
Configuring the standby node of the replication scenario
Open an SSH session with the standby server and assume the identity of the database superuser (enterprisedb):
With your choice of editor, create a .pgpass
file in the home directory of the enterprisedb user. The .pgpass
file holds the password of the replication user in plain-text form. If you're using a .pgpass
file, make sure that only trusted users have access to it:
Add an entry that specifies connection information for the replication user:
The server enforces restrictive permissions on the .pgpass
file. Use the following command to set the file permissions:
Relinquish the identity of the database superuser:
Then, assume superuser privileges:
Use your platform-specific command to stop the database server before replacing the data directory on the standby node with the data
directory of the primary node.
Then, delete the data
directory on the standby node:
After deleting the existing data
directory, use the pg_basebackup
utility to copy the data
directory of the primary node to the standby:
The call to pg_basebackup
specifies the IP address of the primary node and the name of the replication user created on the primary node.
Including the -R
option creates the standby.signal
file and appends connection settings to postgresql.auto.conf
in the output directory (or into the base archive file when using tar format) to ease setting up a standby server.
For more information about the options available with the pg_basebackup
utility, see the PostgreSQL core documentation.
When prompted by pg_basebackup
, provide the password associated with the replication user.
After copying the data
directory, change ownership of the directory to the database superuser (enterprisedb):
Modify the postgresql.conf
file (located in /var/lib/edb/as14/data
), specifying the following values at the end of the file:
The data
file has been copied from the primary node and contains the replication parameters specified previously.
Then, restart the server. At this point, the primary node is replicating data to the standby node.
Configuring EDB Pgpool-II load balancing
In the pgpool.conf
file, modify the parameter settings to specify that load balancing is enabled:
Then, specify the connections settings for the primary database node in the parameter set that ends with a 0
. For example:
Then, specify the connections settings for each node to which queries are distributed. Increment the number that follows the parameter name for each node, and provide connection details:
Use the backend_weight
parameter to specify how to distribute queries distributed among the nodes. Specify a value of 1
to indicate that you want (qualified) queries to be equally distributed across the nodes of the replication scenario.
Restart EDB Pgpool-II
<x.y> is the EDB Pgpool release version.
Configuring client authentication
When EDB Pgpool-II is enabled, client applications connect to EDB Pgpool-II, which acts as a middleman for a Postgres server. A connecting client application is first authenticated with the EDB Pgpool-II server and then with the Postgres server.
Parameter settings in the pool_hba.conf
configuration file determine the EDB Pgpool-II authentication properties. The pool_hba.conf
file is similar in format and function to the Postgres pg_hba.conf
configuration file. See the Pgpool-II documentation for detailed information about pool_hba.conf
entries.
To enable EDB Pgpool-II authentication:
- Copy the
pool_hba.conf.sample
file topool_hba.conf
. - Modify the
pool_hba.conf
file, specifying authentication information for servers or users you want to connect. Entries must follow the same format used in thepg_hba.conf
file. - Modify the
pgpool.conf
file, setting theenable_pool_hba
parameter toon
. - Restart EDB Pgpool-II to reload the EDB Pgpool-II configuration files.
Note
When authenticating with the database server, use the user names and passwords specified in the pool_hba.conf
file. You must also specify these user names and passwords in the database server's pg_hba.conf
file.
Configuring PCP
PCP is an administrative interface for EDB Pgpool-II that allows you to retrieve information about database nodes, EDB Pgpool-II child processes, and other information. Issue PCP commands from the Linux command line.
pcp.conf
is the password configuration file for the PCP client. Before using PCP commands, modify the pcp.conf
file, providing the user names and passwords you provide when invoking a PCP command. The user names in the pcp.conf
file are independent of the database server user names and passwords.
Use the following steps to configure PCP:
Copy the
pcp.conf.sample
file topcp.conf
.Add an entry to the
pcp.conf
file in the following form:username
is a PCP user name.md5_password
is the PCP password inmd5
format.You can use the
pg_md5
program to generate the encrypted password from the clear-text form:For example, the entry in the
pcp.conf
file for a PCP user named pcpuser with the password ofmypassword
is:Restart the EDB Pgpool service.
When issuing a PCP command, specify the PCP user name and the unencrypted form of the password:
After configuring PCP, you can use the following PCP commands to control EDB Pgpool-II and retrieve information.
PCP command | Description |
---|---|
pcp_common_options | Common options used in PCP commands |
pcp_node_count | Displays the total number of database nodes |
pcp_node_info | Displays the information on the given node ID |
pcp_health_check_stats | Displays health check statistics data on given node ID |
pcp_watchdog_info | Displays the watchdog status of the EDB Pgpool-II |
pcp_proc_count | Displays the list of EDB Pgpool-II children process IDs |
pcp_proc_info | Displays the information on the given EDB Pgpool-II child process ID |
pcp_pool_status | Displays the parameter values as defined in pgpool.conf |
pcp_detach_node | Detaches the given node from EDB Pgpool-II, forcing existing connections to EDB Pgpool-II to be disconnected |
pcp_attach_node | Attaches the given node to EDB Pgpool-II |
pcp_promote_node | Promotes the given node as new main to EDB Pgpool-II |
pcp_stop_pgpool | Terminates the EDB Pgpool-II process |
pcp_reload_config | Reloads EDB Pgpool-II config file |
pcp_recovery_node | Attaches the given backend node with recovery |
Note
pcp_health_check_stats
and pcp_reload_config
commands are available from EDB Pgpool version 4.2 onwards.
To view more information about PCP command options, see the Pgpool project site.
Configuring number of connections and pooling
EDB Pgpool-II has some configuration to tune the pooling and connection processing. Depending on this
configuration, you must also set the Postgres configuration for max_connections
to ensure
all connections can be accepted as required. Furthermore, the cloud architecture works
with active/active instances, which needs to spread num_init_children
over all EDB Pgpool
instances (divide the normally used value by the number of active instances).
max_pool: Generally, advised to set max_pool
to 1
. Alternatively, for applications with many reconnects, you can set max_pool
to the number of distinct combinations of users, databases,
and connection options for the application connections. All but one connection in the pool are stale connections, which consume a connection slot from Postgres without adding to the performance.
We therefore recommend that you don't configure max_pool
beyond 4 to preserve a healthy ratio
between active and stale connections. As an example, for an application that constantly reconnects
and uses two distinct users, both connecting to their own database, set it to 2
. If both users
can connect to both databases, set it to 4
. Increasing max_pool
requires
that you tune down num_init_children
in EDB Pgpool or tune up max_connections
in Postgres.
num_init_children: We recommend setting num_init_children
to the number of connections
that could be running active in parallel, but the divide value by the number of active
EDB Pgpool-II instances (one with the on-premise architecture and all instances for the cloud architecture). As an example: in an architecture with three EDB Pgpool instances, to allow the application to
have 100 active connections in parallel, set num_init_children
to 100 for the on-premises architecture, and set num_init_children
to 33 for the cloud architecture. Increasing
num_init_children
generally requires that you tune up max_connections
in Postgres.
listen_backlog_multiplier: Can be set to multiply the number of open connections (as perceived by the application) with the number of active connections (num_init_children
).
As an example, when the application might open 500 connections, of which 100 should be
active in parallel, with the on-premises architecture, set num_init_children
to
100
and listen_backlog_multiplier
to 4
. This setup can process
100 connections active in parallel, and another 400 (listen_backlog_multiplier
x
num_init_children
) connections are queued before connections are blocked. The
application perceives a total of 500 open connections, and Postgres processes the load of
100 connections maximum at all times. Increasing listen_backlog_multiplier
causes the application to perceive more connections but doesn't increase the number of
parallel active connections (which is determined by num_init_children
).
max_connections: We recommend setting max_connections
in Postgres higher
than [number of active pgpool instances] x [max_pool
] x [num_init_children
] +
[superuser_reserved_connections
] (Postgres). As an example: in the on-premises setup
with three instances active/passive, max_pool
set to 2
, num_init_children
set to 100
, and
superuser_reserved_connections
(Postgres) set to 5
, set Postgres max_connections
equal to or higher than [1 x 2 x 100+5], which is 205 connections, or higher. A similar setup in the cloud
setup runs with three active instances, max_pool
set to 2
, num_init_children
set to 33
, and superuser_reserved_connections
(Postgres) set to 5
. In this case
set Postgres max_connections
equal or higher than [3x 2 x 33+5] which is 203 or
higher. Configuring below the advised setting can cause issues opening new connections
and, in combination with max_pool
, can cause unexpected behavior (low or no active connections but still connection issues due to stale pooled connections using connection slots from
Postgres.
EDB Pgpool-II host setup
After modifying the parameter settings that implement EDB Pgpool-II functionality for your installation, start the EDB Pgpool-II service.
When EDB Pgpool-II starts, it records its process ID in a pgpool.conf
file whose name is determined by the pid_file_name
configuration parameter. The initial value of the pid_file_name
parameter in the sample file is:
<x.y> is the EDB Pgpool release version.
Note
The operating system might remove the contents of the /var/run
directory (including the pgpool
directory) during a reboot. Don't use the /var/run/edb/pgpool
directory as the location for the pgpool.pid
file. Modify the pid_file_name
parameter to specify a safer directory location.