Configuring EDB Pgpool-II v4

The configuration files are created in the /etc/sysconfig/edb/pgpool<x.y> directory, where <x.y> is the EDB Pgpool-II release version. By default, .sample is appended to the configuration file name. remove .sample from the configuration file after copying the file to create your custom configuration.

Note

The configuration options for Pgpool-II are extensive. Consider the options listed here as a starting point. For more information about configuring and using Pgpool-II, consult the project website.

Commonly used parameters

The table lists pgpool.conf parameters to use when implementing connection pooling:

Parameter nameDescription
listen_addressesHostname or IP address used by EDB Pgpool-II to listen for connections. The default is localhost. Change to '*' for all addresses.
portPort for EDB Pgpool-II connections. The default is 9999.
pcp_portPort for PCP connections. The default is 9898.
backend_hostname0Hostname or IP address for backend 0. You can specify '' if the backend and EDB Pgpool-II are running on the same host.
backend_port0Port number for backend 0.
backend_weight0Weight for backend 0 (only in load balancing mode). Specify 1 for each backend if you want to balance the load equally or decimal values (.9, .1, etc.) to weigh the load toward specific backends.
backend_data_directory0Data directory for backend 0.
enable_pool_hbaSet to on to use pool_hba.conf for client authentication.
num_init_childrenNumber of pools. Default is 32.
max_poolNumber of connections per pool. Default is 4.
connection_cacheSet to on to enable connection pooling.
pool_conn_dbnameDatabase name to which EDB Pgpool-II connects. By default, EDB Pgpool-II connects with Postgres. As of v4.3, the deprecated pool_conn_dbname parameter is removed.
sr_check_userUser name to perform streaming replication check. Required as of EDB Pgpool-II v4.3.
sr_check_passwordPassword of the sr_check_user user to perform the streaming replication checks. Required as of EDB Pgpool-II v4.3.

The following table lists pgpool.conf parameters to use when implementing replication and load balancing:

Parameter nameDescription
Allow_sql_commentsIf on, ignore SQL comments. Changes to this parameter require reloading the pgpool.conf file.
load_balance_modeSet to on to activate load balancing mode. If load_balance_mode is on and replicate_select is off, SELECT statements are sent to one backend. The parameter backend_weight<N>.z determines the proportion of SELECT statements each backend receives.
ignore_leading_white_spaceIgnore leading white spaces of each query. Certain APIs such as DBI/DBD::Pg for Perl add white space that you can't control. Default is on.

Configuring connection pooling

EDB Pgpool-II provides a set of child processes that maintain cached connections to one or more database servers. When a client connects, EDB Pgpool-II attempts to reuse a connection from its pool, thus avoiding the overhead of opening and closing client connections.

You can reuse a connection in the pool only if the target database and the connection user match a prior connection that is currently in the pool. The pgpool.conf file specifies the connection pooling configuration options (such as the number of child processes and the maximum number of cached connections per child).

To configure connection pooling with one database server:

  1. Configure the pg_hba.conf file on the Pgpool-II host to permit connections between the clients and the server.

  2. Copy the pgpool.conf.sample file to pgpool.conf, modify the file, set the connection_cache parameter to on, and specify connection properties for your database server.

    The following example shows how to connect with the EDB Postgres Advanced Server:

    connection_cache = on
    
    backend_hostname0 = 'localhost'
    
    backend_port0 = 5444
    
    backend_weight0 = 1
    
    backend_data_directory0 = '/var/lib/edb/as14/data'
    
    sr_check_user = 'enterprisedb'
    
    sr_check_password = 'enterprisedb_password'
    

    The following example shows how to connect with the PostgreSQL Server:

    connection_cache = on
    
    backend_hostname0 = 'localhost'
    
    backend_port0 = 5432
    
    backend_weight0 = 1
    
    backend_data_directory0 = '/var/lib/pgsql/14/data'
    
    sr_check_user = 'enterprisedb'
    
    sr_check_password = 'enterprisedb_password'
    
    Note

    In the pgpool.conf file, connection parameters have an appended digit that specifies a cluster node identifier. Database node 0 specifies values for the primary node.

  3. Optionally, configure EDB Pgpool-II client authentication.

  4. Optionally, configure the PCP administrative interface.

  5. Start EDB Pgpool-II:

    systemctl start edb-pgpool-<x.y>.service

    <x.y> is the EDB Pgpool release version.

  6. Run the following platform-specific command to connect to Pgpool43:

    On EDB Postgres Advanced Server for CentOS 7:

    ./psql -d edb -p 9999 -U enterprisedb -h /tmp

    On EDB Postgres Advanced Server for Debian:

    ./psql -d edb -p 9999 -U enterprisedb

    On PostgreSQL Server for CentOS 7:

    ./psql -d postgres -p 9999 -U postgres -h /tmp

    On PostgreSQL Server for Debian:

    ./psql -d postgres -p 9999 -U postgres

Configuring load balancing

EDB supports replication scenarios that use EDB Pgpool-II load balancing with PostgreSQL streaming replication or Slony replication. The supported replication methods ensure that database updates made by client applications apply to multiple backend servers. For detailed information about the benefits of each replication method and configuration instructions, see the project documentation for each utility.

When load balancing is enabled, EDB Pgpool-II distributes some types of SELECT statements to backend servers, allowing multiple database servers and hosts to share the processing load of SELECT statements issued by client applications.

When configuring EDB Pgpool-II load balancing, the initial database environments in all backend servers must be identical:

  • Tables must have the same name, definition, and row content.
  • Schemas must exist in each backend application database.
  • Roles and privileges on each backend server must be configured to ensure the result set of SQL statements are identical on all servers.

If you use password authentication, assign the same password to an associated user name on each database server. Use the same user name/password pair to connect EDB Pgpool-II to each backend connection.

In a replication scenario, each backend is uniquely identified by the hostname (or IP address) and the port number on which the database server instance is listening for connections. Make sure that the pool_hba.conf and pg_hba.conf files allow a connection between that server and the host on which EDB Pgpool-II is running.

The following example shows how to implement EDB Pgpool-II load balancing with two servers (the primary and replica nodes) in a streaming replication scenario. Configuring EDB Pgpool-II load balancing for a Slony replication scenario is similar. See the Slony documentation for information about configuring Slony replication.

Configuring the primary node of the replication scenario

Open an SSH session with the primary node of the replication scenario, and modify the pg_hba.conf file (located in the /var/lib/edb/as14/data directory). Add connection information for the replication user. (In the example that follows, edbrepuser resides on a standby node with an IP address of 107.178.217.178):

host replication edbrepuser 107.178.217.178/32 md5

The connection information must specify the address of the replication scenario's standby node and your preferred authentication method.

Modify the postgresql.conf file (located in /var/lib/edb/as14/data), adding the following replication parameter and values to the end of the file:

 wal_level = replica
 max_wal_senders = 10
 checkpoint_segments = 8
 wal_keep_segments = 0

Save the configuration file, and restart the server:

To restart on RHEL/Rocky Linux/AlmaLinux 8 platforms:

systemctl restart edb-as-14

To restart on the Debian platform:

/usr/edb/as14/bin/epas_ctlcluster 14 main restart

Use the sudo su - command to assume the identity of the enterprisedb database superuser:

 sudo su - enterprisedb

Then, start a psql session, connecting to the edb database:

psql -d edb

At the psql command line, create a user with the replication attribute:

 CREATE ROLE edbrepuser WITH REPLICATION LOGIN PASSWORD 'password';

Configuring the standby node of the replication scenario

Open an SSH session with the standby server and assume the identity of the database superuser (enterprisedb):

sudo su - enterprisedb

With your choice of editor, create a .pgpass file in the home directory of the enterprisedb user. The .pgpass file holds the password of the replication user in plain-text form. If you're using a .pgpass file, make sure that only trusted users have access to it:

Add an entry that specifies connection information for the replication user:

*:5444:*:edbrepuser:password

The server enforces restrictive permissions on the .pgpass file. Use the following command to set the file permissions:

chmod 600 .pgpass

Relinquish the identity of the database superuser:

exit

Then, assume superuser privileges:

sudo su -

Use your platform-specific command to stop the database server before replacing the data directory on the standby node with the data directory of the primary node.

Then, delete the data directory on the standby node:

rm -rf /var/lib/edb/as14/data

After deleting the existing data directory, use the pg_basebackup utility to copy the data directory of the primary node to the standby:

pg_basebackup --format=p --label=standby --host=146.148.46.44 --username=edbrepuser --password --wal-method=stream -R

The call to pg_basebackup specifies the IP address of the primary node and the name of the replication user created on the primary node.

Including the -R option creates the standby.signal file and appends connection settings to postgresql.auto.conf in the output directory (or into the base archive file when using tar format) to ease setting up a standby server.

For more information about the options available with the pg_basebackup utility, see the PostgreSQL core documentation.

When prompted by pg_basebackup, provide the password associated with the replication user.

After copying the data directory, change ownership of the directory to the database superuser (enterprisedb):

chown -R enterprisedb /var/lib/edb/as14/data

Modify the postgresql.conf file (located in /var/lib/edb/as14/data), specifying the following values at the end of the file:

wal_level = replica
hot_standby = on

The data file has been copied from the primary node and contains the replication parameters specified previously.

Then, restart the server. At this point, the primary node is replicating data to the standby node.

Configuring EDB Pgpool-II load balancing

In the pgpool.conf file, modify the parameter settings to specify that load balancing is enabled:

load_balance_mode = on

Then, specify the connections settings for the primary database node in the parameter set that ends with a 0. For example:

backend_hostname0 = '146.148.46.44'
backend_port0 = 5444
backend_weight0 = 1
backend_data_directory0 = '/var/lib/edb/as14/data'

Then, specify the connections settings for each node to which queries are distributed. Increment the number that follows the parameter name for each node, and provide connection details:

backend_hostname1 = '107.178.217.178'
backend_port1 = 5444
backend_weight1 = 1
backend_data_directory1 = '/var/lib/edb/as14/data'

Use the backend_weight parameter to specify how to distribute queries distributed among the nodes. Specify a value of 1 to indicate that you want (qualified) queries to be equally distributed across the nodes of the replication scenario.

Restart EDB Pgpool-II

systemctl restart edb-pgpool-<x.y>.service

<x.y> is the EDB Pgpool release version.

Configuring client authentication

When EDB Pgpool-II is enabled, client applications connect to EDB Pgpool-II, which acts as a middleman for a Postgres server. A connecting client application is first authenticated with the EDB Pgpool-II server and then with the Postgres server.

Parameter settings in the pool_hba.conf configuration file determine the EDB Pgpool-II authentication properties. The pool_hba.conf file is similar in format and function to the Postgres pg_hba.conf configuration file. See the Pgpool-II documentation for detailed information about pool_hba.conf entries.

To enable EDB Pgpool-II authentication:

  1. Copy the pool_hba.conf.sample file to pool_hba.conf.
  2. Modify the pool_hba.conf file, specifying authentication information for servers or users you want to connect. Entries must follow the same format used in the pg_hba.conf file.
  3. Modify the pgpool.conf file, setting the enable_pool_hba parameter to on.
  4. Restart EDB Pgpool-II to reload the EDB Pgpool-II configuration files.
Note

When authenticating with the database server, use the user names and passwords specified in the pool_hba.conf file. You must also specify these user names and passwords in the database server's pg_hba.conf file.

Configuring PCP

PCP is an administrative interface for EDB Pgpool-II that allows you to retrieve information about database nodes, EDB Pgpool-II child processes, and other information. Issue PCP commands from the Linux command line.

pcp.conf is the password configuration file for the PCP client. Before using PCP commands, modify the pcp.conf file, providing the user names and passwords you provide when invoking a PCP command. The user names in the pcp.conf file are independent of the database server user names and passwords.

Use the following steps to configure PCP:

  1. Copy the pcp.conf.sample file to pcp.conf.

  2. Add an entry to the pcp.conf file in the following form:

    username:md5_password

    username is a PCP user name.

    md5_password is the PCP password in md5 format.

    You can use the pg_md5 program to generate the encrypted password from the clear-text form:

    $ pg_md5 mypassword
    
    34819d7beeabb9260a5c854bc85b3e44

    For example, the entry in the pcp.conf file for a PCP user named pcpuser with the password of mypassword is:

    # USERID:MD5PASSWD
    pcpuser:34819d7beeabb9260a5c854bc85b3e44
  3. Restart the EDB Pgpool service.

  4. When issuing a PCP command, specify the PCP user name and the unencrypted form of the password:

    $ pcp_node_info 5 localhost 9898 pcpuser mypassword 0
    localhost 5444 1 1.000000

After configuring PCP, you can use the following PCP commands to control EDB Pgpool-II and retrieve information.

PCP commandDescription
pcp_common_optionsCommon options used in PCP commands
pcp_node_countDisplays the total number of database nodes
pcp_node_infoDisplays the information on the given node ID
pcp_health_check_statsDisplays health check statistics data on given node ID
pcp_watchdog_infoDisplays the watchdog status of the EDB Pgpool-II
pcp_proc_countDisplays the list of EDB Pgpool-II children process IDs
pcp_proc_infoDisplays the information on the given EDB Pgpool-II child process ID
pcp_pool_statusDisplays the parameter values as defined in pgpool.conf
pcp_detach_nodeDetaches the given node from EDB Pgpool-II, forcing existing connections to EDB Pgpool-II to be disconnected
pcp_attach_nodeAttaches the given node to EDB Pgpool-II
pcp_promote_nodePromotes the given node as new main to EDB Pgpool-II
pcp_stop_pgpoolTerminates the EDB Pgpool-II process
pcp_reload_configReloads EDB Pgpool-II config file
pcp_recovery_nodeAttaches the given backend node with recovery
Note

pcp_health_check_stats and pcp_reload_config commands are available from EDB Pgpool version 4.2 onwards.

To view more information about PCP command options, see the Pgpool project site.

Configuring number of connections and pooling

EDB Pgpool-II has some configuration to tune the pooling and connection processing. Depending on this configuration, you must also set the Postgres configuration for max_connections to ensure all connections can be accepted as required. Furthermore, the cloud architecture works with active/active instances, which needs to spread num_init_children over all EDB Pgpool instances (divide the normally used value by the number of active instances).

max_pool: Generally, advised to set max_pool to 1. Alternatively, for applications with many reconnects, you can set max_pool to the number of distinct combinations of users, databases, and connection options for the application connections. All but one connection in the pool are stale connections, which consume a connection slot from Postgres without adding to the performance. We therefore recommend that you don't configure max_pool beyond 4 to preserve a healthy ratio between active and stale connections. As an example, for an application that constantly reconnects and uses two distinct users, both connecting to their own database, set it to 2. If both users can connect to both databases, set it to 4. Increasing max_pool requires that you tune down num_init_children in EDB Pgpool or tune up max_connections in Postgres.

num_init_children: We recommend setting num_init_children to the number of connections that could be running active in parallel, but the divide value by the number of active EDB Pgpool-II instances (one with the on-premise architecture and all instances for the cloud architecture). As an example: in an architecture with three EDB Pgpool instances, to allow the application to have 100 active connections in parallel, set num_init_children to 100 for the on-premises architecture, and set num_init_children to 33 for the cloud architecture. Increasing num_init_children generally requires that you tune up max_connections in Postgres.

listen_backlog_multiplier: Can be set to multiply the number of open connections (as perceived by the application) with the number of active connections (num_init_children). As an example, when the application might open 500 connections, of which 100 should be active in parallel, with the on-premises architecture, set num_init_children to 100 and listen_backlog_multiplier to 4. This setup can process 100 connections active in parallel, and another 400 (listen_backlog_multiplier x num_init_children) connections are queued before connections are blocked. The application perceives a total of 500 open connections, and Postgres processes the load of 100 connections maximum at all times. Increasing listen_backlog_multiplier causes the application to perceive more connections but doesn't increase the number of parallel active connections (which is determined by num_init_children).

max_connections: We recommend setting max_connections in Postgres higher than [number of active pgpool instances] x [max_pool] x [num_init_children] + [superuser_reserved_connections] (Postgres). As an example: in the on-premises setup with three instances active/passive, max_pool set to 2, num_init_children set to 100, and superuser_reserved_connections (Postgres) set to 5, set Postgres max_connections equal to or higher than [1 x 2 x 100+5], which is 205 connections, or higher. A similar setup in the cloud setup runs with three active instances, max_pool set to 2, num_init_children set to 33, and superuser_reserved_connections (Postgres) set to 5. In this case set Postgres max_connections equal or higher than [3x 2 x 33+5] which is 203 or higher. Configuring below the advised setting can cause issues opening new connections and, in combination with max_pool, can cause unexpected behavior (low or no active connections but still connection issues due to stale pooled connections using connection slots from Postgres.

EDB Pgpool-II host setup

After modifying the parameter settings that implement EDB Pgpool-II functionality for your installation, start the EDB Pgpool-II service.

When EDB Pgpool-II starts, it records its process ID in a pgpool.conf file whose name is determined by the pid_file_name configuration parameter. The initial value of the pid_file_name parameter in the sample file is:

pid_file_name = /var/run/edb/pgpool<x.y>/edb-pgpool-<x.y>.pid

<x.y> is the EDB Pgpool release version.

Note

The operating system might remove the contents of the /var/run directory (including the pgpool directory) during a reboot. Don't use the /var/run/edb/pgpool directory as the location for the pgpool.pid file. Modify the pid_file_name parameter to specify a safer directory location.