Connect_timeout=1s is too short

Main issue: PMM is experiencing frequent connection drops to certain monitored databases. The drops occur because the connect timeout is too short for the current setup to complete a connection.

The PostgreSQL logs confirm the issue: PMM is disconnecting too early due to connect_timeout=1.

What the PostgreSQL logs show

  1. SSL negotiation failure:

    • PMM opens a connection and requests SSL
    • PMM disconnects before PostgreSQL can send the SSL response
    • Error: “failed to send SSL negotiation response: Broken pipe”
  2. Data transfer interruption:

    • PMM connects and sends a query
    • PostgreSQL starts sending results
    • PMM disconnects during the transfer
    • Error: “could not send data to client: Broken pipe”

Why this happens

With connect_timeout=1 (1 second):

  • Network latency over Azure Private Link: ~50–200ms
  • SSL handshake: ~100–300ms
  • Total: can exceed 1 second
  • Result: PMM disconnects before the SSL handshake completes
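The bullets above can be sanity-checked with back-of-envelope arithmetic. This sketch assumes worst-case figures from the latency ranges listed (the per-phase round-trip counts are typical, not measured from this setup):

```python
# Rough worst-case budget for a new PostgreSQL connection over
# Azure Private Link, using the upper end of the latency range above.
RTT = 0.200  # seconds per round trip (assumed worst case)

tcp_handshake = 1 * RTT  # SYN / SYN-ACK / ACK
ssl_request   = 1 * RTT  # 8-byte SSLRequest -> single-byte 'S'/'N' response
tls_handshake = 2 * RTT  # full TLS handshake takes roughly 2 round trips
auth_exchange = 1 * RTT  # startup packet plus one auth round trip

total = tcp_handshake + ssl_request + tls_handshake + auth_exchange
print(f"worst-case setup time: {total:.1f}s")  # consumes the entire 1s budget
```

At 200ms RTT the connection setup alone uses the whole 1-second budget before the first query is even sent, so any jitter pushes it over.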

Evidence chain

| Source | Evidence |
| --- | --- |
| PostgreSQL logs | SSL negotiation failures, broken pipe errors |
| PMM logs | connect_timeout=1 in connection strings, i/o timeout errors |
| Azure metrics | Byte dips (connections dropping/reconnecting) |
| Pattern | PMM disconnects during SSL handshake and data transfer |

Logs

From pmm-agent.log

time="2026-03-05T15:51:57.348+00:00" level=error msg="ts=2026-03-05T15:51:57.348Z caller=datasource.go:107 level=error msg=\"Error opening connection to database\" dsn=\"postgres://***:PASSWORD_REMOVED@10.121.128.53:5432/postgres?connect_timeout=1&sslmode=disable\" err=\"read tcp 10.121.128.49:38450->10.121.128.53:5432: i/o timeout\"" agentID=689c475e-3416-4978-83b1-1b317bd37e76 component=agent-process type=postgres_exporter

From Target DB:

1- "failed to send SSL negotiation response: Broken pipe"

2- user=reader,db=postgres,app=[unknown],client=10.102.89.130 LOG: could not send data to client: Broken pipe

Is there any way to increase the connect_timeout value beyond 1s?

Hi @zoyaaboujaish,

Your diagnosis is correct. PMM hardcodes connect_timeout=1 in the PostgreSQL connection string, and there is currently no user-facing option to change it. With Azure Private Link latency (50-200ms) plus SSL handshake overhead (100-300ms), you can easily exceed the 1-second budget, causing the broken pipe and i/o timeout errors you’re seeing.

I reproduced the failure in a lab by injecting network latency with tc netem on a PMM 3 + PostgreSQL 17 setup. At 500ms latency, pmm-admin add itself fails with i/o timeout and the service is never registered. Lower latencies (50ms, 200ms) work, but 200ms is already borderline, especially with SSL negotiation on top.
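For anyone who wants to see the failure mode without root access or tc netem, here is a minimal sketch that simulates it locally: a fake "PostgreSQL" server that takes 1.5s to answer the SSLRequest, against a client using a 1-second timeout like PMM's connect_timeout=1. The server and port are made up for the demo; only the 8-byte SSLRequest packet (length 8, code 80877103) is real protocol.

```python
import socket
import struct
import threading
import time

def slow_postgres(server: socket.socket) -> None:
    # Accept one connection, read the SSLRequest, then answer too late.
    conn, _ = server.accept()
    conn.recv(8)               # the 8-byte SSLRequest packet
    time.sleep(1.5)            # simulate a high-latency path
    try:
        conn.sendall(b"S")     # client has already given up
    except OSError:
        pass                   # "Broken pipe", as in the server-side logs
    conn.close()

server = socket.socket()
server.bind(("127.0.0.1", 0))  # ephemeral local port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=slow_postgres, args=(server,), daemon=True).start()

client = socket.create_connection(("127.0.0.1", port), timeout=1.0)
client.sendall(struct.pack("!II", 8, 80877103))  # PostgreSQL SSLRequest
try:
    client.recv(1)             # blocks, then times out after 1s like PMM
    print("got SSL response")
except socket.timeout:
    print("i/o timeout")       # matches the pmm-agent error
client.close()
```

The client side prints the same class of error PMM logs (i/o timeout on read), while the server side hits a broken pipe when it finally replies.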

The good news is that the PMM team is already tracking this as PMM-12832, which describes the exact same problem (cross-region PostgreSQL monitoring with connect_timeout=1). The ticket is assigned and has recent activity (February 2026). I’d recommend adding yourself as a watcher on that ticket to increase its visibility and get notified when the fix ships.

In the meantime, the only mitigation is to run the PMM Client as close as possible to the monitored database (same region/VPC), which reduces the network round-trip time below the 1-second threshold. This was also the recommendation from the PMM team in the ticket comments.

References: