Memory leak in nginx when upstream servers DNS resolution fails repeatedly (dynamic upstream resolution)

Please use this template for troubleshooting questions.

My issue: Memory leak in nginx when upstream servers DNS resolution fails repeatedly (dynamic upstream resolution)

How I encountered the problem:

nginx worker processes exhibit steady, unbounded RSS memory growth when upstream servers are configured using hostnames and nginx attempts dynamic DNS resolution with resolve;.

This leak only occurs when DNS lookups fail (e.g., NXDOMAIN, SERVFAIL, unreachable DNS), and continues even when no traffic is processed by nginx.

When upstreams are changed to static IPs, memory growth stops completely.
When upstream DNS resolution succeeds, memory remains stable.
This clearly indicates a memory leak in the DNS resolution failure path inside nginx’s dynamic upstream resolver logic.
The issue has been reproduced across multiple platforms and environments, including:

  • GCP (GKE)
  • AWS (EKS)
  • Rocky Linux with standalone nginx 1.25.3
  • Both embedded and standalone deployments
  • Multiple upstream counts (the leak rate increases with the number of upstreams)
This issue can eventually lead to OOMKilled worker processes or container restarts.

Expected Behavior:

When upstream DNS resolution fails, nginx should:

  • Retry DNS lookups according to TTL or resolver valid=… settings
  • Clean up any memory allocations associated with failed DNS queries
  • Maintain stable RSS memory usage
  • Not accumulate memory per resolution attempt
No memory leak should occur regardless of DNS success or failure.

Steps to Reproduce the Bug:

  1. Configure nginx upstreams using unresolved DNS hostnames
    Example: See the attached nginx.conf.txt

The config has multiple (around 100) upstreams which are not reachable.

Here nginx is compiled with --with-stream_ssl_module --add-module=/usr/local/src/nginx-upstream-dynamic-servers

sudo /usr/local/nginx/sbin/nginx -V
nginx version: nginx/1.25.3
built by gcc 8.5.0 20210514 (Red Hat 8.5.0-28) (GCC)
built with OpenSSL 1.1.1k  FIPS 25 Mar 2021
TLS SNI support enabled
configure arguments: --prefix=/usr/local/nginx --sbin-path=/usr/local/nginx/sbin/nginx --conf-path=/usr/local/nginx/conf/nginx.conf --pid-path=/usr/local/nginx/logs/nginx.pid --with-http_ssl_module --with-http_v2_module --with-http_gzip_static_module --with-stream --with-stream_ssl_module --add-module=/usr/local/src/nginx-upstream-dynamic-servers
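
For context, each upstream in the attached config presumably follows a shape like this (a minimal sketch with placeholder names; the `resolve` parameter here is provided by the nginx-upstream-dynamic-servers module in this build, and the hostnames deliberately fail to resolve):

```nginx
resolver 127.0.0.53 valid=10s;   # placeholder resolver address

upstream backend_1 {
    # `resolve` comes from the third-party
    # nginx-upstream-dynamic-servers module, not from stock nginx 1.25.3.
    server backend-1.internal.example resolve;   # placeholder hostname, NXDOMAIN in the repro
}
```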

  2. Start nginx with no incoming traffic

  3. Observe nginx worker RSS over time using:

[anand@rocky8-anand1 nginx]$ uname -a
Linux rocky8-anand1.pdlab.local 4.18.0-553.87.1.el8_10.x86_64 #1 SMP Wed Dec 3 12:45:57 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux  

[anand@rocky8-anand1 ~]$ ps aux | grep ngi
gdm         7930  0.0  0.0 439004   544 tty1     Sl   Nov20   0:00 /usr/libexec/ibus-engine-simple                                                                     
root     1579875  0.0  0.2 348312  8388 pts/11   S+   07:34   0:00 sudo /usr/local/nginx/sbin/nginx                                                                    
root     1579877  0.0  0.3  60012 14656 pts/11   S+   07:34   0:00 nginx: master process /usr/local/nginx/sbin/nginx
nobody   1579878  0.1  0.3  71388 13988 pts/11   S+   07:34   0:00 nginx: worker process  <-- worker RSS ~14 MB at 07:34
anand    1579899  0.0  0.0 222016  1112 pts/13   S+   07:34   0:00 grep --color=auto ngi


[anand@rocky8-anand1 ~]$ ps aux | grep ngi
gdm         7930  0.0  0.0 439004   544 tty1     Sl   Nov20   0:00 /usr/libexec/ibus-engine-simple
root     1579875  0.0  0.2 348312  8360 pts/11   S+   07:34   0:00 sudo /usr/local/nginx/sbin/nginx
root     1579877  0.0  0.3  60012 14656 pts/11   S+   07:34   0:00 nginx: master process /usr/local/nginx/sbin/nginx
nobody   1579878  0.2  2.6 156348 98916 pts/11   S+   07:34   0:25 nginx: worker process  <-- worker RSS ~99 MB at 10:26
anand    1581869  0.0  0.0 222016  1104 pts/13   S+   10:26   0:00 grep --color=auto ngi
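
For a cleaner trend line than repeated `ps` snapshots, the worker's RSS can be sampled from /proc (a small sketch; WORKER_PID is a stand-in here — substitute the actual nginx worker PID):

```shell
#!/bin/sh
# Log a process's RSS (in kB) from /proc/<pid>/status at a fixed interval.
rss_kb() { awk '/^VmRSS:/ {print $2}' "/proc/$1/status"; }

WORKER_PID=$$   # stand-in PID so the sketch runs as-is; use the nginx worker PID instead
for i in 1 2 3; do
    printf '%s rss_kb=%s\n' "$(date +%H:%M:%S)" "$(rss_kb "$WORKER_PID")"
    sleep 1     # use 60 or more for real monitoring
done
```

Redirecting this loop's output to a file gives timestamped RSS samples that make the growth rate easy to plot.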


Solutions I’ve tried:

Using static IPs instead of FQDNs resolved via DNS (this stops the leak, but gives up dynamic resolution)

Version of NGINX or NGINX adjacent software (e.g. NGINX Gateway Fabric): 1.25.3 and later

Deployment environment:

  • Tried on 1.25.3, but it should be reproducible on later versions as well
  • Target deployment platform: AWS/GCP/local
  • Target OS: RHEL 9
  • Version of this project or specific commit: 1.25.3
  • Version of any relevant project languages: Kubernetes

Minimal NGINX config to reproduce your issue

nginx.conf.txt (144.1 KB)

NGINX access/error log: (Tip → You can usually find the logs in the /var/log/nginx directory.)

Additional Context:

The leak only appears when DNS resolution fails.
When DNS resolution succeeds, memory is stable.
When upstreams use static IPs, memory is stable.
NGINX logs show repeated messages like:
upstream-dynamic-servers: ‘’ could not be resolved (3: Host not found)

Memory leak still occurs even when no requests are proxied, meaning this is triggered by nginx’s background DNS resolution logic—not request handling.

The leak rate increases with the number of upstreams, suggesting per-upstream state is never freed on DNS failures.

This affects production systems only if:

  • Customers misconfigure upstreams, OR
  • DNS temporarily fails, OR
  • A large number of upstream hostnames are configured
Given that this can produce OOM events, this appears to be a critical bug in nginx’s resolver implementation.

Hello Anand.

Thanks for the comprehensive report.

From the text of the error message (upstream-dynamic-servers: …) and the use of --add-module=/usr/local/src/nginx-upstream-dynamic-servers, it seems that this memory leak comes from that third-party module rather than from nginx itself.

I recommend either contacting the author of that module or using the nginx built-in DNS resolver.

Since nginx 1.27.3, you can use dynamic DNS resolution natively via the resolve parameter of the server directive inside an upstream block.
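
The built-in approach looks roughly like this (a minimal sketch; the resolver address, zone size, and hostname are placeholders, and a shared memory zone is required for resolve to work):

```nginx
http {
    # Placeholder resolver address; valid= overrides the DNS record's TTL.
    resolver 10.0.0.2 valid=30s;

    upstream backend {
        # Shared memory zone required for runtime re-resolution.
        zone backend 64k;
        # Built-in dynamic resolution (nginx open source >= 1.27.3).
        server app.example.com resolve;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://backend;
        }
    }
}
```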

