Logstash Performance Degradation Memory Leak

I have a couple of Windows Server data collectors whose ingestion rate slowly degrades over time, as shown above. Restarting Logstash on the server brings ingestion rates back up again.

It looked like some form of memory leak, so I treated it as one. I allocated more memory to my data collectors, but that didn't change anything.

I updated the version of Logstash, but that didn't stop the degradation either.

I enabled GC logging and uploaded the logs to a GC log analysis site, which reported no issues with my collectors. Even so, the ingestion rates still shrank considerably over time, as you can see from the graphs above.

The obvious, easy answer is to restart the service periodically to bring it back into line. However, I don't have full access on the server, and ideally I want to keep Logstash self-contained rather than couple it to a Windows scheduled task just to keep it running.

 

The Solution

The solution was to use Logstash's automatic config reload feature to periodically recycle the pipeline. The first step is to enable automatic reloading in the logstash.yml file (the section below has Logstash check for config changes every 86000 seconds, which is just under a day).

# Periodically check if the configuration has changed and reload the pipeline
# This can also be triggered manually through the SIGHUP signal
#
config.reload.automatic: true
#
# How often to check if the pipeline configuration has changed (in seconds)
# Note that the unit value (s) is required. Values without a qualifier (e.g. 60)
# are treated as nanoseconds.
# Setting the interval this way is not recommended and might change in later versions.
#
config.reload.interval: 86000s

 

Now that Logstash is ready, I need the config to change somehow in order to trigger a restart. To do this I created an exec input config (I have split my Logstash config into separate module files). The exec input echoes an "input{ }" text string into a test.conf file in the config directory, every 3600 seconds in this example. Logstash then realises that the config files have changed (test.conf) and restarts the pipeline the next time it checks, as governed by the config.reload.interval setting in logstash.yml.
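
A minimal sketch of what such an exec input might look like, not the exact config from this setup. It assumes a Windows host, a hypothetical config directory of C:\logstash\config, and a hypothetical "config_touch" tag. It appends with >> rather than overwriting, on the assumption that Logstash only reloads when the file contents actually differ; the repeated empty input{ } blocks should be harmless.

```
input {
  exec {
    # Hypothetical path: point this at your own Logstash config directory.
    # cmd /c is used so that the >> redirection is handled by the Windows shell.
    command => 'cmd /c echo input{ } >> C:\logstash\config\test.conf'
    # Seconds between runs (3600 in this example).
    interval => 3600
    # Hypothetical tag, so the events this input emits can be recognised later.
    tags => ["config_touch"]
  }
}
```

Each run of the exec input also emits an event into the pipeline, so it may be worth dropping events carrying that tag in a filter so they do not reach your outputs.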

 

Obviously my exec input interval doesn't need to be 3600 seconds; something closer to the config.reload.interval, i.e. a day or 86400s, would do. Since implementing this I no longer have to worry about ingestion dropping off and can worry about other things.