Getting an error that fills the logs every second

Hello,

I am encountering a problem with PMM: every second, the message below is written to “pmm-linux-metrics-42000.log” on the client server.

time="2020-02-18T11:12:16+01:00" level=error msg="error gathering metrics: 4 error(s) occurred:\n* [from Gatherer #1] collected metric node_textfile_mtime label:<name:"file" value:"example.prom" > gauge:<value:1.579095553e+09 > was collected before with the same name and label values\n*[from Gatherer #1] collected metric node_textfile_scrape_error gauge:<value:0 > was collected before with the same name and label values\n* collected metric node_textfile_mtime label:<name:"file" value:"example.prom" > gauge:<value:1.579095553e+09 > was collected before with the same name and label values\n* collected metric node_textfile_scrape_error gauge:<value:0 > was collected before with the same name and label values\n" source="log.go:172"

What do you recommend to fix it?

Thank you in advance :slight_smile:

What version of PMM are you using?

Hello Stateros,

My PMM server version is 1.17.1 and the client version is 1.17.3.

Try to use the same version on the client and the server. The behavior will be more predictable.

I am having this exact same problem, and I don’t have a discrepancy between versions: 1.17.3 on both the server side and the client side. Is there a cure? The log eventually grows unnecessarily large, and the package does not ship with log rotation by default.

Hi, I think “example.prom” is the sample file for the textfile collector. Are you actually using that sample file to collect data?
If not, have you checked whether moving the file out of the textfile collector directory makes the error go away?
If you are actually using it, is it possible that your textfile collector is evaluating the file more than once, for example via different links to the same file?
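
In case it helps with the "different links to the same file" check, here is a minimal sketch in Python. It assumes you know which directory your textfile collector reads; the function name and the directory you pass in are placeholders of mine, not something PMM defines. It groups `.prom` files by device and inode, so it catches both symlinks and hard links that point at the same file.

```python
import os
from collections import defaultdict


def find_aliased_prom_files(directory):
    """Return groups of .prom file names in `directory` that refer to the same file.

    Files are grouped by (device, inode), so symlinks and hard links to the
    same underlying file land in the same group.
    """
    by_inode = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".prom"):
            continue
        try:
            # os.stat follows symlinks, so a link and its target share a key.
            info = os.stat(os.path.join(directory, name))
        except OSError:
            # Broken symlink or unreadable entry: skip it.
            continue
        by_inode[(info.st_dev, info.st_ino)].append(name)
    # Only groups with more than one name are potential duplicates.
    return [names for names in by_inode.values() if len(names) > 1]
```

Call it with the textfile directory from your own setup. An empty list means no file in that directory is reachable under two names.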

@steffen
I removed example.prom file and that turns the message into this:

time="2020-06-17T16:18:08Z" level=error msg="error gathering metrics: collected metric node_textfile_scrape_error gauge:<value:0 > was collected before with the same name and label values\n" source="log.go:172"

This is a 100% vanilla setup; nothing was added or changed from the defaults. All configs are exactly as pmm-admin created them.

@gordan
Yes, that’s really not a very helpful message. I’m out of ideas here, since I switched everything to PMM2 already.
Did you try restarting the pmm agent?
From what I see, “node_textfile_scrape_error” indicates whether reading a textfile succeeded (0) or failed (1).
Unfortunately that metric doesn’t name the file(s) the collector has read or tried to read.
The agent process is not by any chance running twice or something like this?

@steffen
Yes, restarting it makes no difference after removing the file.
There is only one instance of node_exporter logging to /var/log/pmm-linux-metrics-42000.log

I also have this problem with 1.17.3. I found that the systemd unit file for pmm-linux-metrics-42000 had textfile listed twice in collectors.enabled. When I removed the duplicate, ran systemctl daemon-reload, and restarted pmm-linux-metrics-42000, the problem went away. So does “pmm-admin config” generate a broken unit file for Linux?
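
For anyone who wants to check their own unit file for this, here is a rough Python sketch. It assumes the collectors appear as a comma-separated value after `collectors.enabled` on the ExecStart line, which is what I saw. The function name, the sample line, and the binary path in it are made up for illustration, so paste in the contents of your own unit file instead.

```python
import re
from collections import Counter


def duplicated_collectors(unit_text):
    """Return collector names that appear more than once in collectors.enabled.

    Assumes the flag value is a comma-separated list, optionally quoted.
    Returns an empty list if the flag is absent or has no duplicates.
    """
    match = re.search(r"collectors\.enabled[=\s]+[\"']?([^\s\"']+)", unit_text)
    if not match:
        return []
    names = [name for name in match.group(1).split(",") if name]
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Hypothetical ExecStart line, for illustration only.
    sample = (
        "ExecStart=/path/to/node_exporter "
        "-collectors.enabled=diskstats,textfile,filesystem,textfile"
    )
    print(duplicated_collectors(sample))
```

If it reports textfile, remove the second entry from the unit file, then run systemctl daemon-reload and restart the service as described above.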