>>>Thanks in advance.
>>
>>It doesn't depend on the files.
>>snmpget uses the MIB files to convert the textual name into numeric form
>>and then queries the device.
>>If the device doesn't know the OID, it answers noSuchName.
>
>So if I know the path in numeric notation, I basically don't
>need the MIB files, and I can build the request
>without them?
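As a rough illustration of the reply quoted above, the sketch below models what the MIB files are used for on the client side: turning a textual name into a numeric OID before anything is sent to the device. The tiny `MIB_NAMES` table and the `to_numeric` helper are hypothetical stand-ins for real MIB parsing, not part of Net-SNMP. A request that is already numeric skips the lookup entirely.

```python
# Hypothetical stand-in for what MIB files provide: name -> numeric prefix.
MIB_NAMES = {
    "sysLocation": "1.3.6.1.2.1.1.6",
    "ifIndex": "1.3.6.1.2.1.2.2.1.1",
}


def to_numeric(oid: str) -> str:
    """Return the dotted numeric form that would be sent to the device."""
    stripped = oid.lstrip(".")
    head, _, rest = stripped.partition(".")
    if head.isdigit():
        # Already numeric: no MIB lookup is needed at all.
        return "." + stripped
    if head not in MIB_NAMES:
        # Roughly the situation the tools report as "Unknown Object Identifier".
        raise KeyError(f"no MIB definition loaded for {head!r}")
    return "." + MIB_NAMES[head] + ("." + rest if rest else "")


print(to_numeric("sysLocation.0"))
print(to_numeric(".1.3.6.1.4.1.123456.1.0"))
```

Whether the device then answers with a value or with noSuchName depends only on the device, not on which MIB files the client has installed.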
Thuktun (Message)
Zabbix “became not supported” – solved
I think I’ve found one of the answers to a long-annoying Zabbix issue related to SNMP items “flapping” from “became supported” to “became not supported”.
TL;DR – using an SNMPv1 query against an SNMPv2 device will confuse Zabbix. You’ll see intermittent failures of different tests as the device flaps between OK and “unknown”. This can be hard to track down, as it’s not a hard, repeatable failure. It’s not the only cause of this error, but fixing this will solve many of the issues.
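While looking through our Zabbix server logs I found LOTS of these:
2031:20161027:111122.224 item "netapp-cluster.thuktun.com:netapp.disk.prefailed.count" became supported
2028:20161027:112119.172 item "netapp-cluster.thuktun.com:netapp.disk.prefailed.count" became not supported: SNMP error: (noSuchName) There is no such variable name in this MIB.
2030:20161027:120146.448 item "netapp-cluster.thuktun.com:netapp.disk.prefailed.count" became supported
2028:20161027:122120.026 item "netapp-cluster.thuktun.com:netapp.disk.prefailed.count" became not supported: SNMP error: (noSuchName) There is no such variable name in this MIB.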
All of these referred to a NetApp in cluster mode, but I found a few similar messages related to some NetBotz cameras as well. Additionally, the actual test item varied; there were about 6 different tests that were all failing intermittently. The failing tests were:
- netapp.disk.prefailed.count
- netapp.disk.cfe
- netapp.disk.name
- netapp.disk.version
- netapp.disk.failed.count
- netapp.disk.spare.count
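One way to see which items are affected is to tally the “became not supported” lines per item. The sketch below assumes log lines of the form `2028:20161027:112119.172 item "host:key" became not supported: ...`; the function name `count_flaps` and the regular expression are illustrative assumptions, not a Zabbix tool, and item keys that themselves contain quotes would need a more careful pattern.

```python
import re
from collections import Counter

# Matches e.g.: 2028:20161027:112119.172 item "host:key" became not supported: ...
LINE_RE = re.compile(
    r'^\s*\d+:\d{8}:\d{6}\.\d{3} item "(?P<host>[^":]+):(?P<key>[^"]+)" '
    r'became (?P<state>not supported|supported)'
)


def count_flaps(lines):
    """Count 'became not supported' events per (host, item key)."""
    flaps = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("state") == "not supported":
            flaps[(match.group("host"), match.group("key"))] += 1
    return flaps


sample = [
    '2031:20161027:111122.224 item "netapp-cluster.thuktun.com:'
    'netapp.disk.prefailed.count" became supported',
    '2028:20161027:112119.172 item "netapp-cluster.thuktun.com:'
    'netapp.disk.prefailed.count" became not supported: SNMP error: '
    '(noSuchName) There is no such variable name in this MIB.',
]
for (host, key), n in count_flaps(sample).items():
    print(f"{host} {key}: {n}")
```

Fed with the real server log, the items with the highest counts are the ones worth checking first.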
A few Google searches returned reports of this kind of issue dating back as far as 2011:
- https://www.zabbix.com/forum/showthread.php?t=38912 (LenR mentions: “Should the items be zabbix-trapper instead of zabbix-agent? I think I’ve seen this with incorrectly defined zabbix-sender updated items.”)
- https://www.zabbix.com/forum/showthread.php?t=22114 (This shows that this has been seen as far back as 2011)
- http://serverfault.com/questions/761645/zabbix-issue-with-lld-lots-of-became-supported-became-not-supported (“should be using Zabbix trapper instead of …”)
All of these are about Zabbix trapper vs Zabbix agent, that is, using the wrong type of check for the test item; none of them mentions SNMP.
Let’s look at the Zabbix configuration. Are we using the trapper or the agent for these test items?
Note that the NetApp template doesn’t use the Zabbix trapper or agent; it uses SNMP. But some tests are SNMPv1 and some are SNMPv2. This is likely because NetApp’s support for v1 and v2 has varied over the years, and whoever created the template originally started with just v1. Presumably, as more test items were exposed, new tests were added using SNMPv2 while the old tests were left at SNMPv1.
Interesting. All of the failing tests are using SNMPv1. Not all v1 tests are failing, but all failing tests are using v1. There’s nothing here about Zabbix trapper or the Zabbix agent, but there is a (potential) mismatch. This shouldn’t be a problem, but let’s find out.
Over the next few hours, as each failure showed up in the Zabbix logs, I switched that particular test to SNMPv2. After being changed, that test never again “flapped”.
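Rather than waiting for each failure to show up, the remaining candidates could be listed by filtering the template’s items by type. The sketch below works on plain dictionaries shaped like the output of the Zabbix API `item.get` call. It assumes the numeric item type codes used by Zabbix versions of that era (1 for SNMPv1 agent, 4 for SNMPv2 agent); newer releases model the SNMP version differently. Treat the codes, the helper name `snmpv1_items`, and the sample data as assumptions.

```python
SNMPV1_AGENT = 1  # assumed item type code for "SNMPv1 agent"
SNMPV2_AGENT = 4  # assumed item type code for "SNMPv2 agent"


def snmpv1_items(items):
    """Return the keys of items that are still polled over SNMPv1."""
    return [item["key_"] for item in items if int(item["type"]) == SNMPV1_AGENT]


# Hypothetical sample; the type assigned to each key here is made up.
items = [
    {"key_": "netapp.disk.failed.count", "type": "1"},
    {"key_": "netapp.disk.spare.count", "type": "1"},
    {"key_": "example.snmpv2.item", "type": "4"},
]
print(snmpv1_items(items))
```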
It seems that the keys to solving this were:
- LenR’s comment from 2013 about incorrectly defined items (although he was mentioning the zabbix-sender, not SNMP)
- Realizing it wasn’t a problem with the trapper vs agent, or an incorrect item definition in the agent, but a mismatch in the server’s definition of the test item.
- That SNMPv1 and v2 are treated differently in the Zabbix server, and that a v1 test against a v2 device will usually work, but not always.
- The “soft” failure of the v1 test against the v2 device “presents” as a MIB problem (“SNMP error: (noSuchName) There is no such variable name in this MIB.”), not a protocol failure.
I changed all of the failing NetApp tests to SNMPv2 last week. Since then all the tests that were changed from SNMPv1 to SNMPv2 have been fine. There have been none of these errors in the logs for 5 days.
Next: What about those NetBotz? Or maybe Zabbix meets IPv6 🙂
pmorch / README.md
This is supporting information for this SourceForge bug report:
SNMP Version 1: If I try to SNMP get two OIDs:
- one of which doesn’t exist
- the other being in numerical .1.3.6.1.4.1.123456.1 format
Then Net-SNMP returns this in $sess :
and replaces the numerical OID with a symbolic one ( enterprises.123456.1.0 in the example).
When I remove the 2nd OID like so:
And try again. Then I get this:
So Net-SNMP replaced my numerical OID with one it doesn’t understand itself? I therefore file a bug.
Here is some reproducing code along with some sample output.
About the initial OID set
We start out with these two OIDs:
- .1.3.6.1.4.1.123456.1.0 — An OID for which there is no MIB file
- ifIndex.10000
The ifIndex.10000 is last, because testing shows that the Net-SNMP agent (localhost) will give a noSuchName error with ErrorInd pointing at that last OID.
I started out with the first OID being a real valid OID in numerical form. So it doesn’t matter whether there is a MIB for the first OID or not.
Use UseLongNames => 1 in the $sess object.
When the OIDs come back with a noSuchName error, replace the returned OIDs with the ones from the original set, i.e. replace the enterprises. prefix with its numeric form .1.3.6.1.4.1. again. Then splice out the failing OID and try again. Then it works.
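The restore-and-splice workaround can be sketched independently of Net-SNMP. The helper below is hypothetical (the name `restore_and_splice` is not part of any library) and is written in Python for brevity rather than in the Perl of the original report. Given the OIDs originally requested, it ignores whatever names came back in the response, reuses the original spellings, and drops the entry that the 1-based error index points at, so that the retry contains only OIDs the library can parse.

```python
def restore_and_splice(requested, returned, error_index):
    """Build the OID list for a retry after a noSuchName error.

    requested   -- OIDs exactly as originally passed in
    returned    -- OIDs as rewritten in the response (ignored on purpose)
    error_index -- 1-based position of the failing OID
    """
    if len(requested) != len(returned):
        raise ValueError("response does not line up with the request")
    if not 1 <= error_index <= len(requested):
        raise ValueError("error index does not point at a requested OID")
    # Positions are preserved in the response, so the original spelling
    # of each OID can be restored by position before splicing.
    return [oid for pos, oid in enumerate(requested, start=1) if pos != error_index]


requested = [".1.3.6.1.4.1.123456.1.0", "ifIndex.10000"]
returned = ["enterprises.123456.1.0", "ifIndex.10000"]
print(restore_and_splice(requested, returned, 2))
```

With the error index of 2 from the sample output, the retry list keeps the numeric `.1.3.6.1.4.1.123456.1.0` instead of the symbolic `enterprises.123456.1.0` that could not be parsed.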
Linux with libsnmp-perl 5.4.1
ErrorStr (noSuchName): (noSuchName) There is no such variable name in this MIB. Ind: 2
$VAR1 = 'after get';
$VAR2 = bless( [
  bless( [
    'enterprises.123456.1.0',
    '',
    '',
    'NULL'
  ], 'SNMP::Varbind' ),
  bless( [
    'ifIndex',
    '10000',
    '',
    'INTEGER32'
  ], 'SNMP::Varbind' )
], 'SNMP::VarList' );
$VAR1 = 'after removal of first (0th) element';
$VAR2 = bless( [
  bless( [
    'enterprises.123456.1.0',
    '',
    '',
    'NULL'
  ], 'SNMP::Varbind' )
], 'SNMP::VarList' );
ErrorStr (Sub-id not found): Unknown Object Identifier (Sub-id not found: (top) -> enterprises)
ErrorStr (Sub-id not found): Unknown Object Identifier (Sub-id not found: (top) -> enterprises)
In the following two lines, 'noSuchName' errors are expected
ErrorStr (fine): (noSuchName) There is no such variable name in this MIB.
ErrorStr (fine): (noSuchName) There is no such variable name in this MIB.
The Zabbix “became not supported” post was published on November 15, 2016 at 11:00 am and filed under best practice, monitoring, System Administration.
Variables seem to disappear when I try to set them. Why?
This is actually the same as the previous question — it just isn’t
particularly obvious, particularly when using SNMPv1. A typical
example of this effect would be
    $ snmpget -v1 -c public localhost sysLocation.0
    sysLocation.0 = somewhere nearby
    $ snmpset -v1 -c public localhost sysLocation.0 s "right here"
    Error in packet.
    Reason: (noSuchName) There is no such variable name in this MIB.
    This name doesn't exist: sysLocation.0
Trying the same request using SNMPv2 or above is somewhat more informative:
    $ snmpset -v 2c -c public localhost sysLocation.0 s "right here"
    Error in packet.
    Reason: notWritable
The SNMPv1 error 'noSuchName' actually means "You can’t do that to this
variable" rather than "this variable doesn’t exist".
It may be the case that it doesn’t exist at all. It may exist but you
don’t have access to it (although different administrative credentials
might be accepted). Or it may exist, but you simply can’t perform that
particular operation (e.g. changing it).
Similarly, the SNMPv2 error 'notWritable' means "not writable in this
particular case" rather than "not writable under any circumstances".
If you are sure that the object is both defined as writable, and has been
implemented as such, then you probably need to look at the agent access control.
See the AGENT section for more details.
But see the next entry first.
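The way SNMPv1 folds several distinct conditions into 'noSuchName' can be sketched as a lookup table. The mapping below follows the SNMPv2-to-SNMPv1 error-status translation described in RFC 3584, as best understood here; the names `V2_TO_V1` and `as_seen_by_v1` are purely illustrative, and the table should be checked against the RFC before relying on it.

```python
# SNMPv2 error-status -> SNMPv1 error-status (after RFC 3584; verify before use).
V2_TO_V1 = {
    "noError": "noError",
    "tooBig": "tooBig",
    "genErr": "genErr",
    "wrongValue": "badValue",
    "wrongEncoding": "badValue",
    "wrongType": "badValue",
    "wrongLength": "badValue",
    "inconsistentValue": "badValue",
    "noAccess": "noSuchName",
    "notWritable": "noSuchName",
    "noCreation": "noSuchName",
    "inconsistentName": "noSuchName",
    "authorizationError": "noSuchName",
    "resourceUnavailable": "genErr",
    "commitFailed": "genErr",
    "undoFailed": "genErr",
}


def as_seen_by_v1(v2_error: str) -> str:
    """Return the error name an SNMPv1 client would see for a v2 error."""
    return V2_TO_V1[v2_error]


for name in ("notWritable", "noAccess", "wrongType"):
    print(f"{name} -> {as_seen_by_v1(name)}")
```

Several quite different v2 conditions end up as the same v1 name, which is why repeating the request with `-v 2c` tends to give a more useful answer.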
FAQ:Applications
- How do I add a MIB?
- How do I add a MIB to the tools?
- Why can’t I see anything from the agent?
- Why doesn’t the agent respond?
- I can see the system group, but nothing else. Why?
- Why can’t I see values in the <ENTERPRISE> tree?
- The agent worked for a while, then stopped responding. Why?
- Requesting an object fails with "Unknown Object Identifier" Why?
- Why do I get "noSuchName" when asking for "sysUpTime" (or similar)?
- Why do I sometimes get "End of MIB" when walking a tree, and sometimes not?
- How do I use SNMPv3?
- Why can’t I set any variables in the MIB?
- Variables seem to disappear when I try to set them. Why?
- Why can’t I change sysLocation (or sysContact)?
- I get an error when trying to set a negative value — why?
- I get an error when trying to query a string-indexed table value — why?
- How should I specify string-index table values?
- How do I send traps and notifications?
- How do I receive traps and notifications?
- How do I receive SNMPv1 traps?
- Why don’t I receive incoming traps?
- My traphandler script doesn’t work when run like this — why not?
- How can the agent receive traps and notifications?
- How big can an SNMP request (or reply) be?
- How can I monitor my systems (disk, memory, etc)?
- Applications complain about entries in your example ‘snmp.conf’ file. Why?
- OK, what should I put in snmp.conf?
- How do I specify IPv6 addresses in tools command line arguments?