Introduction
Proxmox VE hypervisors need to be monitored. Some of the key items you could monitor are:
- Disk I/O Wait time
- CPU temperature
- SMART status
- Drive temperature
This article aims to provide a foundation for getting SNMP up and running.
By default, Proxmox VE doesn't have an SNMP agent installed, so install it like this:
apt install snmpd
To make it start automatically, do this:
update-rc.d snmpd enable
For the most part, on our system `pve-manager/8.2.2/9355359cd7afbae4 (running kernel: 6.8.4-3-pve)` the installation went smoothly, except for this oddball warning:
adduser: Warning: The home directory `/var/lib/snmp' does not belong to the user you are currently creating.
chown: warning: '.' should be ':': ‘Debian-snmp.Debian-snmp’
For now, let's ignore that warning, which is related to the Debian SNMP package. When trying to get SNMP working, it's best to run snmpwalk from another computer or on the host itself. Installing snmpd doesn't install snmpwalk. You have to do this extra step:
apt install snmp
The default snmpd.conf has a lot of comments, making it hard to understand. So next we'll back up the original and add our own:
cd /etc/snmp
mv snmpd.conf snmpd.conf.backup
Next, let's create a basic SNMP configuration file.
Note: This article isn’t much about SNMP security. If you create an SNMP configuration and you don’t restrict the Internet from reading it, hackers (and competitors) might be able to obtain valuable information about your systems. You have been warned.
cat snmpd.conf
rocommunity secret 1.2.3.4
agentAddress udp:161
dontLogTCPWrappersConnects yes
# Fix for disks larger than 2TB
realStorageUnits 0
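The configuration above can also be written from a script, which helps when several hypervisors need the same file. The following is a minimal sketch, not a definitive implementation: `write_snmpd_conf` is a hypothetical helper name, and the community string and source address are the placeholders used in this article. Keep in mind that the address after the community string restricts which source may query the agent, so a test from localhost needs a `rocommunity` line that allows 127.0.0.1.

```shell
#!/bin/sh
# Sketch: write the minimal snmpd.conf from this article to a given path.
# write_snmpd_conf is a hypothetical helper, not part of net-snmp.
# Arguments: target file, community string, allowed source address.
write_snmpd_conf() {
    conf_path=$1
    community=$2
    source_addr=$3
    cat > "$conf_path" <<EOF
rocommunity $community $source_addr
agentAddress udp:161
dontLogTCPWrappersConnects yes
# Fix for disks larger than 2TB
realStorageUnits 0
EOF
}

# Example (as root), then restart the agent so it reads the new file:
#   write_snmpd_conf /etc/snmp/snmpd.conf secret 1.2.3.4
#   systemctl restart snmpd
```

The helper only writes the file; restarting snmpd is left as a separate, explicit step.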
Once snmpd has been restarted and SNMP is working, you should be able to test like this:
snmpwalk -c secret -v1 localhost SNMPv2-MIB::sysDescr.0
SNMPv2-MIB::sysDescr.0 = STRING: Linux server-name 6.8.4-3-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.4-3 (2024-05-02T11:55Z) x86_64
Next let’s read more interesting parameters:
I/O Wait
snmpwalk -c secret -v1 localhost .1.3.6.1.4.1.2021.11.54.0
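iso.3.6.1.4.1.2021.11.54.0 = Counter32: 8833142
Note: This is a high I/O wait! Keep in mind that this OID (ssCpuRawWait in UCD-SNMP-MIB) is a cumulative tick counter, so monitoring tools should treat it as a delta counter and graph the difference between polls rather than the absolute value.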
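Because the I/O wait value is a Counter32, the useful number is the difference between two readings. Here is a minimal sketch of that calculation, assuming at most one counter wrap between polls; `counter_delta` is a hypothetical helper name, and the commented poll example assumes the community string `secret` and an agent on localhost.

```shell
#!/bin/sh
# counter_delta is a hypothetical helper: difference between two Counter32
# readings, allowing for a single wrap at 2^32 between the polls.
counter_delta() {
    prev=$1
    cur=$2
    if [ "$cur" -ge "$prev" ]; then
        echo $((cur - prev))
    else
        echo $((cur + 4294967296 - prev))
    fi
}

# Example poll (assumes community "secret" and snmpd on localhost):
#   oid=.1.3.6.1.4.1.2021.11.54.0
#   a=$(snmpget -v1 -c secret -Oqv localhost "$oid")
#   sleep 60
#   b=$(snmpget -v1 -c secret -Oqv localhost "$oid")
#   echo "I/O wait ticks in the last 60s: $(counter_delta "$a" "$b")"
```

Most monitoring systems do this delta calculation themselves once the item is configured as a counter.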
CPU Temperature
Note you might have two CPUs:
cat /sys/class/thermal/thermal_zone0/temp | sed 's/\(.\)..$/.\1/'
cat /sys/class/thermal/thermal_zone1/temp | sed 's/\(.\)..$/.\1/'
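If you don't know how many thermal zones the host has, a loop over the sysfs directory saves guessing. This is a rough sketch that reuses the sed expression from above; the function names `millideg_to_c` and `list_thermal_zones` are made up for illustration, and which zones actually correspond to a CPU depends on the hardware.

```shell
#!/bin/sh
# millideg_to_c is a hypothetical helper: turn a millidegree reading such as
# 45000 into 45.0, using the same sed expression as in the article.
millideg_to_c() {
    echo "$1" | sed 's/\(.\)..$/.\1/'
}

# list_thermal_zones (also hypothetical) prints every readable zone with its
# type and temperature.
list_thermal_zones() {
    for zone in /sys/class/thermal/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        zone_type=$(cat "$zone/type" 2>/dev/null)
        printf '%s (%s): %s C\n' "${zone##*/}" "$zone_type" \
            "$(millideg_to_c "$(cat "$zone/temp")")"
    done
}

# Usage:
#   list_thermal_zones
```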
To have this temperature displayed continuously, you have to do more configuration, which is explained here:
https://github.com/in-famous-raccoon/proxmox-snmp
The above article also has information on SMART status, but you can also get it using the API like so:
pvesh get /nodes/server-name/disks/smart --disk /dev/nvme0n1
If you're not sure whether SMART is enabled on a disk, do this:
smartctl -i /dev/sdX
Also check for these events in your log:
2024-05-20T15:56:09.398548+02:00 server smartd[1131]: Device: /dev/bus/0 [megaraid_disk_12] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 66 to 67
Then do this:
# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/bus/0 -d megaraid,8 # /dev/bus/0 [megaraid_disk_08], SCSI device
/dev/bus/0 -d megaraid,9 # /dev/bus/0 [megaraid_disk_09], SCSI device
/dev/bus/0 -d megaraid,10 # /dev/bus/0 [megaraid_disk_10], SCSI device
/dev/bus/0 -d megaraid,11 # /dev/bus/0 [megaraid_disk_11], SCSI device
/dev/bus/0 -d megaraid,12 # /dev/bus/0 [megaraid_disk_12], SCSI device
/dev/bus/0 -d megaraid,13 # /dev/bus/0 [megaraid_disk_13], SCSI device
And this:
# smartctl -i -d megaraid,8 /dev/bus/0
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.4-3-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: HGST HUS726T4TALA6L4
Serial Number: V6G4RTES
LU WWN Device Id: 5 000cca 097c22707
Firmware Version: VLGNW40H
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database 7.3/5319
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon May 20 16:10:19 2024 SAST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
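With several disks behind a RAID controller, typing one `smartctl -i` command per device gets tedious. The sketch below turns the `smartctl --scan` output into the matching `smartctl -i -d TYPE DEVICE` commands. `scan_to_info_cmds` is a hypothetical helper name, and the parsing assumes the scan output has the `DEVICE -d TYPE # comment` layout shown above.

```shell
#!/bin/sh
# scan_to_info_cmds is a hypothetical helper: read `smartctl --scan` output
# on stdin and print one `smartctl -i` command per listed device.
scan_to_info_cmds() {
    while read -r dev flag type _rest; do
        # Skip anything that is not in "DEVICE -d TYPE ..." form.
        [ "$flag" = "-d" ] || continue
        echo "smartctl -i -d $type $dev"
    done
}

# Usage (as root): review the generated commands first, then run them.
#   smartctl --scan | scan_to_info_cmds
#   smartctl --scan | scan_to_info_cmds | sh | grep 'SMART support'
```

Printing the commands instead of running them directly keeps the step easy to inspect before anything touches the disks.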
Potential errors
Unknown Object Identifier
You might get this during testing:
snmpwalk -c secret -v1 server.example.com SNMPv2-MIB::sysDescr.0
MIB search path: /root/.snmp/mibs:/usr/share/snmp/mibs:/usr/share/snmp/mibs/iana:/usr/share/snmp/mibs/ietf
Cannot find module (SNMPv2-MIB): At line 1 in (none)
SNMPv2-MIB::sysDescr.0: Unknown Object Identifier
You have to install snmp-mibs-downloader first. But before you can do that, you have to add this to /etc/apt/sources.list:
# non-free for snmp-mibs-downloader
deb http://http.us.debian.org/debian stable main contrib non-free
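Installing the MIB files is often not enough on its own. On Debian-based systems the stock /etc/snmp/snmp.conf typically contains a `mibs :` line that disables MIB loading. Assuming that is the case on your host, a sketch of the remaining steps could look like this; `enable_mib_loading` is a hypothetical helper name, and it relies on GNU sed's `-i` option.

```shell
#!/bin/sh
# enable_mib_loading is a hypothetical helper: comment out the "mibs :" line
# in an snmp.conf-style file so the net-snmp tools load the installed MIBs.
# Assumes GNU sed (for -i), as shipped with Debian.
enable_mib_loading() {
    sed -i 's/^mibs[[:space:]]*:/# &/' "$1"
}

# Usage (as root), after the non-free repository has been added:
#   apt update
#   apt install snmp-mibs-downloader
#   enable_mib_loading /etc/snmp/snmp.conf
```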
237 != 224
May 20 12:35:25 hv09 snmpd[3807477]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
E: Package ‘snmp-mibs-downloader’ has no installation candidate
apt install snmp-mibs-downloader
...
E: Package 'snmp-mibs-downloader' has no installation candidate
That means you haven’t added non-free repos.