Channel: VMware Communities : All Content - All Communities

NVMe health monitoring

Hi,

 

I'm using a Samsung 1725b NVMe drive on ESXi 7.0 and wonder what people are using to

 

- monitor the health (TBW, errors, temperature)

- predict failures based on this (limited) data

 

For a normal SSD, I get a lot of information when using

 

# esxcli storage core device smart get -d ID

Parameter                          Value  Threshold  Worst  Raw
Health Status                      OK     N/A        N/A    N/A
Media Wearout Indicator            99     5          99     172
Write Error Count                  100    10         100    0
Power-on Hours                     92     0          92     151
Power Cycle Count                  99     0          99     14
Reallocated Sector Count           100    10         100    0
Drive Temperature                  69     0          63     31
Write Sectors TOT Count            99     0          99     39
Read Sectors TOT Count             99     0          99     40
Initial Bad Block Count            100    10         100    0
Program Fail Count                 100    10         100    0
Erase Fail Count                   100    10         100    0
Uncorrectable Error Count          100    0          100    0
Pending Sector Reallocation Count  100    0          100    0
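
For what it's worth, here is a rough sketch of how output in this five-column layout could be parsed and checked against the thresholds. It assumes columns are separated by two or more spaces, that Value is a normalized number where higher is better, and that a threshold of 0 means "no threshold". These are my assumptions, not documented behaviour, and the names parse_smart and failing are made up for illustration. The rule probably does not fit NVMe output that reports raw values (e.g. temperature in degrees) in the Value column.

```python
import re

# Illustrative sample in the five-column layout; real output may differ.
SAMPLE = """\
Parameter                  Value  Threshold  Worst  Raw
-------------------------  -----  ---------  -----  ---
Health Status              OK     N/A        N/A    N/A
Media Wearout Indicator    99     5          99     172
Drive Temperature          69     0          63     31
Write Error Count          100    10         100    0
"""


def parse_smart(text):
    """Return {parameter: {column: value}} from five-column SMART output."""
    rows = {}
    for line in text.splitlines():
        # Assumption: columns are separated by at least two spaces.
        fields = re.split(r"\s{2,}", line.strip())
        # Skip the header, a dashed separator line, and anything malformed.
        if len(fields) != 5 or fields[0] == "Parameter" or set(fields[0]) == {"-"}:
            continue
        name, value, threshold, worst, raw = fields
        rows[name] = {"value": value, "threshold": threshold,
                      "worst": worst, "raw": raw}
    return rows


def failing(rows):
    """Names whose value is at or below a non-zero threshold (assumed rule)."""
    bad = []
    for name, r in rows.items():
        # Non-numeric cells such as "OK" or "N/A" are ignored.
        if r["value"].isdigit() and r["threshold"].isdigit():
            if int(r["threshold"]) > 0 and int(r["value"]) <= int(r["threshold"]):
                bad.append(name)
    return bad


if __name__ == "__main__":
    print(failing(parse_smart(SAMPLE)))
```

The text could be captured from the esxcli command above and passed to parse_smart; anything returned by failing would be worth a closer look.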

 

 

For the NVMe drive, I only get this:

 

Parameter                 Value  Threshold  Worst  Raw
Health Status             OK     N/A        N/A    N/A
Power-on Hours            1677   N/A        N/A    N/A
Power Cycle Count         3      N/A        N/A    N/A
Reallocated Sector Count  0      90         N/A    N/A
Drive Temperature         36     79         N/A    N/A

 

 

There have been some efforts to get smartctl up and running on ESXi, but all of them are unofficial.

https://www.virten.net/2016/05/determine-tbw-from-ssds-with-s-m-a-r-t-values-in-esxi-smartctl/
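
If the drive's NVMe SMART / Health Information log can be read by some tool, the TBW part would just be arithmetic. A minimal sketch, assuming the "Data Units Written" counter is available: the NVMe specification defines one data unit as 1000 units of 512 bytes. The function names and the rated-endurance figure below are hypothetical; the real rating would have to come from the vendor datasheet.

```python
# One NVMe "data unit" is 1000 * 512 bytes per the NVMe specification.
BYTES_PER_DATA_UNIT = 512 * 1000


def terabytes_written(data_units_written):
    """Convert the Data Units Written counter to decimal terabytes."""
    return data_units_written * BYTES_PER_DATA_UNIT / 1e12


def endurance_used_percent(data_units_written, rated_tbw):
    """Share of a rated endurance (in TB written) already consumed."""
    return 100.0 * terabytes_written(data_units_written) / rated_tbw


if __name__ == "__main__":
    units = 1_000_000    # example counter value, not from a real drive
    rated = 1000.0       # hypothetical rated endurance in TBW
    print(terabytes_written(units), "TB written")
    print(endurance_used_percent(units, rated), "% of rated endurance")
```

Logging this value over time would at least show the write rate, which is one input for estimating remaining life.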

 

Thanks for info.

 

     -Mark

