You can use the vscsiStats tool to gather storage performance data for VMFS and NFS datastores. The tool is now available on a default install of ESXi and is located in the /usr/sbin directory. Running the tool without any arguments displays the usage options:
~ # vscsiStats
VscsiStats -- This tool controls vscsi data collection for virtual machine disk
disk I/O workload characterization. Users can start and stop online data
data collection for particular VMs as well as print out online histogram data.
Command traces can also be collected and printed.

The following histogram related options are available:
  -h, --help            will print the usage
  -l, --list            will list the available virtual machines and their virtual disks
  -r, --reset           will reset the stats
  -s, --start           will start vscsiStats collection; exclusive of -x
  -x, --stop            will stop vscsiStats collection; exclusive of -s
  -w , --worldgroupid   specifies a worldID to use for this operation
  -i , --handleid       specifies a vscsi handleID to use for this operation
                        requires the -w option
  -p , --printhistos    will print out the current histograms for the specified histogram type.
                        May be used in conjunction with -w and -i.
                        histoType must be one and only one of:
                        all, ioLength, seekDistance, outstandingIOs, latency, interarrival
  -c, --csv             will use comma as delimiter in conjunction with -p

The following command trace related options are available:
  -t, --tracecmds       will start scsi cmd traces; in conjunction with -s
     Note:- the -t option consumes significant system resources so
            enabling it indefinitely is not advisable
          - try to limit the #virtual disk for which cmd tracing is
            enabled at any given time by using --worldgroupid
            and/or --handleid options.
          - trace contains NO customer sensitive data
          - only information recorded is:
              - serialnumber, IO block size, number of scatter-gather elements
              - command type, block number, timestamp
          - Therefore, actual data/payload of commands is not stored
          - If successfully started, log channel id(s) will be printed out.
            To store the command trace in a file for later processing, invoke:
            $ logchannellogger <log_channel_id> <binary_trace_file>
  -e , --traceprettyprint  reads in a vscsi cmd trace from the given filename
                        and sends a CSV formatted output to stdout; exclusive of all other options

vscsiStats Usage: vscsiStats [options]
The first step in collecting data is to get the world group ID (worldGroupID) of the virtual machine you wish to monitor. You can list the world group IDs by running ‘vscsiStats -l’, as shown below:
~ # vscsiStats -l
Virtual Machine worldGroupID: 5599, Virtual Machine Display Name: TestVM04, Virtual Machine Config File: /vmfs/volumes/5221687f-72547f8e-a71d-005056af71fa/TestVM04/TestVM04.vmx, {
   Virtual SCSI Disk handleID: 8192 (ide0:0)
}
Virtual Machine worldGroupID: 5890, Virtual Machine Display Name: TestVM06, Virtual Machine Config File: /vmfs/volumes/5221687f-72547f8e-a71d-005056af71fa/TestVM06/TestVM06.vmx, {
   Virtual SCSI Disk handleID: 8193 (ide0:0)
}
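If you want to pick out world group IDs and handle IDs programmatically, the listing can be parsed with a couple of regular expressions. The following is a minimal sketch in Python that assumes the `vscsiStats -l` output has been captured as text; the function name `parse_vm_list` and the dictionary keys are illustrative choices, not part of the tool.

```python
import re

# Sample text copied from the `vscsiStats -l` listing in this article.
SAMPLE = (
    "Virtual Machine worldGroupID: 5599, Virtual Machine Display Name: TestVM04, "
    "Virtual Machine Config File: /vmfs/volumes/5221687f-72547f8e-a71d-005056af71fa/"
    "TestVM04/TestVM04.vmx, { Virtual SCSI Disk handleID: 8192 (ide0:0) } "
    "Virtual Machine worldGroupID: 5890, Virtual Machine Display Name: TestVM06, "
    "Virtual Machine Config File: /vmfs/volumes/5221687f-72547f8e-a71d-005056af71fa/"
    "TestVM06/TestVM06.vmx, { Virtual SCSI Disk handleID: 8193 (ide0:0) }"
)

VM_MARKER = "Virtual Machine worldGroupID:"
VM_RE = re.compile(
    r"Virtual Machine worldGroupID:\s*(\d+),\s*"
    r"Virtual Machine Display Name:\s*(.*?),\s*"
    r"Virtual Machine Config File:"
)
DISK_RE = re.compile(r"Virtual SCSI Disk handleID:\s*(\d+)\s*\(([^)]*)\)")


def parse_vm_list(text):
    """Return one dict per VM with its world group ID, name and disks."""
    vms = []
    # Everything between two VM markers belongs to a single VM.
    for chunk in text.split(VM_MARKER)[1:]:
        chunk = VM_MARKER + chunk
        match = VM_RE.search(chunk)
        if match is None:
            continue  # skip anything that does not look like a VM entry
        disks = [(int(handle), device) for handle, device in DISK_RE.findall(chunk)]
        vms.append(
            {
                "world_group_id": int(match.group(1)),
                "name": match.group(2),
                "disks": disks,
            }
        )
    return vms


if __name__ == "__main__":
    for vm in parse_vm_list(SAMPLE):
        print(vm["world_group_id"], vm["name"], vm["disks"])
```

The parser works on both the single-line and the multi-line form of the listing, because it keys on the field labels rather than on line breaks.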
The listing above shows two running virtual machines on this host, each with one virtual disk. We can start a collection on the first VM's disk by running:
~ # vscsiStats -s -w 5599
vscsiStats: Starting Vscsi stats collection for worldGroup 5599, handleID 8192 (ide0:0)
Success.
The collection will run for 30 minutes unless it is stopped before then. It can be stopped by running:
~ # vscsiStats -x -w 5599
vscsiStats: Stopping all Vscsi stats collection for worldGroup 5599, handleID 8192 (ide0:0)
Success.
If you want to extend the collection time, you can run the start command again whilst the collection is still running:
~ # vscsiStats -s -w 5599
vscsiStats: Starting Vscsi stats collection for worldGroup 5599, handleID 8192 (ide0:0)
Success.
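For a collection longer than 30 minutes, it can help to plan ahead when the start command needs to be re-issued. The sketch below is a rough Python illustration, assuming that each start command gives the collection a fresh 30-minute window (which is how the extension described above appears to behave). The function name `restart_offsets` and the 5-minute safety margin are hypothetical choices.

```python
DEFAULT_WINDOW_MIN = 30  # collection stops on its own after 30 minutes (per this article)


def restart_offsets(total_minutes, window_minutes=DEFAULT_WINDOW_MIN, margin_minutes=5):
    """Return the minute offsets at which to run `vscsiStats -s -w <id>`.

    Assumption: every start command opens a fresh window of `window_minutes`.
    Each restart is issued `margin_minutes` before the current window would end.
    """
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= margin_minutes < window_minutes:
        raise ValueError("margin must be non-negative and smaller than the window")

    step = window_minutes - margin_minutes
    offsets = [0]  # the initial start command
    # Keep adding restarts until the last window covers the requested duration.
    while offsets[-1] + window_minutes < total_minutes:
        offsets.append(offsets[-1] + step)
    return offsets


if __name__ == "__main__":
    for minute in restart_offsets(90):
        print(f"t+{minute} min: vscsiStats -s -w 5599")
```

Once the desired duration has elapsed, the collection would still be stopped explicitly with the -x option.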
To reset the statistics without stopping the collection, run:
~ # vscsiStats -r -w 5599
vscsiStats: Resetting Vscsi stats for worldGroup 5599, handleID 8192 (ide0:0)
Success.
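The start, stop and reset commands differ only in one flag, so a small helper can build them consistently. This is a sketch in Python using only the options listed in the usage text (-s, -x, -r, -w, -i). The helper name `build_command` is hypothetical, and it only builds the argument list; it does not run anything.

```python
ACTION_FLAGS = {"start": "-s", "stop": "-x", "reset": "-r"}


def build_command(action, world_group_id=None, handle_id=None):
    """Build a vscsiStats argument list for one action.

    Taking a single action keeps -s and -x mutually exclusive, and the check
    below mirrors the usage text: -i requires the -w option.
    """
    if action not in ACTION_FLAGS:
        raise ValueError(f"unknown action: {action!r}")
    if handle_id is not None and world_group_id is None:
        raise ValueError("-i (handle ID) requires -w (world group ID)")

    argv = ["vscsiStats", ACTION_FLAGS[action]]
    if world_group_id is not None:
        argv += ["-w", str(world_group_id)]
    if handle_id is not None:
        argv += ["-i", str(handle_id)]
    return argv


if __name__ == "__main__":
    print(" ".join(build_command("start", 5599)))
    print(" ".join(build_command("reset", 5599, 8192)))
    print(" ".join(build_command("stop", 5599)))
```

On a host where Python is available, the resulting list could be passed to `subprocess.run`; that step is left out here so the sketch stays runnable anywhere.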
So, that’s how to start, stop and reset the collection process. To view the data, at any point whilst the collection is running, use the -p switch:
~ # vscsiStats -p all | less
Histogram: latency of IO interarrival time for Writes in Microseconds (us) for virtual machine worldGroupID : 5599, virtual disk handleID : 8194 (ide0:0) {
 min : 5866
 max : 3212086692
 mean : 535355431
 count : 6
   {
      0 (<= 1)
      0 (<= 10)
      0 (<= 100)
      0 (<= 500)
      0 (<= 1000)
      0 (<= 5000)
      5 (<= 15000)
      0 (<= 30000)
      0 (<= 50000)
      0 (<= 100000)
      1 (> 100000)
   }
}
This will output the histogram data to the console. Alternatively, you can write the data to a CSV file by running:
~ # vscsiStats -p all -c > /vmfs/volumes/SAN-VMFS-01/output.csv
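If you would rather post-process the plain console output than the CSV file, the histogram blocks have a regular enough shape to parse directly. The following Python sketch assumes the console format shown earlier in this article; the function name `parse_histograms` and the result layout are illustrative only.

```python
import re

# Sample text copied from the `vscsiStats -p all` output in this article.
SAMPLE = (
    "Histogram: latency of IO interarrival time for Writes in Microseconds (us) "
    "for virtual machine worldGroupID : 5599, virtual disk handleID : 8194 (ide0:0) "
    "{ min : 5866 max : 3212086692 mean : 535355431 count : 6 "
    "{ 0 (<= 1) 0 (<= 10) 0 (<= 100) 0 (<= 500) 0 (<= 1000) 0 (<= 5000) "
    "5 (<= 15000) 0 (<= 30000) 0 (<= 50000) 0 (<= 100000) 1 (> 100000) } }"
)

HEADER_RE = re.compile(
    r"Histogram:\s*(.*?)\s+for virtual machine worldGroupID\s*:\s*(\d+),"
    r"\s*virtual disk handleID\s*:\s*(\d+)"
)
STAT_RE = re.compile(r"\b(min|max|mean|count)\s*:\s*(-?\d+)")
BUCKET_RE = re.compile(r"(\d+)\s*\(\s*(<=|>)\s*(-?\d+)\s*\)")


def parse_histograms(text):
    """Parse every 'Histogram:' block in the text into a dict."""
    results = []
    for chunk in text.split("Histogram:")[1:]:
        chunk = "Histogram:" + chunk
        header = HEADER_RE.search(chunk)
        if header is None:
            continue  # not a block in the expected format
        results.append(
            {
                "title": header.group(1),
                "world_group_id": int(header.group(2)),
                "handle_id": int(header.group(3)),
                # min / max / mean / count summary values
                "stats": {name: int(value) for name, value in STAT_RE.findall(chunk)},
                # (count, comparison operator, bound) for each bucket, in order
                "buckets": [
                    (int(count), op, int(bound))
                    for count, op, bound in BUCKET_RE.findall(chunk)
                ],
            }
        )
    return results


if __name__ == "__main__":
    for histogram in parse_histograms(SAMPLE):
        print(histogram["title"], histogram["stats"])
```

A quick sanity check on parsed data is that the bucket counts add up to the reported count, which they do for the sample above (5 + 1 = 6).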
Using the ‘-p all’ option will display all collected data in several histograms. The following metrics are represented:
- seekDistance – The distance, in logical block numbers (LBN), that the disk head must travel to read or write a block. If most of your seek distances are very small (less than 1), the I/O is sequential in nature. If the seek distances vary widely, the degree of randomness is roughly proportional to the distances travelled.
- ioLength – The size of the I/O.
- outstandingIOs – The number of I/Os outstanding at once, which gives you an idea of any queuing that is occurring.
- latency – The time taken for an I/O to complete its round trip, in microseconds.
- interarrival – The amount of time, in microseconds, between virtual machine disk commands.
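To make the seekDistance reasoning concrete, the share of I/Os that land in the smallest seek-distance buckets can be computed from the histogram counts. The Python sketch below uses made-up bucket counts and bounds purely for illustration; they are not taken from a real collection, and the function name `near_sequential_share` and the default limit of 1 are assumptions.

```python
def near_sequential_share(buckets, limit=1):
    """Fraction of I/Os whose seek distance lies within [-limit, limit].

    `buckets` is a list of (count, upper_bound) pairs sorted by upper_bound.
    Each bucket covers the integers in (previous_bound, upper_bound]; the
    first bucket is open-ended below, so it is never counted as near.
    """
    total = sum(count for count, _ in buckets)
    if total == 0:
        return 0.0

    near = 0
    previous = None
    for count, upper in buckets:
        # Count the bucket only if it lies entirely inside [-limit, limit].
        if previous is not None and previous + 1 >= -limit and upper <= limit:
            near += count
        previous = upper
    return near / total


# Hypothetical counts, invented for this example only.
EXAMPLE_BUCKETS = [
    (2, -1000), (3, -100), (1, -10), (4, -1),
    (10, 0), (70, 1), (5, 10), (3, 100), (2, 1000),
]

if __name__ == "__main__":
    share = near_sequential_share(EXAMPLE_BUCKETS)
    print(f"{share:.0%} of I/Os have a seek distance of at most 1 block")
```

A high share suggests a mostly sequential workload, while counts spread across the large-distance buckets point to random I/O.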
Rather than displaying all metrics in the output, you can display only the histograms for the metric you are interested in by replacing ‘all’ with the name of the metric. For example:
/vmfs/volumes/5221687f-72547f8e-a71d-005056af71fa # vscsiStats -p latency
Histogram: latency of IOs in Microseconds (us) for virtual machine worldGroupID : 5599, virtual disk handleID : 8194 (ide0:0) {
 min : 3203
 max : 7280
 mean : 6916
 count : 13
   {
      0 (<= 1)
      0 (<= 10)
      0 (<= 100)
      0 (<= 500)
      0 (<= 1000)
      1 (<= 5000)
      12 (<= 15000)
      0 (<= 30000)
      0 (<= 50000)
      0 (<= 100000)
      0 (> 100000)
   }
}
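In the example above, 12 of the 13 I/Os fall in the bucket above 5,000 and up to 15,000 microseconds, so most I/Os completed with a latency of between 5 and 15 milliseconds. The mean of 6,916 microseconds and the maximum of 7,280 microseconds show that the typical I/O took around 7 milliseconds.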
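Each bucket's count is easier to read as a range with a percentage. As a small illustration in Python, using the bucket counts from the latency histogram above, the sketch below labels each bucket with its lower and upper bound and computes its share of all I/Os. The function name `bucket_shares` is a hypothetical helper, not part of vscsiStats.

```python
# (count, upper bound in microseconds), copied from the latency histogram above.
LATENCY_BUCKETS = [
    (0, 1), (0, 10), (0, 100), (0, 500), (0, 1000),
    (1, 5000), (12, 15000), (0, 30000), (0, 50000), (0, 100000),
]
OVERFLOW_COUNT = 0  # the final "(> 100000)" bucket


def bucket_shares(buckets, overflow_count=0):
    """Return (label, count, share) rows, one per bucket plus the overflow bucket."""
    total = sum(count for count, _ in buckets) + overflow_count
    rows = []
    lower = None
    for count, upper in buckets:
        # A bucket holds values above the previous bound and up to its own bound.
        label = f"<= {upper} us" if lower is None else f"{lower} < t <= {upper} us"
        rows.append((label, count, count / total if total else 0.0))
        lower = upper
    if lower is not None:
        rows.append((f"> {lower} us", overflow_count, overflow_count / total if total else 0.0))
    return rows


if __name__ == "__main__":
    for label, count, share in bucket_shares(LATENCY_BUCKETS, OVERFLOW_COUNT):
        if count:
            print(f"{label}: {count} I/Os ({share:.1%})")
```

Reading the histogram this way makes it clear that a count listed against 15000 describes I/Os that took anywhere between 5 and 15 milliseconds, not I/Os that took 15 milliseconds.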