Using vscsiStats to Gather Storage Performance Data

admin, November 22, 2013

You can use the vscsiStats tool to gather storage performance data for VMFS and NFS datastores. The tool is included in a default install of ESXi and is located in the /usr/sbin directory. Running the tool without any arguments displays the usage options:

~ # vscsiStats
  VscsiStats -- This tool controls vscsi data collection for virtual machine disk
                disk I/O workload characterization. Users can start and stop online data
                data collection for particular VMs as well as print out online histogram data.
                Command traces can also be collected and printed.

  The following histogram related options are available:
     -h, --help will print the usage
     -l, --list will list the available virtual machines and their virtual disks
     -r, --reset will reset the stats
     -s, --start will start vscsiStats collection; exclusive of -x
     -x, --stop will stop vscsiStats collection; exclusive of -s
     -w <worldGroupID>, --worldgroupid <worldGroupID> specifies a worldID to use for this operation
     -i <handleID>, --handleid <handleID> specifies a vscsi handleID to use for this operation
           requires the -w option
     -p <histoType>, --printhistos <histoType> will print out the current histograms for the specified
            histogram type. May be used in conjunction with -w and -i.
            histoType must be one and only one of:
                 all, ioLength, seekDistance, outstandingIOs, latency, interarrival
     -c, --csv will use comma as delimiter in conjunction with -p

  The following command trace related options are available:
     -t, --tracecmds will start scsi cmd traces; in conjunction with -s
         Note:- the -t option consumes significant system resources so
                enabling it indefinitely is not advisable
              - try to limit the #virtual disk for which cmd tracing is enabled at any
                given time by using --worldgroupid and/or --handleid options.
              - trace contains NO customer sensitive data
                - only information recorded is:
                  - serialnumber, IO block size, number of scatter-gather elements
                  - command type, block number, timestamp
                - Therefore, actual data/payload of commands is not stored
              - If successfully started, log channel id(s) will be printed out.
                To store the command trace in a file for later processing, invoke:
                $ logchannellogger <log_channel_id> <binary_trace_file>
     -e , --traceprettyprint reads in a vscsi cmd trace from the given
        filename and sends a CSV formatted output to stdout; exclusive of all other options

  vscsiStats Usage:
         vscsiStats [options]

The first step in collecting data is to get the world group ID of the virtual machine you wish to monitor. You can list the world group IDs by running ‘vscsiStats -l’, as shown below:

~ # vscsiStats -l
Virtual Machine worldGroupID: 5599, Virtual Machine Display Name: TestVM04, Virtual Machine Config File: /vmfs/volumes/5221687f-72547f8e-a71d-005056af71fa/TestVM04/TestVM04.vmx, {
   Virtual SCSI Disk handleID: 8192 (ide0:0)
}
Virtual Machine worldGroupID: 5890, Virtual Machine Display Name: TestVM06, Virtual Machine Config File: /vmfs/volumes/5221687f-72547f8e-a71d-005056af71fa/TestVM06/TestVM06.vmx, {
   Virtual SCSI Disk handleID: 8193 (ide0:0)
}

In this example there are two running virtual machines on this host, each with one virtual disk. We can start a collection on the first VM's disk by running:

~ # vscsiStats -s -w 5599
vscsiStats: Starting Vscsi stats collection for worldGroup 5599, handleID 8192 (ide0:0)
Success.
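If you would rather not copy the world group ID by hand, the lookup can be scripted. The following is a minimal sketch, not part of vscsiStats itself: world_group_id_for is a hypothetical helper name, and it assumes the listing format shown above, with a display name that does not itself contain ", ".

```sh
# Hypothetical helper: print the worldGroupID of the VM whose display
# name equals $1. Expects `vscsiStats -l` style text on stdin.
world_group_id_for() {
    awk -v name="$1" '
        /^Virtual Machine worldGroupID:/ {
            # The header line is a ", "-separated list of "label: value" pairs.
            split($0, parts, ", ")
            id = parts[1]; sub(/^.*worldGroupID: /, "", id)
            vm = parts[2]; sub(/^.*Display Name: /, "", vm)
            if (vm == name) { print id; exit }
        }
    '
}

# Usage on an ESXi host (not executed here):
#   wgid=$(vscsiStats -l | world_group_id_for TestVM04)
#   [ -n "$wgid" ] && vscsiStats -s -w "$wgid"
```

The function prints nothing if no VM matches, so the caller should check for an empty result before starting a collection.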

The collection will run for 30 minutes unless it is stopped before then. It can be stopped by running:

~ # vscsiStats -x -w 5599
vscsiStats: Stopping all Vscsi stats collection for worldGroup 5599, handleID 8192 (ide0:0)
Success.

If you want to extend the collection time, you can run the start command again whilst the collection is already running:

~ # vscsiStats -s -w 5599
vscsiStats: Starting Vscsi stats collection for worldGroup 5599, handleID 8192 (ide0:0)
Success.
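For a longer capture, the start command can be re-issued on a timer. This is only a sketch: keep_collecting is a hypothetical function name, and it assumes that each start extends the collection by another 30 minutes, as described above.

```sh
# Hypothetical helper: re-issue the start command at a fixed interval so
# the collection does not lapse.
#   $1 = worldGroupID, $2 = number of starts, $3 = seconds between starts
keep_collecting() {
    wgid=$1; rounds=$2; interval=$3
    i=0
    while [ "$i" -lt "$rounds" ]; do
        vscsiStats -s -w "$wgid" || return 1
        i=$((i + 1))
        # Wait between starts, but not after the final one.
        if [ "$i" -lt "$rounds" ]; then
            sleep "$interval"
        fi
    done
    return 0
}

# Example (not executed here): three starts, 25 minutes apart.
#   keep_collecting 5599 3 1500
```

Choosing an interval shorter than 30 minutes leaves a margin so that the previous collection window has not expired before the next start is issued.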

To reset the statistics without stopping the collection, run:

~ # vscsiStats -r -w 5599
vscsiStats: Resetting Vscsi stats for worldGroup 5599, handleID 8192 (ide0:0)
Success.

So, that’s how to start, stop and reset the collection process. To view the data at any point whilst the collection is running, use the -p switch:

~ # vscsiStats -p all | less
Histogram: latency of IO interarrival time for Writes in Microseconds (us) for virtual machine worldGroupID : 5599, virtual disk handleID : 8194 (ide0:0) {
 min : 5866
 max : 3212086692
 mean : 535355431
 count : 6
   {
      0                  (<=                  1)
      0                  (<=                 10)
      0                  (<=                100)
      0                  (<=                500)
      0                  (<=               1000)
      0                  (<=               5000)
      5                  (<=              15000)
      0                  (<=              30000)
      0                  (<=              50000)
      0                  (<=             100000)
      1                  (>              100000)
   }
}
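The raw bucket counts are easier to compare as percentages. The sketch below is one way to do that with awk; histogram_percentages is a hypothetical name, and the function assumes its input is a single histogram in the console format shown above (piping the whole ‘-p all’ output through it would merge all histograms together).

```sh
# Hypothetical helper: read one vscsiStats histogram on stdin and print
# each non-empty bucket with its share of the total I/O count.
histogram_percentages() {
    awk '
        {
            line = $0
            # A line may hold more than one "count (<= limit)" entry.
            while (match(line, /[0-9]+ +\((<=|>) +[0-9]+\)/)) {
                entry = substr(line, RSTART, RLENGTH)
                line = substr(line, RSTART + RLENGTH)
                split(entry, f, /[ ()]+/)   # f[1]=count, f[2]=operator, f[3]=limit
                n++
                count[n] = f[1]
                label[n] = f[2] " " f[3]
                total += f[1]
            }
        }
        END {
            for (i = 1; i <= n; i++)
                if (count[i] > 0)
                    printf "%s: %d (%.1f%%)\n", label[i], count[i], 100 * count[i] / total
        }
    '
}
```

Fed the interarrival histogram above, this reports that 5 of the 6 writes (83.3%) fall in the `<= 15000` bucket and 1 (16.7%) in the `> 100000` bucket.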

This will output the histogram data to the console. Alternatively, you can write the data to a CSV file by adding the -c switch and redirecting the output:

~ # vscsiStats -p all -c > /vmfs/volumes/SAN-VMFS-01/output.csv
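The start, dump and stop steps can be combined into a single timed capture. The following is a rough sketch using only the switches described in the usage text; capture_csv is a hypothetical function name, and the output path in the usage comment is simply the one from the example above.

```sh
# Hypothetical helper: collect stats for one VM for a fixed time, write
# all histograms to a CSV file, then stop the collection.
#   $1 = worldGroupID, $2 = seconds to collect, $3 = output file
capture_csv() {
    wgid=$1; seconds=$2; outfile=$3
    vscsiStats -s -w "$wgid" || return 1
    sleep "$seconds"
    # -c switches the -p output to comma-delimited form.
    vscsiStats -p all -c -w "$wgid" > "$outfile"
    status=$?
    # Stop the collection regardless of whether the dump succeeded.
    vscsiStats -x -w "$wgid"
    return "$status"
}

# Example (not executed here): a five-minute capture.
#   capture_csv 5599 300 /vmfs/volumes/SAN-VMFS-01/output.csv
```

Keep the capture time under the 30-minute collection limit, or re-issue the start command during the wait.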

Using the ‘-p all’ option will display all collected data in several histograms. The following metrics are represented:

seekDistance – The distance, in logical block numbers (LBN), that the disk head must travel to read or write a block. If the seek distances are concentrated at very small values (1 or less), the workload is largely sequential. The more widely the seek distances are spread, the more random the workload.
ioLength – The size of each I/O.
outstandingIOs – The number of I/Os already outstanding when a new I/O is issued. This gives you an idea of any queuing that is occurring.
latency – The time, in microseconds, that each I/O takes to complete.
interarrival – The time, in microseconds, between successive virtual machine disk commands.

Rather than display all metrics in the output, you can display only the histograms for the metric you are interested in by replacing ‘all’ with the name of the metric. For example:

/vmfs/volumes/5221687f-72547f8e-a71d-005056af71fa # vscsiStats -p latency
Histogram: latency of IOs in Microseconds (us) for virtual machine worldGroupID : 5599, virtual disk handleID : 8194 (ide0:0) {
 min : 3203
 max : 7280
 mean : 6916
 count : 13
   {
      0                  (<=                  1)
      0                  (<=                 10)
      0                  (<=                100)
      0                  (<=                500)
      0                  (<=               1000)
      1                  (<=               5000)
      12                 (<=              15000)
      0                  (<=              30000)
      0                  (<=              50000)
      0                  (<=             100000)
      0                  (>              100000)
   }
}

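In the example above, 12 of the 13 I/Os fall in the bucket above 5000 and up to 15000 microseconds, i.e. between 5 and 15 milliseconds. The mean of 6916 microseconds and the max of 7280 microseconds show that most I/Os actually completed in around 7 milliseconds.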
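Reading the dominant bucket off a histogram can also be automated. The sketch below is an illustration only: busiest_bucket_ms is a hypothetical name, it assumes a single latency histogram in the console format shown above, and it treats each bucket's lower bound as the previous bucket's upper limit.

```sh
# Hypothetical helper: read one latency histogram (microseconds) on stdin
# and report the bucket holding the most I/Os, converted to milliseconds.
busiest_bucket_ms() {
    awk '
        {
            line = $0
            # A line may hold more than one "count (<= limit)" entry.
            while (match(line, /[0-9]+ +\((<=|>) +[0-9]+\)/)) {
                entry = substr(line, RSTART, RLENGTH)
                line = substr(line, RSTART + RLENGTH)
                split(entry, f, /[ ()]+/)   # f[1]=count, f[2]=operator, f[3]=limit
                total += f[1]
                if (f[1] + 0 > best) {
                    best = f[1] + 0
                    open_ended = (f[2] == ">")
                    low = open_ended ? f[3] : prev
                    high = f[3]
                }
                prev = f[3]
            }
        }
        END {
            if (best == 0) exit 1   # no I/Os recorded
            if (open_ended)
                printf "%d of %d I/Os took more than %g ms\n", best, total, low / 1000
            else
                printf "%d of %d I/Os took between %g and %g ms\n", best, total, low / 1000, high / 1000
        }
    '
}
```

Because a bucket only gives a range, the min, max and mean lines printed above the histogram remain the better guide to where in that range the I/Os actually sit.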
