Saturday, September 25, 2004

IOSTAT - Checking Disk Performance

The iostat command is used for monitoring system input/output device loading by observing the time the physical disks are active in relation to their average transfer rates. It is similar both in format and in use to the vmstat command. The first line of iostat reflects a summary of statistics since boot time. Each subsequent report covers the time since the previous report. All statistics are reported each time the iostat command is run. The report consists of a tty and CPU header row followed by a row of tty and CPU statistics.

iostat 10 5

When this command is run, iostat gathers data for 10 seconds and then reports a single set of statistics. It repeats this 5 times. An interval of at least 5 seconds is needed for good accuracy.

               extended disk statistics                    tty        cpu
disk   r/s  w/s  Kr/s  Kw/s  wait  actv  svc_t  %w  %b   tin tout  us sy wt id
sd0    2.6  3.0  20.7  22.7   0.1   0.2   59.2   6  19     0   84   3 85 11  0
sd1    4.2  1.0  33.5   8.0   0.0   0.2   47.2   2  23
sd2    0.0  0.0   0.0   0.0   0.0   0.0    0.0   0   0
sd3   10.2  1.6  51.4  12.8   0.1   0.3   31.2   3  31

A description of the information reported is:

* disk: Disk device name.
* r/s, w/s: Average reads/writes per second.
* Kr/s, Kw/s: Average kilobytes read/written per second.
* wait: Average number of requests waiting for service (the length of the wait queue).
* actv: Number of active requests in the hardware queue.
* %w: Occupancy of the wait queue.
* %b: Occupancy of the active queue with the device busy.
* svc_t: Service time (ms). Includes everything: wait time, active queue time, seek, rotation, and transfer time.
* us/sy: User/system CPU time (%).
* wt: Wait for I/O (%).
* id: Idle time (%).
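
To make the field list above concrete, here is a minimal Python sketch that reads the disk rows from the sample report and sorts them by %b. The function names (parse_disk_rows, busiest_first) and the column list are made up for this example; the column order is assumed to match the sample header, and other iostat versions may lay the report out differently.

```python
# Column names follow the "extended disk statistics" header in the sample.
# This order is an assumption; check it against your own iostat output.
DISK_COLUMNS = ["disk", "r/s", "w/s", "Kr/s", "Kw/s",
                "wait", "actv", "svc_t", "%w", "%b"]


def parse_disk_rows(lines):
    """Turn iostat disk rows into dicts keyed by column name.

    Only the first ten fields are read, so the tty/cpu values that
    trail the first disk row are ignored.
    """
    rows = []
    for line in lines:
        fields = line.split()
        # Skip blank lines, the banner line, and the column header row.
        if len(fields) < len(DISK_COLUMNS) or fields[0] == "disk":
            continue
        row = {"disk": fields[0]}
        for name, value in zip(DISK_COLUMNS[1:], fields[1:]):
            row[name] = float(value)
        rows.append(row)
    return rows


def busiest_first(rows):
    """Sort disks so the one with the highest %b comes first."""
    return sorted(rows, key=lambda r: r["%b"], reverse=True)


sample = [
    "extended disk statistics tty cpu",
    "disk r/s w/s Kr/s Kw/s wait actv svc_t %w %b tin tout us sy wt id",
    "sd0 2.6 3.0 20.7 22.7 0.1 0.2 59.2 6 19 0 84 3 85 11 0",
    "sd1 4.2 1.0 33.5 8.0 0.0 0.2 47.2 2 23",
    "sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0",
    "sd3 10.2 1.6 51.4 12.8 0.1 0.3 31.2 3 31",
]

for row in busiest_first(parse_disk_rows(sample)):
    print(row["disk"], row["%b"], row["svc_t"])
```

With the sample data, sd3 comes out on top at 31% busy, while sd0 has the longest service time (59.2 ms), so it is worth reading both columns rather than one.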

With this information in hand, an admin can go about tuning the environment to better suit the needs of the business. There are some good tutorials on tuning and interpreting the results at the Princeton University site, the Admins Choice site, the softlookup site, and the AIX Performance Tuning Guide (I know this is for an AIX system, but the same ideas apply here as well).

*********************************** UPDATE ********************
The sysstat home page can be found here. This package includes iostat and some other utilities for monitoring Linux.

Tuesday, September 21, 2004

VMSTAT - Virtual Memory Statistics

Sticking with the theme of performance monitoring, I'm going to cover some of the other utilities used to monitor a system's and network's performance. The vmstat utility gives a server-wide view of performance. vmstat reports information about processes, memory, paging, block IO, traps, and cpu activity. The first report produced gives averages since the last reboot. Additional reports give information on a sampling period of length delay. The process and memory reports are instantaneous.

vmstat 10 5

When this command is run, vmstat gathers data for 10 seconds and then reports a single line of statistics. It repeats this 5 times. An interval of at least 5 seconds is needed for good accuracy. Example output:

procs   memory              page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 22478  1677   0   0   0   0    0   0 188 1380 157 57 32  0 10
 1  0 22506  1609   0   0   0   0    0   0 214 1476 186 48 37  0 16
 0  0 22498  1582   0   0   0   0    0   0 248 1470 226 55 36  0  9
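
If you want to work with these numbers in a script, note that the header repeats "sy": once under faults and once under cpu. A rough Python sketch of one way to handle that is below. The names parse_vmstat, faults_sy, and cpu_sy are invented for this example, and the column order is assumed to match the sample output above, so adjust it for your platform.

```python
# Column names for the sample output above. The two "sy" columns are
# renamed so they do not overwrite each other in a dict; the names
# faults_sy and cpu_sy are made up for this sketch.
VMSTAT_COLUMNS = [
    "r", "b", "avm", "fre", "re", "pi", "po", "fr", "sr", "cy",
    "in", "faults_sy", "cs", "us", "cpu_sy", "id", "wa",
]


def parse_vmstat(lines):
    """Return one dict per data row, skipping headers and separators."""
    samples = []
    for line in lines:
        fields = line.split()
        # Data rows are all-numeric and have exactly one value per column.
        if len(fields) != len(VMSTAT_COLUMNS):
            continue
        if not all(f.isdigit() for f in fields):
            continue
        samples.append(dict(zip(VMSTAT_COLUMNS, map(int, fields))))
    return samples


sample = [
    "procs memory page faults cpu",
    "----- ----------- ------------------------ ------------ -----------",
    "r b avm fre re pi po fr sr cy in sy cs us sy id wa",
    "1 0 22478 1677 0 0 0 0 0 0 188 1380 157 57 32 0 10",
    "1 0 22506 1609 0 0 0 0 0 0 214 1476 186 48 37 0 16",
    "0 0 22498 1582 0 0 0 0 0 0 248 1470 226 55 36 0 9",
]

for s in parse_vmstat(sample):
    print(s["r"], s["sr"], s["us"], s["cpu_sy"], s["wa"])
```

Zipping the header line directly against each row would silently drop one of the two sy values, which is why the sketch uses an explicit column list instead.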


FIELD DESCRIPTIONS (These field descriptions came straight from the Linux man page. The sample output above uses AIX-style column names, so some of the headings differ.)
Procs
r: The number of processes waiting for run time.
b: The number of processes in uninterruptible sleep.
w: The number of processes swapped out but otherwise runnable. This
field is calculated, but Linux never desperation swaps.

Memory
swpd: the amount of virtual memory used (kB).
free: the amount of idle memory (kB).
buff: the amount of memory used as buffers (kB).

Swap
si: Amount of memory swapped in from disk (kB/s).
so: Amount of memory swapped to disk (kB/s).

IO
bi: Blocks received from a block device (blocks/s).
bo: Blocks sent to a block device (blocks/s).

System
in: The number of interrupts per second, including the clock.
cs: The number of context switches per second.

CPU
These are percentages of total CPU time.
us: user time
sy: system time
id: idle time

For CPU issues, pay attention to the processes in the run queue (procs r), user time (cpu us), system time (cpu sy), and idle time (cpu id). If the r column is higher than the number of CPUs in the machine, adding more CPUs would help. From here you can find out what is using your CPU time with top.
For memory issues, pay attention to the scan rate (sr), which is the number of pages scanned by the clock algorithm per second. If the scan rate is continuously over 200 pages per second, there is a memory shortage. Remember that the free column will dwindle as memory is used for I/O cache and buffers; this is normal.
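
The two rules of thumb above can be written down as a small Python sketch. The function names are made up for this example, and treating "continuously" as "in every sample collected" is my own assumption; the r and sr values are copied from the three sample rows shown earlier.

```python
SCAN_RATE_LIMIT = 200  # pages per second, the figure quoted above


def cpu_starved(samples, cpu_count):
    """True if the run queue (r) exceeds the CPU count in every sample."""
    return bool(samples) and all(s["r"] > cpu_count for s in samples)


def memory_short(samples, limit=SCAN_RATE_LIMIT):
    """True if the scan rate (sr) stays above the limit in every sample."""
    return bool(samples) and all(s["sr"] > limit for s in samples)


# r and sr from the three sample vmstat rows shown earlier.
samples = [{"r": 1, "sr": 0}, {"r": 1, "sr": 0}, {"r": 0, "sr": 0}]

print(cpu_starved(samples, cpu_count=1))  # False: r never exceeds 1
print(memory_short(samples))              # False: sr is 0 throughout
```

Three samples is very little to go on. In practice you would want a longer run before deciding that either condition really holds.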

These statistics will go a long way toward giving you an idea of how well your system is responding to requests, and if there are any bottlenecks, they will give clues as to where to start looking for the problem areas. Hopefully in the next couple of days I can also cover the iostat and netstat commands to help give you an overall look at a system and how well it is working.