Tuesday, September 9, 2008

Linux Performance Metrics

Each set of performance metrics is averaged over an interval of one second.

CPU

The agent uses the sar -urWqR 1 command to compare the system counters during a one-second interval. The statistics that the agent returns are averaged for all CPUs on the system.

| Metric | Explanation | Source |
| --- | --- | --- |
| % USR | The percentage of time that the processor spends in user mode (a processing mode for applications and subsystems). | /proc/stat |
| % SYS | The percentage of time that the processor spends in kernel mode processing system calls. | /proc/stat |
| % WIO | The percentage of time that the processor is idle while a process waits for a device to complete an I/O operation. | /proc/stat |
| % Total | The sum of % USR, % SYS, and % WIO. | /proc/stat |
| Run Queue Length | The number of services or processes that are waiting to be served by the CPU. | /proc/loadavg |
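To illustrate how comparing counters across a one-second interval yields these percentages, here is a minimal sketch that works on two samples of the aggregate cpu line from /proc/stat. It is not the agent's implementation: the function name cpu_percentages and the sample values are hypothetical, and it assumes a kernel that reports the iowait counter (2.6 or later).

```python
def cpu_percentages(before: str, after: str) -> dict:
    """Return % USR, % SYS, % WIO and % Total between two 'cpu' lines."""

    def counters(line: str) -> list:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
        # Field order: user, nice, system, idle, iowait, irq, softirq.
        values = [int(v) for v in parts[1:8]]
        if len(values) < 5:
            raise ValueError("line has no iowait counter")
        return values

    old, new = counters(before), counters(after)
    delta = [n - o for o, n in zip(old, new)]
    elapsed = sum(delta)  # ticks spent in all states during the interval
    if elapsed <= 0:
        return {"usr": 0.0, "sys": 0.0, "wio": 0.0, "total": 0.0}

    usr = 100.0 * delta[0] / elapsed
    sys_pct = 100.0 * delta[2] / elapsed
    wio = 100.0 * delta[4] / elapsed
    return {"usr": usr, "sys": sys_pct, "wio": wio, "total": usr + sys_pct + wio}


# Hypothetical samples taken one second apart.
SAMPLE_BEFORE = "cpu 1000 0 500 8000 500 0 0"
SAMPLE_AFTER = "cpu 1030 0 510 8050 510 0 0"
print(cpu_percentages(SAMPLE_BEFORE, SAMPLE_AFTER))
```

Because the aggregate cpu line already sums every processor, dividing by the total elapsed ticks gives a figure averaged over all CPUs.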

Multi CPU

The agent uses the sar and mpstat utilities to collect the metrics in the table below from Linux systems with multiple CPUs. The agent averages the statistics from each CPU using the sar -x SELF -I SUM -P ALL -wu 1 command, which compares the system counters during a one-second interval. The agent returns the statistics for the entire system, broken down per CPU.

| Metric | Explanation |
| --- | --- |
| User % | The percentage of CPU time that is used by user processes. |
| System % | The percentage of CPU time that is used by kernel processes. |
| Wait I/O % | The percentage of time that a runnable process must wait for a device to perform an I/O operation. |
| SMTX | The number of read or write locks that a thread was not able to acquire on the first attempt, as reported by the mpstat command. |
| XCAL | The number of interprocessor cross-calls. In a multi-processor environment, one processor sends cross-calls to another processor to get that processor to do work. Cross-calls can also be used to ensure consistency in virtual memory. Heavy file system activity such as NFS can result in a high number of cross-calls. |
| Interrupts | The number of CPU interrupts. |
| Total % | The sum of User %, System %, and Wait I/O %. |

Memory

The agent uses the sar -urWqR 1 command to collect memory metrics from a Linux system. The command compares the system counters during a one-second interval. The statistics that the agent returns are for the entire system.

| Metric | Explanation | Source |
| --- | --- | --- |
| Free Memory | The amount of physical memory available to the operating system, system library files, and applications. | /proc/meminfo |
| Cache Hit Rate | How often the system accesses the CPU cache. | /proc/meminfo |
| PageOut per Second | The rate at which pages were written to disk. | /proc/meminfo |
| PageIn per Second | The rate at which pages were read from disk. | /proc/meminfo |
| PageFree per Second | The number of pages that are freed from memory each second. | /proc/meminfo |
| PageScan per Second | The average number of pages that are scanned each second. | /proc/meminfo |
| Free Swap | The amount of free swap space, as a percentage of total swap space. | /proc/meminfo |
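As a rough illustration of where the Free Memory and Free Swap figures can come from, the sketch below parses text in the /proc/meminfo format and derives free swap as a percentage of total swap. The helper names parse_meminfo and free_swap_percent and the sample values are hypothetical, and this is not necessarily how the agent or sar computes them.

```python
def parse_meminfo(text: str) -> dict:
    """Map each /proc/meminfo field name to its integer value (usually in kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        tokens = rest.split()
        if sep and tokens:
            values[name.strip()] = int(tokens[0])
    return values


def free_swap_percent(meminfo: dict) -> float:
    """Free swap as a percentage of total swap; 0.0 when no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * meminfo.get("SwapFree", 0) / total


# Hypothetical excerpt; on a live system, read the text from /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:        2048000 kB
MemFree:          512000 kB
SwapTotal:       1000000 kB
SwapFree:         750000 kB
"""

info = parse_meminfo(SAMPLE_MEMINFO)
print(info["MemFree"], free_swap_percent(info))
```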

Disk

The agent gathers file system statistics for each file system using the df -lk command. Disk statistics, such as % busy, reads per second, and writes per second, are output per disk and compared between polling intervals using the iostat -d -x 1 2 command.

| Metric | Explanation | Source |
| --- | --- | --- |
| Disk (Spindle) Name | The name of each disk on the system. | /proc/diskstats |
| Usage (% Busy) | The percentage of time during which the disk drive is handling read or write requests. | /proc/diskstats |
| Throughput (Blk/s) | The average number of blocks that are transferred to or from the disk each second during read or write operations. | /proc/diskstats |
| Read/Writes/s | The number of read and write operations on the disk that occur each second. | /proc/diskstats |
| Average Queue Length | The average number of requests that are waiting in the queue for the disk. | /proc/diskstats |
| Average Service Time | The average amount of time, in milliseconds, that is required for a request to be carried out. | /proc/diskstats |
| Average Wait Time | The average time, in milliseconds, that a transaction is waiting in a queue. The wait time is directly proportional to the length of the queue. | /proc/diskstats |
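The comparison between polling intervals can be sketched as a subtraction of two /proc/diskstats samples for the same device. The code below is an approximation of the arithmetic that iostat-style tools perform, not the agent's implementation; the function name disk_rates and the sample lines are hypothetical, and it assumes the 14-column layout of 2.6 kernels.

```python
def disk_rates(before: str, after: str, interval_s: float = 1.0) -> dict:
    """Derive per-disk rates from two /proc/diskstats lines for one device."""

    def counters(line: str):
        parts = line.split()
        # parts[0:2] are major/minor numbers, parts[2] is the device name.
        return parts[2], [int(v) for v in parts[3:14]]

    name_old, old = counters(before)
    name_new, new = counters(after)
    if name_old != name_new:
        raise ValueError("samples belong to different devices")

    d = [n - o for o, n in zip(old, new)]
    # Indexes: 0 reads, 2 sectors read, 3 ms reading, 4 writes,
    # 6 sectors written, 7 ms writing, 9 ms doing I/O, 10 weighted ms.
    ios = d[0] + d[4]
    interval_ms = interval_s * 1000.0
    return {
        "name": name_new,
        "reads_per_s": d[0] / interval_s,
        "writes_per_s": d[4] / interval_s,
        "blocks_per_s": (d[2] + d[6]) / interval_s,
        "busy_pct": 100.0 * d[9] / interval_ms,
        "avg_queue_len": d[10] / interval_ms,
        "avg_service_ms": d[9] / ios if ios else 0.0,
        "avg_wait_ms": (d[3] + d[7]) / ios if ios else 0.0,
    }


# Hypothetical samples taken one second apart.
DISK_BEFORE = "8 0 sda 1000 10 80000 4000 500 5 40000 3000 0 5000 7000"
DISK_AFTER = "8 0 sda 1060 10 80960 4240 540 5 40640 3260 0 5500 7500"
print(disk_rates(DISK_BEFORE, DISK_AFTER))
```

In this sample the disk spent 500 ms of the 1000 ms interval doing I/O, so it is reported as 50% busy.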

Network

The agent uses the netstat -s command to get a combined total of TCP Retransmits for all network interfaces. Other network statistics, such as Kbps, errors, and collisions, are averaged per interface using the sar -n DEV -n EDEV 1 command, which compares the system counters during a one-second interval.

| Metric | Explanation | Source |
| --- | --- | --- |
| In Kbps | The rate, in kilobytes per second, at which data is received over a specific network adapter. | /proc/net |
| Out Kbps | The rate, in kilobytes per second, at which data is sent over a specific network adapter. | /proc/net |
| In Errors | The number of inbound packets that contained errors, which prevented those packets from being delivered to a higher-layer protocol. | /proc/net |
| Out Errors | The number of outbound packets that could not be transmitted because of errors. | /proc/net |
| Collisions | The number of signals from two separate nodes on the network that have collided. | /proc/net |
| TCP Retransmits | The number of packets that have been re-sent over a network interface. The agent returns a combined total for all interfaces. | /proc/net |
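The per-interface figures can be approximated by subtracting two samples of the interface counters, as in this sketch based on the /proc/net/dev layout. The function names parse_net_dev and interface_rates and the sample data are hypothetical, and the conversion assumes 1 kilobyte = 1024 bytes; treat it as an illustration rather than the agent's method.

```python
def parse_net_dev(text: str) -> dict:
    """Parse /proc/net/dev-style text into per-interface counters."""
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) < 16:
            continue  # skips the two header lines, which contain no colon
        nums = [int(v) for v in fields[:16]]
        stats[name.strip()] = {
            "rx_bytes": nums[0], "rx_errs": nums[2],
            "tx_bytes": nums[8], "tx_errs": nums[10],
            "collisions": nums[13],
        }
    return stats


def interface_rates(before: dict, after: dict, interval_s: float = 1.0) -> dict:
    """Compare two parsed samples and return rates and counts per interface."""
    rates = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue  # interface appeared between the two samples
        rates[name] = {
            "in_kbytes_per_s": (new["rx_bytes"] - old["rx_bytes"]) / 1024.0 / interval_s,
            "out_kbytes_per_s": (new["tx_bytes"] - old["tx_bytes"]) / 1024.0 / interval_s,
            "in_errors": new["rx_errs"] - old["rx_errs"],
            "out_errors": new["tx_errs"] - old["tx_errs"],
            "collisions": new["collisions"] - old["collisions"],
        }
    return rates


# Hypothetical samples taken one second apart.
HEADER = (
    "Inter-|   Receive                          |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast"
    "|bytes packets errs drop fifo colls carrier compressed\n"
)
NET_BEFORE = HEADER + "  eth0: 1000000 900 1 0 0 0 0 0 500000 800 0 0 0 2 0 0\n"
NET_AFTER = HEADER + "  eth0: 1102400 1000 3 0 0 0 0 0 551200 880 0 0 0 5 0 0\n"
print(interface_rates(parse_net_dev(NET_BEFORE), parse_net_dev(NET_AFTER)))
```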

Process

The agent uses the ps -eo command to collect the process information listed in the table below from a Linux system. By default, the agent gathers only the top 20 processes, sorted by highest CPU usage.

| Metric | Explanation | Source |
| --- | --- | --- |
| PID | The unique identifier of a specific process. | /proc/stat |
| PPID | The identifier of the parent process that started the process. | /proc/stat |
| UID | A value that identifies the user who owns the process. | /proc/stat |
| GID | A value that identifies a group of users. | /proc/stat |
| Memory Consumed | The amount of memory that is being used by a process. | /proc/stat |
| RSS | The amount of physical memory that is being used by a process. | /proc/stat |
| CPU % Utilization by Process | The percentage of CPU time that is being used by individual processes. | /proc/stat |
| Memory % Utilization by Process | The percentage of physical memory that is being used by individual processes. | /proc/stat |
| Process Start Time | The time at which the process started. | /proc/stat |
| Process Run Time | The length of time for which the process has been running. | /proc/stat |
| Number of Processes Running | The total number of processes that are currently running on the system. | /proc/stat |
| Number of Blocked Processes | The total number of processes that are blocked while waiting for a resource. | /proc/stat |
| Number of Waiting Processes | The total number of processes that are waiting to be executed by the CPU. | /proc/stat |
| Execs per Second | The total number of system calls that are executed each second. | /proc/stat |
| Process Creation Rate | The total number of processes that are being spawned over a specified time period. | /proc/stat |
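Selecting the top 20 processes by CPU usage can be sketched as a parse-and-sort over ps output. The source does not give the column list passed to ps -eo, so the columns in the sample below are an assumption, as are the function name top_processes and the sample rows; the only requirement of this sketch is that the output has a header line containing a %CPU column.

```python
def top_processes(ps_output: str, limit: int = 20, sort_key: str = "%CPU") -> list:
    """Return the `limit` rows with the highest `sort_key`, as dictionaries."""
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # The command is the last column and may contain spaces, so cap the split.
        values = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, values)))
    rows.sort(key=lambda row: float(row[sort_key]), reverse=True)
    return rows[:limit]


# Hypothetical output; the column set is assumed, not taken from the agent.
SAMPLE_PS = """\
  PID  PPID   UID   GID   RSS %CPU %MEM COMMAND
    1     0     0     0  1200  0.0  0.1 init
 2301     1    48    48 30400 12.5  1.5 httpd
 2417     1    27    27 80500 40.2  3.9 mysqld
"""

for row in top_processes(SAMPLE_PS, limit=2):
    print(row["PID"], row["%CPU"], row["COMMAND"])
```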

Workload

The agent uses the ps utility to collect workload information from a Linux system. Workload statistics are sorted within the product's core, but they are drawn from the same 20 processes that were gathered by the Process method. The workload data that the agent gathers includes the user, group, and process name of each process, along with their individual statistics. The core then sorts the statistics based on the graph that you want to generate (for example, by user, group, or process name).

| Metric | Explanation | Source |
| --- | --- | --- |
| Workload by Process | The demand that network and local services are putting on a system, based on the processes that are running. | /proc/load |
| Workload by User | The demand that network and local services are putting on the system, based on the IDs of the users who are logged into a system. | /proc/load |
| Workload by Group | The demand that network and local services are putting on the system, based on the IDs of the user groups that are logged into a system. | /proc/load |
| Workload Top 10 by Process | The 10 processes that are consuming the most CPU resources. | /proc/load |
| Workload Top 10 by User | The 10 processes that are consuming the most CPU resources, based on user ID. | /proc/load |
| Workload Top 10 by Group | The 10 processes that are consuming the most CPU resources, based on group ID. | /proc/load |
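One plausible way to turn the gathered process list into the by-user, by-group, and by-process views is to sum CPU usage per key and keep the largest totals. The sketch below shows that idea only; the function name workload_by, the record fields, and the sample values are hypothetical, and the product's actual sorting logic is not documented here.

```python
from collections import defaultdict


def workload_by(processes: list, key: str, top: int = 10) -> list:
    """Sum CPU usage per `key` ('user', 'group' or 'name') and rank the totals."""
    totals = defaultdict(float)
    for proc in processes:
        totals[proc[key]] += proc["cpu"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Hypothetical process records, standing in for the gathered top processes.
SAMPLE_PROCESSES = [
    {"user": "apache", "group": "apache", "name": "httpd", "cpu": 12.5},
    {"user": "apache", "group": "apache", "name": "httpd", "cpu": 2.5},
    {"user": "mysql", "group": "mysql", "name": "mysqld", "cpu": 40.0},
    {"user": "root", "group": "root", "name": "init", "cpu": 0.0},
]

print(workload_by(SAMPLE_PROCESSES, "user"))
print(workload_by(SAMPLE_PROCESSES, "name", top=1))
```

Gathering once and regrouping afterwards means the same sample can feed all three graphs without running ps again.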

User

The agent uses the following commands to collect user statistics from a system:

  • ps -eo
  • last | head -n 10 (login history for the last 10 users on the system)
  • who (lists who is currently logged into the system)

| Metric | Explanation |
| --- | --- |
| Login History | The number of times that a user has logged into a system during any 30-minute time interval. |
| Sessions | The number of sessions, or the number of distinct users, logged into a system during any 30-minute time interval. |
