How to Set Up Collectl – Linux Performance Monitoring

Collectl is a Linux performance monitoring tool that gathers as much detail as possible from the /proc filesystem and covers more ground than most other tools. Compared to sar, collectl has some capabilities that sar lacks. It can gather and post-process performance data, and it can also save that data for later analysis. This guide shows how to set up collectl on CentOS 6.5 and walks through some sample usage :

1. Install collectl :
a. For Red Hat based distros (make sure the additional EPEL repository has been installed first) :

[root@oss ~]# yum install collectl -y

b. For Debian based distros :

[root@oss ~]# sudo apt-get install collectl -y
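
If the same setup has to run on both distribution families, the choice between the two commands above can be scripted. The following is only a minimal sketch and not part of the original guide: the function name collectl_install_cmd is hypothetical, and it prints the command it would use rather than installing anything.

```shell
#!/bin/sh
# Print the collectl install command that matches the package manager found
# on this host. Nothing is installed here; the command is only echoed.
collectl_install_cmd() {
    if command -v yum >/dev/null 2>&1; then
        echo "yum install -y collectl"        # Red Hat based (EPEL must be present)
    elif command -v apt-get >/dev/null 2>&1; then
        echo "apt-get install -y collectl"    # Debian based
    else
        echo "unsupported"
        return 1
    fi
}

if command -v collectl >/dev/null 2>&1; then
    echo "collectl is already installed"
else
    collectl_install_cmd || echo "no yum or apt-get found" >&2
fi
```

Review the printed command, then run it as root.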

2. Display collectl command help :

[root@oss ~]# collectl -h
This is a subset of the most common switches and even the descriptions are
abbreviated.  To see all type 'collectl -x', to get started just type 'collectl'

usage: collectl [switches]
  -c, --count      count      collect this number of samples and exit
  -f, --filename   file       name of directory/file to write to
  -i, --interval   int        collection interval in seconds [default=1]
  -o, --options    options    misc formatting options, --showoptions for all
                                d|D - include date in output
                                  T - include time in output
                                  z - turn off compression of plot files
  -p, --playback   file       playback results from 'file' (be sure to quote
                              if wild carded) or the shell might mess it up
  -P, --plot                  generate output in 'plot' format
  -s, --subsys     subsys     specify one or more subsystems [default=cdn]
      --verbose               display output in verbose format (automatically
                              selected when brief doesn't make sense)

Various types of help
  -h, --help                  print this text
  -v, --version               print version
  -V, --showdefs              print operational defaults
  -x, --helpextend            extended help, more details descriptions too
  -X, --helpall               shows all help concatenated together

  --showoptions               show all the options
  --showsubsys                show all the subsystems
  --showsubopts               show all subsystem specific options
  --showtopopts               show --top options

  --showheader                show file header that 'would be' generated
  --showcolheaders            show column headers that 'would be' generated
  --showslabaliases           for SLUB allocator, show non-root aliases
  --showrootslabs             same as --showslabaliases but use 'root' names

Copyright 2003-2014 Hewlett-Packard Development Company, L.P.
collectl may be copied only under the terms of either the Artistic License
or the GNU General Public License, which may be found in the source kit
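
The -f and -p switches listed above are what make the "save now, analyze later" workflow possible. As a rough sketch, assuming a writable scratch directory (the path below is hypothetical) and assuming the recorded file names match the pattern *.raw*, recording and playback could look like this:

```shell
#!/bin/sh
# Record ten 1-second samples to a directory, then play them back.
OUTDIR="${TMPDIR:-/tmp}/collectl-demo"    # hypothetical scratch directory

if command -v collectl >/dev/null 2>&1; then
    mkdir -p "$OUTDIR"
    # -f: write the samples to a file under OUTDIR instead of the terminal
    collectl -scdn -i 1 -c 10 -f "$OUTDIR"
    # -p: play back the recorded data; the wildcard is quoted so that
    # collectl, not the shell, expands it (as the help text advises)
    collectl -p "$OUTDIR/*.raw*" -oT
else
    echo "collectl not found; install it first" >&2
fi
```

The -oT option on playback adds a time column, which makes it easier to line samples up with other logs.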

3. According to the man page, collectl identifies the following subsystems :

SUMMARY SUBSYSTEMS

Option Description
b Buddy information (memory fragmentation)
c CPU information
d Disk
f NFS information
i inode information
j Interrupts
l Lustre
m Memory
n Networks
s Sockets
t TCP
x Interconnect
y Slabs

DETAIL SUBSYSTEMS

Option Description
C CPU
D Disk
E Environmentals via ipmitool
F NFS data
J Interrupts
M Memory node data (including numa)
N Networks
T Sixty-five TCP counters (only in plot format)
X Interconnect
Y Slabs
Z Processes
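
The letters in the two lists above can be combined after -s, and the -i, -c and -o switches from the help text in step 2 control how sampling runs. A small sketch, assuming you want summary CPU, memory and disk data (the variable names are illustrative only):

```shell
#!/bin/sh
SUBSYS="cmd"     # c = CPU, m = memory, d = disk (summary level)
INTERVAL=2       # seconds between samples (-i)
COUNT=5          # number of samples before exiting (-c)
CMD="collectl -s${SUBSYS} -i ${INTERVAL} -c ${COUNT} -oT"   # -oT adds a time column

echo "Running: $CMD"
if command -v collectl >/dev/null 2>&1; then
    $CMD
else
    echo "collectl not found; install it first" >&2
fi
```

Swapping a lowercase letter for its uppercase counterpart (for example d for D) asks for the detail view of that subsystem instead of the summary.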

4. Monitor the CPU subsystem (-sc for the summary, -sC for per-CPU detail) :

[root@oss ~]# collectl -sc
waiting for 1 second sample...
#<--------CPU-------->
#cpu sys inter  ctxsw
   0   0    34     37
   0   0    56     40
   0   0    38     44
   0   0    31     35
   0   0    36     44
[root@oss ~]# collectl -sC
waiting for 1 second sample...

# SINGLE CPU STATISTICS
#   Cpu  User Nice  Sys Wait IRQ  Soft Steal Idle
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      0     0    0    0    0    0    0     0  100
      1     0    0    0    4    0    0     0   96
      0     0    0    0    0    0    0     0   99
      1     0    0    0    0    0    0     0   99
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      0     0    0    0    0    0    0     0  100
      1     1    0    0    0    0    0     0   99
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
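
Per-CPU detail scrolls by quickly, so it can help to reduce it to one number per CPU. The following is a sketch only: the function name avg_idle_per_cpu is hypothetical, and it assumes the column layout shown above, where the last column is Idle.

```shell
#!/bin/sh
# Average the Idle column per CPU from `collectl -sC` style output on stdin.
avg_idle_per_cpu() {
    awk '
        /^#/ || NF == 0 { next }      # skip header and blank lines
        !/^[ 0-9]+$/    { next }      # skip text lines such as "waiting for..."
        { sum[$1] += $NF; n[$1]++ }   # $1 = CPU number, $NF = Idle
        END {
            for (c in sum) printf "cpu%s avg_idle=%.1f\n", c, sum[c] / n[c]
        }
    ' | sort
}

# A few lines copied from the sample output above
avg_idle_per_cpu <<'EOF'
# SINGLE CPU STATISTICS
#   Cpu  User Nice  Sys Wait IRQ  Soft Steal Idle
      0     0    0    0    0    0    0     0  100
      1     0    0    0    4    0    0     0   96
      0     0    0    0    0    0    0     0   99
      1     0    0    0    0    0    0     0   99
EOF
# prints:
# cpu0 avg_idle=99.5
# cpu1 avg_idle=97.5
```

On a live system the same function can read from a bounded run, for example collectl -sC -c 10 | avg_idle_per_cpu.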

5. Monitor the memory subsystem (-sm for the summary, -sM for per-node detail) :

[root@oss ~]# collectl -sm
waiting for 1 second sample...
#<-----------Memory----------->
#Free Buff Cach Inac Slab  Map
   3G  19M 166M  50M  36M  34M
   3G  19M 166M  50M  36M  34M
   3G  19M 166M  50M  36M  34M
   3G  19M 166M  50M  36M  34M
   3G  19M 166M  50M  36M  34M
   3G  19M 166M  50M  36M  34M
   3G  19M 166M  50M  36M  34M
   3G  19M 166M  50M  36M  34M
[root@oss ~]# collectl -sM
waiting for 1 second sample...

# MEMORY STATISTICS
# Node    Total     Used     Free     Slab   Mapped     Anon   Locked    Inact   Hit%
     0    4095M  565208K    3543M   37112K    9408K   25492K        0   51756K 100.00
     0    4095M  565208K    3543M   37112K    9408K   25492K        0   51756K 100.00
     0    4095M  565208K    3543M   37108K    9408K   25492K        0   51756K 100.00
     0    4095M  565208K    3543M   37108K    9408K   25492K        0   51756K 100.00
     0    4095M  565208K    3543M   37108K    9408K   25492K        0   51756K 100.00
     0    4095M  565208K    3543M   37108K    9408K   25492K        0   51760K 100.00
     0    4095M  565208K    3543M   37036K    9408K   25492K        0   51760K 100.00
     0    4095M  565184K    3543M   37028K    9408K   25492K        0   51760K 100.00
     0    4095M  565184K    3543M   37028K    9408K   25492K        0   51760K 100.00
     0    4095M  565184K    3543M   37024K    9408K   25492K        0   51760K 100.00
     0    4095M  565184K    3543M   37024K    9408K   25492K        0   51760K 100.00
     0    4095M  565184K    3543M   37016K    9408K   25492K        0   51760K 100.00
     0    4095M  565168K    3543M   36972K    9408K   25492K        0   51760K 100.00
     0    4095M  565168K    3543M   36972K    9408K   25492K        0   51760K 100.00
     0    4095M  565168K    3543M   36972K    9408K   25492K        0   51760K 100.00
     0    4095M  565168K    3543M   36968K    9408K   25492K        0   51760K 100.00
     0    4095M  565160K    3543M   36932K    9408K   25492K        0   51760K 100.00
     0    4095M  565160K    3543M   36932K    9408K   25492K        0   51760K 100.00
     0    4095M  565160K    3543M   36932K    9408K   25492K        0   51760K 100.00
     0    4095M  565160K    3543M   36900K    9408K   25492K        0   51760K 100.00
     0    4095M  565160K    3543M   36900K    9408K   25492K        0   51760K 100.00
     0    4095M  565160K    3543M   36900K    9408K   25492K        0   51760K 100.00
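
The Total and Used columns above use different units (M and K), which makes the ratio hard to read at a glance. Here is a small sketch that normalizes the units and computes the percentage in use. The helper names to_kb and used_pct are hypothetical, and the sketch assumes the K, M and G suffixes are 1024-based multiples.

```shell
#!/bin/sh
# Convert a collectl size such as 565208K, 4095M or 3G to kilobytes.
to_kb() {
    awk -v v="$1" 'BEGIN {
        n = v + 0                      # numeric prefix of the value
        u = substr(v, length(v))       # trailing unit letter, if any
        if (u == "G") n *= 1024 * 1024
        else if (u == "M") n *= 1024
        printf "%d\n", n
    }'
}

# Percentage of memory in use, given the Total and Used columns.
used_pct() {
    awk -v t="$(to_kb "$1")" -v u="$(to_kb "$2")" \
        'BEGIN { printf "%.1f\n", 100 * u / t }'
}

# Values taken from the first sample line above
used_pct 4095M 565208K    # prints 13.5
```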

6. Monitor the disk subsystem (-sd for the summary, -sD for per-disk detail) :

[root@oss ~]# collectl -sd
waiting for 1 second sample...
#<----------Disks----------->
#KBRead  Reads KBWrit Writes
      0      0      0      0
      0      0      0      0
      0      0      0      0
      0      0      0      0
      0      0     16      3
      0      0      0      0
      0      0      0      0
      0      0      0      0
      0      0      0      0
      0      0      0      0
[root@oss ~]# collectl -sD
waiting for 1 second sample...

# DISK STATISTICS (/sec)
#          <---------reads---------><---------writes---------><--------averages--------> Pct
#Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0      16      1    3    5       5     1    13     10    3
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
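
Most disk samples on an idle system are all zeros, so filtering for the busy ones makes the detail output easier to scan. The sketch below assumes the 14-column layout shown above, with Util as the last column. The function name busy_disk_samples and its threshold argument are hypothetical.

```shell
#!/bin/sh
# Print only the disk samples whose Util value is at or above a threshold.
busy_disk_samples() {
    awk -v min_util="${1:-1}" '
        /^#/ || NF != 14 { next }          # skip headers and non-data lines
        $14 + 0 >= min_util + 0 {
            # $1 = device, $6 = KB written, $12 = Wait, $14 = Util
            printf "%s writeKB=%s wait=%s util=%s%%\n", $1, $6, $12, $14
        }
    '
}

# Two lines copied from the sample output above
busy_disk_samples 1 <<'EOF'
sda              0      0    0    0       0      0    0    0       0     0     0      0    0
sda              0      0    0    0      16      1    3    5       5     1    13     10    3
EOF
# prints: sda writeKB=16 wait=13 util=3%
```

With live data this becomes, for example, collectl -sD -c 60 | busy_disk_samples 5.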

7. Use collectl like iotop, sorting processes by I/O :

[root@oss ~]# collectl --top iokb

Sample output :

# TOP PROCESSES sorted by iokb (counters are /sec) 13:35:14
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
    1  root     20     0    0 S   18M    1M  0  0.00  0.00   0  00:01.13    0    0    0    0 /sbin/init
    2  root     20     0    0 S     0     0  0  0.00  0.00   0  00:00.02    0    0    0    0 kthreadd
    3  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.04    0    0    0    0 migration/0
    4  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.09    0    0    0    0 ksoftirqd/0
    5  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 migration/0
    6  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.30    0    0    0    0 watchdog/0
    7  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.38    0    0    0    0 migration/1
    8  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 migration/1
    9  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.22    0    0    0    0 ksoftirqd/1
   10  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.28    0    0    0    0 watchdog/1
   11  root     20     2    0 S     0     0  0  0.00  0.00   0  00:08.15    0    0    0    0 events/0
   12  root     20     2    0 S     0     0  1  0.00  0.00   0  01:21.61    0    0    0    0 events/1
   13  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 cgroup
   14  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 khelper
   15  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 netns
   16  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 async/mgr
   17  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 pm
   18  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.84    0    0    0    0 sync_supers
   19  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.82    0    0    0    0 bdi-default
   20  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 kintegrityd/0
   21  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 kintegrityd/1
   22  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.15    0    0    0    0 kblockd/0

Display only the top 10 processes :

[root@oss ~]# collectl --top iokb,10

Sample output :

# TOP PROCESSES sorted by iokb (counters are /sec) 13:42:37
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
    1  root     20     0    0 S   18M    1M  0  0.00  0.00   0  00:01.13    0    0    0    0 /sbin/init
    2  root     20     0    0 S     0     0  0  0.00  0.00   0  00:00.02    0    0    0    0 kthreadd
    3  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.04    0    0    0    0 migration/0
    4  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.09    0    0    0    0 ksoftirqd/0
    5  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 migration/0
    6  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.30    0    0    0    0 watchdog/0
    7  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.39    0    0    0    0 migration/1
    8  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 migration/1
    9  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.22    0    0    0    0 ksoftirqd/1
   10  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.28    0    0    0    0 watchdog/1

List the fields by which the above output can be sorted :

[root@oss ~]# collectl --showtopopts
The following is a list of --top's sort types which apply to either
process or slab data.  In some cases you may be allowed to sort
by a field that is not part of the display if you so desire

TOP PROCESS SORT FIELDS

Memory
  vsz    virtual memory
  rss    resident (physical) memory

Time
  syst   system time
  usrt   user time
  time   total time
  accum  accumulated time

I/O
  rkb    KB read
  wkb    KB written
  iokb   total I/O KB

  rkbc   KB read from pagecache
  wkbc   KB written to pagecache
  iokbc  total pagecacge I/O
  ioall  total I/O KB (iokb+iokbc)

  rsys   read system calls
  wsys   write system calls
  iosys  total system calls

  iocncl Cancelled write bytes

Page Faults
  majf   major page faults
  minf   minor page faults
  flt    total page faults

Context Switches
  vctx   volunary context switches
  nctx   non-voluntary context switches

Miscellaneous (best when used with --procfilt)
  cpu    cpu number
  pid    process pid
  thread total process threads (not counting main)

TOP SLAB SORT FIELDS

  numobj    total number of slab objects
  actobj    active slab objects
  objsize   sizes of slab objects
  numslab   number of slabs
  objslab   number of objects in a slab
  totsize   total memory sizes taken by slabs
  totchg    change in memory sizes
  totpct    percent change in memory sizes
  name      slab names
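
Any name from the TOP PROCESS SORT FIELDS list can take the place of iokb in the --top iokb,10 form used earlier. As a hedged sketch, the following sorts by resident memory and shows five processes. It assumes the timeout utility from GNU coreutils is available to stop the display after a few seconds; the variable names are illustrative.

```shell
#!/bin/sh
SORT_FIELD="rss"    # any name from the TOP PROCESS SORT FIELDS list
TOP_N=5             # number of processes to display
CMD="collectl --top ${SORT_FIELD},${TOP_N}"

echo "Running: $CMD"
if command -v collectl >/dev/null 2>&1 && command -v timeout >/dev/null 2>&1; then
    # --top keeps refreshing until interrupted, so stop it after 5 seconds
    timeout 5 $CMD || true
else
    echo "collectl or timeout not found; skipping" >&2
fi
```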

8. Use collectl like top :

[root@oss ~]# collectl --top

Sample output :

# TOP PROCESSES sorted by time (counters are /sec) 13:45:00
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
 3266  root     20  2488    0 R  160M   21M  1  0.00  0.05   5  00:00.70    0    0    0   83 /usr/bin/perl
    1  root     20     0    0 S   18M    1M  0  0.00  0.00   0  00:01.13    0    0    0    0 /sbin/init
    2  root     20     0    0 S     0     0  0  0.00  0.00   0  00:00.02    0    0    0    0 kthreadd
    3  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.04    0    0    0    0 migration/0
    4  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.09    0    0    0    0 ksoftirqd/0
    5  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 migration/0
    6  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.31    0    0    0    0 watchdog/0
    7  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.39    0    0    0    0 migration/1
    8  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 migration/1
    9  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.22    0    0    0    0 ksoftirqd/1
   10  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.28    0    0    0    0 watchdog/1
   11  root     20     2    0 S     0     0  0  0.00  0.00   0  00:08.18    0    0    0    0 events/0
   12  root     20     2    0 S     0     0  1  0.00  0.00   0  01:21.97    0    0    0    0 events/1
   13  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 cgroup
   14  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 khelper
   15  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 netns
   16  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 async/mgr
   17  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 pm
   18  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.84    0    0    0    0 sync_supers
   19  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.83    0    0    0    0 bdi-default
   20  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 kintegrityd/0
   21  root     20     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 kintegrityd/1

More advanced options can be found in the official collectl documentation.
