Performance Monitoring of Red Hat Satellite 6 using satperf

Continuous time-series metric collection for Satellite and all Capsules is essential when running Satellite at scale.
This post describes how to configure and monitor these metrics using satellite-performance (satperf).
1) Tools:
  • Collectd – daemon to collect system performance statistics
    • Collects CPU, memory, disk, network, and per-process stats (matched by regex), plus PostgreSQL, MongoDB, turbostat, qpid, Foreman, Dynflow, Passenger, Puppet, Tomcat, and collectd itself (see the minimal config sketch after this list)
  • Graphite/Carbon
    • Carbon receives metrics and flushes them to Whisper database files
    • Graphite is the web-app frontend to Carbon
  • Grafana – visualizes metrics from multiple backends
    • Dashboards are saved as JSON and customized by Ansible during deployment
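
A minimal collectd configuration sketch showing how such metrics can be shipped to Graphite. The file name, Graphite host, metric prefix, and the process-match example are illustrative assumptions, not satperf's actual generated configuration:

# /etc/collectd.d/satellite.conf (illustrative)
LoadPlugin cpu
LoadPlugin memory
LoadPlugin disk
LoadPlugin processes
LoadPlugin write_graphite

<Plugin processes>
  # per-process stats by regex, e.g. the Dynflow executor
  ProcessMatch "dynflow" "dynflow_executor"
</Plugin>

<Plugin write_graphite>
  <Node "graphite">
    # assumed Graphite/Carbon host and metric prefix
    Host "graphite.example.com"
    Port "2003"
    Protocol "tcp"
    Prefix "satellite."
  </Node>
</Plugin>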

2) Architecture

(diagram: monitoring architecture)

3) How do I configure performance monitoring?

Archit has written a nice blog post on the configuration.

Description of metrics collected in satperf:
http://arcolife.github.io/blog/2016/10/05/monitoring-in-satperf-metrics-collection

4) Example Graphs
4.1) Passenger memory
4.2) PostgreSQL DB (Candlepin & Foreman)
4.3) Candlepin DB
4.4) Puppet registrations
4.5) Dynflow memory
Thanks to Archit and Jhutar for providing inputs and help!

Red Hat Satellite 6.2 Considerations for Large-Scale Deployments

Red Hat Satellite is a complete systems management product that allows system administrators to manage the full life cycle of Red Hat deployments across physical, virtual, and private clouds. Red Hat Satellite delivers system provisioning, configuration management, software management, and subscription management, all while maintaining high scalability and security. Satellite 6.2 is the third major release of the next-generation Satellite, with a raft of improvements that continue to narrow the gaps in functionality found in Satellite 5 in many critical areas of the product. This blog provides basic guidelines and considerations for tuning Red Hat Satellite 6.2 and Capsule servers for large-scale deployments.

1) Increase open-files-limit for Apache with systemd on the Satellite and Capsule servers

# cat /etc/systemd/system/httpd.service.d/limits.conf
[Service]
LimitNOFILE=1000000

# systemctl daemon-reload
# katello-service restart
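
An optional quick check that the new limit took effect after the restart (the same commands, with the service name swapped, work for the qpidd and qdrouterd drop-ins below):

# systemctl show httpd.service -p LimitNOFILE
# grep 'Max open files' /proc/$(pgrep -o httpd)/limits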

2) Increase open-files-limit for Qpid with systemd on the Satellite and Capsule servers

# cat /etc/systemd/system/qpidd.service.d/limits.conf
[Service]
LimitNOFILE=1000000

# systemctl daemon-reload
# katello-service restart

3) Increase PostgreSQL shared_buffers

When registering content hosts at scale to the Satellite server, shared_buffers needs to be set appropriately in postgresql.conf. Recommended: 256 MB.
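
A sketch of the change; the data directory path below is the default for the bundled PostgreSQL on RHEL 7 and may differ on your system:

# grep shared_buffers /var/lib/pgsql/data/postgresql.conf
shared_buffers = 256MB

# systemctl restart postgresql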

4) Increase PostgreSQL max_connections

When registering content hosts at scale, it is recommended to increase the max_connections setting (100 by default) according to your needs and hardware profile. For example, you might need to set it to 200 when registering 200 content hosts in parallel.
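
For example, to allow 200 parallel registrations (same assumption about the postgresql.conf location as above):

# grep max_connections /var/lib/pgsql/data/postgresql.conf
max_connections = 200

# systemctl restart postgresql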

5) Storage planning for Qpid

When you use katello-agent extensively, plan storage capacity for /var/lib/qpidd in advance. Currently, in Satellite 6.2, /var/lib/qpidd requires about 2 MB of disk space per content host; for example, 10,000 content hosts need roughly 20 GB.

6) Increase open-files-limit for Qpid Dispatch Router with systemd on the Satellite and Capsule servers

# cat /etc/systemd/system/qdrouterd.service.d/limits.conf
[Service]
LimitNOFILE=1000000

# systemctl daemon-reload
# katello-service restart

Special thanks to Jan Jutar and Archit Sarma for their help in getting the scale numbers.

External Snapshot of raw images

When an external snapshot of a raw image is taken, the delta (all subsequent writes) is written to a qcow2 overlay file, while the original image becomes the read-only backing file.

virsh # list
Id    Name                           State
----------------------------------------------------
4     cbtool                         running
6     master                         running

virsh # snapshot-create-as master snap1-master "snap1" --diskspec vda,file=/home/snap1.qcow2 --disk-only --atomic

Domain snapshot snap1-master created

Snapshot tree:

virsh # snapshot-list master --tree
snap1-master

virsh # snapshot-create-as master snap2-master "snap2" --diskspec vda,file=/home/snap2.qcow2 --disk-only --atomic
Domain snapshot snap2-master created
virsh # snapshot-list master --tree
snap1-master
|
+- snap2-master

Image info:

qemu-img info  /home/snap2.qcow2
image: /home/snap2.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 196K
cluster_size: 65536
backing file: /home/snap1.qcow2
backing file format: qcow2
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
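
To print the whole backing chain (active overlay down to the raw base) in one command, reasonably recent qemu-img versions accept a --backing-chain flag:

qemu-img info --backing-chain /home/snap2.qcow2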

How to Delete:

virsh # snapshot-list master
Name                 Creation Time             State
------------------------------------------------------------
snap2-master         2016-01-07 03:38:10 -0500 disk-snapshot

virsh # snapshot-delete master snap2-master --metadata
Domain snapshot snap2-master deleted

Note: with --metadata, only libvirt's record of the snapshot is removed; the qcow2 overlay file (/home/snap2.qcow2 in this example) remains on disk.

Starting MongoDB on CentOS with NUMA disabled

I have been noticing this every time I run MongoDB on CentOS/RHEL or any other NUMA machine.

Error:
Sun Dec 20 06:26:16.832 [initandlisten] ** WARNING: You are running on a NUMA machine.
Sun Dec 20 06:26:16.832 [initandlisten] ** We suggest launching mongod like this to avoid performance problems:
Sun Dec 20 06:26:16.832 [initandlisten] ** numactl --interleave=all mongod [other options]
Sun Dec 20 06:26:16.832 [initandlisten]

To avoid performance issues, it is recommended to run MongoDB with memory interleaved across all NUMA nodes.

I did not find a cleaner way to fix this through the packaged service, so kill the existing mongod and restart it as below.

numactl --interleave=all runuser -s /bin/bash mongodb -c "/usr/bin/mongod --dbpath /var/lib/mongodb"
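
To make this survive restarts, a systemd drop-in can wrap mongod with numactl. This is only a sketch; the unit name, binary path, and config path are assumptions that depend on how MongoDB was installed (older sysv-init installs would need the init script edited instead):

# cat /etc/systemd/system/mongod.service.d/numa.conf
[Service]
ExecStart=
ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod --config /etc/mongod.conf

# systemctl daemon-reload
# systemctl restart mongod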

Number of I/O requests per io_submit

Asynchronous I/O in QEMU (aio=native) submits requests with the io_submit syscall. To find out how many I/O requests are batched into each io_submit call from a KVM VM, trace the io_submit perf events while the benchmark runs; the example here is a sequential 4k write run on an SSD. The sys_enter_io_submit and sys_exit_io_submit events are the ones that matter; each io_submit call is also followed by *_io_getevents calls, which are irrelevant to the present topic. While the benchmark is measuring IOPS, trace the events on the host, e.g. by attaching perf record to the qemu-kvm process (the PID placeholder is yours to fill in):

# perf record -g -p <qemu-kvm PID> -e syscalls:sys_enter_io_submit -e syscalls:sys_exit_io_submit -e syscalls:sys_enter_io_getevents -e syscalls:sys_exit_io_getevents

Get the IOPS for this run:

write-4KiB IOPS: 76971.9

Get the number of io_submit calls from the captured perf.data. Counting the enter_io_submit events is enough; of course there will be an equal number of exit events.

[root@perf io-submit-write-4k]# perf script | grep io_submit | grep enter | wc -l
493370
[root@perf io-submit-write-4k]#
Get the timestamps of the first and last io_submit events.
First: 
qemu-kvm  3693 [025]  1914.589390: syscalls:sys_enter_io_submit: ctx_id: 0x7f3f18a61000, nr: 0x000000d1, iocbpp:
                     697 io_submit (/usr/lib64/libaio.so.1.0.1)
                       8 [unknown] ([unknown])
                       0 [unknown] ([unknown])

Last: 

qemu-kvm  3693 [001]  1949.737723: syscalls:sys_enter_io_submit: ctx_id: 0x7f3f18a61000, nr: 0x000000d1, iocbpp: 0x7ffd4e50b7b0
                 697 io_submit (/usr/lib64/libaio.so.1.0.1)
        7f3f1c6c9250 [unknown] ([unknown])
                   0 [unknown] ([unknown])

Number of submits per second:
Timestamp diff: 1949.737723 - 1914.589390 = 35.15 s
Number of submits: 493370

Submits/sec = 493370 / 35.15 = 14036.13

The IOPS metric is requests per second, and the above gives submits per second, so the number of requests per submit is:
requests/submit = (requests/sec) / (submits/sec)
                = 76971.9 / 14036.13
                = 5.48

So for my 4k write run, there are about 5.48 requests per io_submit call.
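
For quick verification, the same arithmetic through bc, using the numbers measured above:

# echo "scale=2; 493370 / 35.15" | bc          # ~14036 io_submit calls per second
# echo "scale=2; 76971.9 / 14036.13" | bc      # ~5.48 requests per io_submit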

iostat analysis: time spent on each I/O request

These are results from one of the 4K write runs to disk vdb (an LVM volume on an SSD, which is irrelevant to the present discussion).

await – "The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them." (from the iostat man page)

Wait_Time-vdb-write=0.042727

Throughput:

Throughput-vdb-write=65.902727

svctm – "The average service time (in milliseconds) for I/O requests that were issued to the device."

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
vdb               0.00     0.00 49409.00    0.00   193.00     0.00     8.00     7.04    0.14    0.14    0.00   0.02 100.00

Disk Utilization

Utilization-vdb=74.195455

Frame size: 4K

0.0427 ms = 42.7 microseconds, i.e. an average of 42.7 microseconds per request (await).

Throughput is 65 MB/s, so with 4 KiB requests: 65 * 1024 / 4

= 16640 requests/s

1000000 microseconds/s / 16640 requests/s ≈ 60 microseconds per request

60 microseconds/request * ~0.75 disk utilization (the device was busy only about 75% of that time) ≈ 45 microseconds/request

So the time spent on each I/O request is about 45 microseconds.
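
The same calculation as a quick shell sketch, with the measured numbers from above:

# echo "scale=2; 65 * 1024 / 4" | bc           # 16640 requests/s at 65 MB/s with 4 KiB requests
# echo "scale=2; 1000000 / 16640" | bc         # ~60 microseconds of wall-clock time per request
# echo "scale=2; 60.09 * 0.75" | bc            # ~45 microseconds of device time per request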

Thanks to Stefan