This repository has been archived by the owner on Jan 15, 2019. It is now read-only.

[dev.icinga.com #2909] run host and service checks through a worker #1042

Closed
icinga-migration opened this issue Jul 27, 2012 · 4 comments

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/2909

Created by mfriedrich on 2012-07-27 12:33:47 +00:00

Assignee: mfriedrich
Status: Closed (closed on 2012-08-22 18:19:27 +00:00)
Target Version: exp
Last Update: 2012-08-22 18:19:27 +00:00 (in Redmine)


In order to run host and service checks through workers, we need to do the following:

  • start workers on icinga core startup
  • init checkresult struct
  • save the check info
  • pass checkresult, processed_command, macros to worker queue within wproc_run_check
  • free command buffer and volatile macros

This will remove the following:

  • embedded perl
  • fork-fork when executing a check with popen or execvp
  • the service check sighandler (timeouts are handled within the workers)
  • the host check sighandler, except for on-demand host checks, where it is kept
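
With timeouts handled inside the workers, a worker needs some way to give up on a hung plugin without a signal handler in the core. One possible approach is sketched below, assuming a simple poll-and-kill loop. The function name is hypothetical, this is not how libicinga necessarily does it, and returning 3 (UNKNOWN in the plugin convention) on timeout is an arbitrary choice for the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: run `command` through /bin/sh and kill it after
 * `timeout_sec` seconds. Returns the plugin's exit code, or 3 on timeout
 * or error. *timed_out is set to 1 only if the timeout fired. */
int sketch_run_with_timeout(const char *command, int timeout_sec, int *timed_out)
{
    struct timespec start, now;
    struct timespec nap = { 0, 10 * 1000 * 1000 };   /* poll every 10 ms */
    int status = 0;
    pid_t pid;

    *timed_out = 0;
    pid = fork();
    if (pid < 0)
        return 3;
    if (pid == 0) {
        setpgid(0, 0);   /* own process group, so grandchildren can be killed too */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);      /* exec failed */
    }

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (;;) {
        double elapsed;
        pid_t done = waitpid(pid, &status, WNOHANG);

        if (done == pid)
            break;       /* plugin finished on its own */
        if (done < 0)
            return 3;

        clock_gettime(CLOCK_MONOTONIC, &now);
        elapsed = (double)(now.tv_sec - start.tv_sec)
                + (double)(now.tv_nsec - start.tv_nsec) / 1e9;
        if (elapsed >= (double)timeout_sec) {
            kill(-pid, SIGKILL);   /* whole process group, if it was created */
            kill(pid, SIGKILL);    /* and the shell itself, in any case */
            waitpid(pid, &status, 0);
            *timed_out = 1;
            return 3;
        }
        nanosleep(&nap, NULL);
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : 3;
}
```

Polling keeps the example free of signal handlers; an alarm()-based or event-loop-based variant would work just as well.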

Overall, offloading the checks to the workers as jobs should generate a huge performance improvement. The workers are rather small in memory, which allows faster forks, so check execution time should be reduced as well.

This is based on the already implemented core workers, using the worker library from libicinga and its underlying architecture, originally implemented by Andreas Ericsson. It stays experimental on the icinga dev/* branches until further performance measurements and debugging have been done.

Fascinating observation: with a large process (100+ MB) we can achieve
1121 fork()'s per second. With a small process as parent, we can
achieve 13891 fork()'s per second, best of five runs on my laptop.

Even accounting for the message passing, which has a constant overhead
of about 50 nanoseconds (two memcpy() and some function call overhead),
this is a clearly superior solution, even if we don't account for the
fact that we're now capable of utilizing parallelization as well, so
we can multiply the number of fork()'s by the number of online CPUs
minus 1 instead of having it fixed at 1.
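
The "online CPUs minus 1" figure can be queried at runtime. A minimal sketch, assuming `sysconf(_SC_NPROCESSORS_ONLN)` is available (it is on Linux/glibc and the BSDs, but it is not part of POSIX); the function name is made up for illustration.

```c
#include <unistd.h>

/* Hypothetical helper: one worker per online CPU, minus one CPU left for
 * the core itself, but never fewer than one worker. */
long sketch_default_worker_count(void)
{
    long online = sysconf(_SC_NPROCESSORS_ONLN);

    if (online < 2)      /* single CPU, or sysconf() failed and returned -1 */
        return 1;
    return online - 1;
}
```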

Naturally, performance will decline when the fork()'ed programs do
something sensible instead of just exiting, but since the same program
was used to measure fork() performance with large and small processes
we can still expect a huge improvement.
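
A measurement of this kind can be reproduced with a few lines of C. The following is only a sketch of how such a benchmark might look, not the program used for the numbers above: the function name and the "ballast" approach (a touched heap allocation that makes the parent large) are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fork `iterations` children that exit immediately,
 * while the parent holds `ballast_bytes` of touched heap memory.
 * Returns forks per second, or -1.0 on error. */
double sketch_forks_per_second(size_t ballast_bytes, int iterations)
{
    char *ballast = NULL;
    struct timespec start, end;
    double elapsed;
    int i;

    if (iterations <= 0)
        return -1.0;

    if (ballast_bytes > 0) {
        ballast = malloc(ballast_bytes);
        if (ballast == NULL)
            return -1.0;
        /* Touch every page so the memory is really mapped in the parent. */
        memset(ballast, 1, ballast_bytes);
    }

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            free(ballast);
            return -1.0;
        }
        if (pid == 0)
            _exit(0);              /* child: exit right away */
        waitpid(pid, NULL, 0);     /* parent: reap before the next fork */
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    free(ballast);
    elapsed = (double)(end.tv_sec - start.tv_sec)
            + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return elapsed > 0.0 ? (double)iterations / elapsed : -1.0;
}
```

Comparing sketch_forks_per_second(0, 1000) with sketch_forks_per_second(100u * 1024 * 1024, 1000) should show the same kind of gap, although the absolute numbers depend on the machine and kernel.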

Changesets

2012-07-27 13:29:20 +00:00 by mfriedrich 053c9c1986d759b1e4591e41f13c8e39aa59afa3

core: run host and service checks through a worker, drop embedded perl #2909

in order to run host and service checks through workers, we need to
achieve the following:

- start workers on icinga core startup
- init checkresult struct
- save the check info
- pass checkresult, processed_command, macros to worker queue within
  wproc_run_check
- free command buffer and volatile macros

this will remove the following
- embedded perl
- fork-fork when executing a check with popen or execvp
- service check sighandler (timeouts are handled within the workers)
- host check sighandler is kept for on demand host checks only

overall, offloading the checks to the workers as jobs should generate a
huge performance improvement, especially since the workers are rather
small in memory, allowing faster forks, meaning to say check execution
time will be reduced as well.

based on the implemented core workers, using the worker library from
libicinga, as well as the underlying architecture. originally
implemented by Andreas Ericsson, this stays experimental on icinga dev/*
branches until further performance measurements as well as debugging.

refs #2909

2012-07-27 18:46:28 +00:00 by mfriedrich 00f9934a52a098953161fc6ea2b6954db20d6edc

core: log as info how many workers have spawned

refs #2904
refs #2909


@icinga-migration

Updated by mfriedrich on 2012-07-27 13:27:14 +00:00

TODOs

  • make the number of workers configurable via config option
  • check how on-demand hostcheck sighandlers on alarm timeouts can handle the checkresult directly

@icinga-migration

Updated by mfriedrich on 2012-07-27 16:47:11 +00:00

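Incredibly fast: in the debug log below, only about 2.3 ms pass between "Checking service" (1343407513.485596) and "Handling check result" (1343407513.487878).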

[1343407513.485496] [024.1] [pid=11974] Run a few checks before executing a service check for 'StartupDelay'.
[1343407513.485531] [008.0] [pid=11974] ** Service Check Event ==> Host: 'localhost', Service: 'StartupDelay', Options: 0, Latency: -59.515000 sec
[1343407513.485545] [016.0] [pid=11974] Attempting to run scheduled check of service 'StartupDelay' on host 'localhost': check options=0, latency=-59.515000
[1343407513.485596] [016.0] [pid=11974] Checking service 'StartupDelay' on host 'localhost'...
[1343407513.487878] [016.0] [pid=11974] ** Handling check result for service 'StartupDelay' on host 'localhost'...
[1343407513.487884] [016.1] [pid=11974] HOST: localhost, SERVICE: StartupDelay, CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 0, OUTPUT: OK: Icinga started with $((1343407040-1343407039)) seconds delay | delay=$((1343407040-1343407039))

@icinga-migration

Updated by mfriedrich on 2012-08-07 16:50:12 +00:00

  • Target Version set to exp

@icinga-migration

Updated by mfriedrich on 2012-08-22 18:19:27 +00:00

  • Status changed from Assigned to Closed
