Event broker options nagios.
Statusengine consists of two parts: an event broker module that provides the event data, and a PHP application that processes it. The broker module is loaded into the Naemon core and dumps configuration and event data, for example state changes, service check results or notification data, into the Gearman queueing engine. All event data is encoded as JSON objects to keep application development simple. The PHP application is based on the CakePHP framework and saves the data from the broker module into a MySQL database. As you probably know, there is another well-known broker solution for connecting Naemon with a MySQL database, so why do I use Statusengine? Let me show you some advantages of Statusengine: thanks to the Legacy mode, all software that works with the other broker solution will work with Statusengine out of the box! Statusengine provides full UTF-8 support and has no problems with double-byte characters. It is written in PHP, so you can easily develop patches or whatever else you need. All of the code is open and you can contribute to the project on GitHub. Statusengine is built for large environments and does not require strange kernel parameters. Only the broker module is written in C; little C code means fewer segfaults. You can uninstall or upgrade your MySQL server without interrupting your monitoring. Every language that can talk to Gearman is able to handle the events, and Statusengine can process your performance data as well. Statusengine is more than just a simple Naemon-to-MySQL gateway; that is only one way to use this solution. The broker module fetches everything you need from the Naemon core as a JSON string, so you can start developing your own client that does whatever you want with this data. Statusengine is developed for Naemon and should also work with Nagios 4; older Nagios releases are not supported. If you want to migrate from Nagios to Naemon, see the official migration guide.
The Statusengine event broker module works as follows: every event inside Naemon triggers the callback functions inside Statusengine, and the events are pushed to the Gearman job server. On the other side, the StatusengineLegacyShell reads the data from the job server and processes it with several worker processes. Every worker handles one or more queues and writes the data to the database. This part is written in PHP so that many people can easily modify it if needed. Do not worry, PHP is fast enough for this job. Thanks to the Gearman job server you can uninstall, upgrade, destroy or do whatever else you want with your MySQL server and your monitoring will keep working, as long as the Gearman job server is running! If you are interested in Naemon's event data but do not want to save it in a database, you can simply develop your own worker: the Gearman library supports many different programming languages and the communication is based on standard protocols, with all data encoded as JSON. This part may be interesting for you if you want to run the workers on a different operating system than the broker. Looking for other Ubuntu or Debian versions? Go to supported operating systems.
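As a rough illustration of the broker side described above (the module path below is a placeholder, not taken from this article), loading the Statusengine broker module into Naemon usually comes down to a broker directive in naemon.cfg; event_broker_options=-1 tells the core to hand every event category to loaded broker modules, which Statusengine then forwards as JSON jobs to the Gearman job server:
broker_module=/opt/statusengine/bin/statusengine.o
event_broker_options=-1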
These are the packages you need to install; try to use your distribution's packages whenever possible. The Gearman packages are not in your repository? No problem, install them by hand: download the gearman PHP extension from http: ... To compile Statusengine from source, check out the repository at https: ... Basically you can install Statusengine on any Linux operating system; read the advanced installation guide for more information. See also the supported operating systems for Statusengine 1 and for Statusengine 2.
Statusengine's "Legacy mode" behaves exactly like the common NDO broker. With "Legacy mode", Statusengine is compatible with tools like NagVis, openITCOCKPIT and other NDO-based applications. Please keep an eye on your storage system if you increase the number of workers! By default, the Statusengine event broker module writes all event data to the Gearman job server. To subscribe to one or more broker options, just add them to your monitoring configuration. This is the list of all available broker options (for example, disable the passive check option if you do not need those events). I also increased this value and restarted the Gearman job server: edit the gearman-job-server init script (below the "END INIT INFO" block), add a higher "ulimit -n" value, and restart the service. Statusengine requires that the MySQL server and the Gearman job server are running before it is started, so make sure the start order is configured correctly.
This guide demonstrates how you can install Statusengine with PHP7 and was tested on Ubuntu. Some PHP extensions are missing from the package manager and need to be compiled and installed manually; because of this, Statusengine may not run as stable as on PHP5! Let's start with the installation of the basic PHP7 and Statusengine dependencies: apt-get install mysql-server gearman-job-server libgearman-dev gearman-tools uuid-dev php-gearman php-cli php-dev libjson-c-dev manpages-dev build-essential libglib2. If you also want to use the Statusengine web interface, you need to install Apache2: apt-get install apache2 libapache2-mod-php. Install the PHP-Gearman extension for PHP7: apt-get install git libgearman-dev, then git clone https: ... Update your database schema and start Statusengine with the service start command.
If you have never installed Naemon manually, you may run into some problems; this short section shows how to install Naemon 1. The Monitoring Plugins are the basic plugins that you should install on your system. They have many requirements if you want to compile every available plugin; in this case a few basic plugins are enough. If you want to compile all plugins, check the requirements. Alternatively, you can install the Monitoring Plugins using your package manager.
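The download and repository URLs mentioned earlier in this section are truncated in the original text. As a generic sketch only, a PECL-style PHP extension such as gearman is typically built by hand like this (the clone URL is a placeholder):
# placeholder URL - use the real php-gearman source location
git clone https://example.org/php-gearman.git
cd php-gearman
phpize
./configure
make
make install
# then enable the extension, e.g. on Ubuntu with PHP7:
echo "extension=gearman.so" > /etc/php/7.0/mods-available/gearman.ini
phpenmod gearman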
Nagios, NDOUtils and the Nagios logo are trademarks, servicemarks, registered trademarks or registered servicemarks owned by Nagios Enterprises, LLC. All other trademarks, servicemarks, registered trademarks and registered servicemarks are the property of their respective owners. Other product or company names mentioned here may be trademarks or trade names of their respective owners.
Documentation: What is Statusengine? How does Statusengine work? Every event type is handled by its own queue. For example, if you want to save all monitored IP addresses to a text file, Statusengine will save your day, as in this broker example. What is "Legacy mode"? What happens if an error occurs in Statusengine?
The available configuration options: this one updates the servicechecks table and processes the performance data information. If disabled, the 'statehistory' table for hosts and services will not receive any new records. If disabled, no external commands will be saved to the database. If disabled, acknowledgements will no longer be saved to the database. The flappinghistory table will no longer be updated. Downtime information will no longer be saved to the database. Notification information will no longer be saved to the database. Information about the currently running Naemon process, updated every n seconds. The method that was used to send the notification command. If disabled, the objects table will no longer be updated! You can use this as a replacement for the classic OCHP command. You can use this as a replacement for the classic OCSP command. How to set the limit; how to update Statusengine to a new version? Fork me on GitHub.
Event_broker_options nagios
nagios refuses to start with event_broker_options = -1.
I am French, so my English is not very clear, but I hope you can understand me.
About monitoring.
Introduction.
Since the release of Nagios version 4, a major add-on update has been pending. Recently, check_mk released the innovation version check_mk-1.2.5i2, including MK Livestatus with Nagios Core 4 compatibility. We therefore need check_mk-1.2.5i2 or higher to run check_mk with Nagios Core 4.
Here at Around Monitoring we are going to build and test a monitoring system with probably the most interesting and powerful utilities for Nagios. We need to compile Nagios 4 because there are no packages in the usual distros yet, and because we prefer the latest versions for testing. If you are looking to install the same software on CentOS 7, see Nagios Core 4 + PNP4Nagios + Check_mk + Nagvis on CentOS 7 - Redhat 7.
Software components used in this test:
Check_mk 1.2.5i2p1 (with cmk LiveStatus in the same version)
Update (06/09/2018): the recent check_mk "free" versions are called Check_MK Raw Edition (CRE). These versions bundle check_mk and additional software (a package formerly called OMD). For the check_mk source package mentioned in this article, you should download the CRE version, unpack it and locate the check_mk tar.gz package under packages/check_mk/
Install Linux and general prerequisites.
CentOS is installed with minimal options and updated from the repositories.
We need to install the development tools to compile Nagios, the nagios-plugins, check_mk…
Nagios 4 Core.
Prerequisites.
Install Apache, PHP and some required libraries and tools.
Create the nagios user and groups.
We will create the nagios account and an additional group "nagcmd" that we need for Nagios external commands. The nagios user and the Apache user must both be in the "nagcmd" group.
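A minimal sketch of those accounts, assuming Apache runs as the apache user:
useradd nagios
groupadd nagcmd
usermod -a -G nagcmd nagios
usermod -a -G nagcmd apache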
Compile and install Nagios Core 4.
We need to configure the Nagios software with (important) our nagcmd group as the command group, and then compile:
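Assuming the unpacked nagios-4.0.5 source tree, the configure and compile step is roughly:
cd nagios-4.0.5
./configure --with-command-group=nagcmd
make all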
*** Configuration summary for nagios 4.0.5 04-11-2017 ***:
Nagios executable: nagios.
Nagios user/group: nagios, nagios.
Command user/group: nagios, nagcmd.
Check result directory: ${prefix}/var/spool/checkresults.
Init directory: /etc/rc.d/init.d.
Apache conf.d directory: /etc/httpd/conf.d.
Mail program: /bin/mail.
IOBroker method: epoll.
Web Interface Options:
HTML URL: localhost/nagios/
CGI URL: localhost/nagios/cgi-bin/
Traceroute (used by WAP):
The next steps are to compile and install. The tools tell us about the next steps and other important information for each command executed (the corresponding make targets are sketched after the comments below).
# This installs the main program, CGIs, and HTML files.
# This installs the init script in /etc/rc.d/init.d.
# This installs and configures permissions on the directory for holding the external command file.
# This installs *SAMPLE* config files in /usr/local/nagios/etc.
# This installs the Apache config file for the Nagios web interface.
# This installs the Exfoliation theme for the Nagios web interface.
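The comments above match, one by one, the usual Nagios Core 4 install targets; assuming the same source directory, the sequence is roughly:
make install
make install-init
make install-commandmode
make install-config
make install-webconf
make install-exfoliation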
In the nagios source directory we should copy some files that may be needed in the future.
Check the nagios configuration (with the included sample files):
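The usual verification command for the sample configuration is:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg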
Create the password file.
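For example, to create the classic nagiosadmin web user (the user name is the quickstart default; pick your own):
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin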
Configure the Apache and Nagios services to run at init and start both services.
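On CentOS 6 with SysV init this comes down to roughly:
chkconfig --add nagios
chkconfig nagios on
chkconfig httpd on
service httpd start
service nagios start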
Go to the nagios URI, server/nagios/, and everything should be fine. (Remember to allow the http port in your local firewall with iptables.)
We can see that the localhost sample checks fail. That is expected: there is no directory with the Nagios Plugins yet.
Nagios Plugins.
We have several options to install the Nagios Plugins. We can install ready-made plugin binaries from the EPEL repositories (equivalent to the Monitoring Plugins), or we can download and compile either the Monitoring Plugins or the classic Nagios Plugins from Nagios Enterprises (still very similar to the Monitoring Plugins).
We are going to compile and install the Nagios Enterprises plugins. The choice is yours.
Prerequisites.
Only if we are going to use some plugins to check mysql (local), samba (local)… do we need to install a few more packages to compile those specific plugins successfully.
Compile and install the Nagios plugins.
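A sketch for the nagios-plugins tarball (the version number is only illustrative):
cd nagios-plugins-2.0.3
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install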
Now the localhost sample checks should be OK in our Nagios. The plugins are in the /usr/local/nagios/libexec/ directory with correct permissions and owners (some plugins need different rights and/or owners).
PNP4Nagios.
Install from EPEL or compile? At this point, with Nagios Core and the Nagios Plugins compiled and installed, I prefer to compile PNP4Nagios as well; it really is the best option. Installing from EPEL would try to install Nagios Core 3 too, and we want Nagios Core 4.
Prerequisites.
Download, unpack, configure, compile and install.
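For the pnp4nagios-0.6.21 tarball shown in the summary below, that is roughly:
cd pnp4nagios-0.6.21
./configure
make all
make fullinstall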
It is important to save the output of the final command so we can find the software directories later if we need them.
*** Configuration summary for pnp4nagios-0.6.21 03-24-2018 ***
Nagios user/group: nagios nagios.
Install directory: /usr/local/pnp4nagios.
HTML Dir: /usr/local/pnp4nagios/share.
Config Dir: /usr/local/pnp4nagios/etc.
Location of rrdtool binary: /usr/bin/rrdtool Version 1.3.8.
RRDs Perl Modules: FOUND (Version 1.3008)
RRD files stored in: /usr/local/pnp4nagios/var/perfdata.
process_perfdata.pl Logfile: /usr/local/pnp4nagios/var/perfdata.log.
Perfdata files (NPCD) stored in: /usr/local/pnp4nagios/var/spool.
Web Interface Options:
HTML URL: localhost/pnp4nagios.
Apache Config File: /etc/httpd/conf.d/pnp4nagios.conf.
Configure the npcd service to run at startup and start it, then reload Apache so it reads the pnp4nagios configuration file.
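On CentOS 6 that could look like:
chkconfig --add npcd
chkconfig npcd on
service npcd start
service httpd reload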
Go to the pnp4nagios URI, server/pnp4nagios/, and check that all tests are marked green. If everything is OK, rename the file /usr/local/pnp4nagios/share/install.php.
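One way to take it out of the way while keeping it around (the .off suffix is just a convention; any rename works):
mv /usr/local/pnp4nagios/share/install.php /usr/local/pnp4nagios/share/install.php.off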
Reloading the PNP4Nagios page, we will find an error:
Please check the documentation for information about the following error. perfdata directory "/usr/local/pnp4nagios/var/perfdata/" is empty. Please check your Nagios config. Read the FAQ online.
At this point this error is normal; we still need to configure the Nagios integration!
Configure PNP4Nagios.
PNP4Nagios offers several ways to configure the Nagios integration. We will use the option called "Bulk Mode". It is the best one available for Nagios 4 (Bulk Mode with npcdmod is simpler, but it is not valid for Nagios 4 because it uses a broker module that is not compatible with the Nagios 4 event broker).
There are sample configuration files in /usr/local/pnp4nagios/etc ready to copy and paste.
From the nagios.cfg-sample file, copy ONLY the next text into the nagios config file /usr/local/nagios/etc/nagios.cfg (a sketch of the full block follows after the comments).
# Bulk / NPCD mode.
# *** the template definition differs from the one in the original nagios.cfg
# *** the template definition differs from the one in the original nagios.cfg
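The directives themselves are not reproduced above. As a reference only, the Bulk/NPCD block shipped in the pnp4nagios nagios.cfg-sample looks roughly like this (check your own sample file, since the templates can differ between versions):
process_performance_data=1
# service performance data
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file
# host performance data
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file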
From the misccommands.cfg-sample file, copy ONLY the next text into the nagios commands file /usr/local/nagios/etc/objects/commands.cfg, and uncomment the lines! (The complete command definitions are sketched below.)
# Bulk with NPCD mode.
command_line /bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata.$TIMET$
command_line /bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata.$TIMET$
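In commands.cfg the two command_line lines above belong inside command definitions; once uncommented, the blocks from misccommands.cfg-sample look roughly like this:
define command {
       command_name    process-service-perfdata-file
       command_line    /bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata.$TIMET$
}
define command {
       command_name    process-host-perfdata-file
       command_line    /bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata.$TIMET$
}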
Restart the npcd and nagios daemons now.
We can now test the pnp4nagios site; it should be working.
Configure links to the graphs.
Last but not least, we must configure access to the graphs from the Nagios pages.
Edit /usr/local/nagios/etc/objects/templates.cfg and add the next text (the full template definitions are sketched below):
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=_HOST_
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=$SERVICEDESC$
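In templates.cfg the two action_url lines normally live inside small non-registering templates. The names host-pnp and srv-pnp are the pnp4nagios documentation defaults, so adapt them if yours differ:
define host {
       name            host-pnp
       action_url      /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=_HOST_
       register        0
}
define service {
       name            srv-pnp
       action_url      /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=$SERVICEDESC$
       register        0
}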
Modify the host and service objects that need it so they inherit from the pnp4nagios-specific templates created above (via the 'use' property):
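For example, a host and a service that should get graph links simply inherit the extra template next to the one they already use (object names here come from the default Nagios sample configuration):
define host {
       use             linux-server,host-pnp
       host_name       localhost
       address         127.0.0.1
}
define service {
       use             local-service,srv-pnp
       host_name       localhost
       service_description  PING
       check_command   check_ping!100.0,20%!500.0,60%
}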
ATTENTION: copy the next file from the PNP4Nagios source directory to the Nagios directory. It is needed to show the graph POPUPS on the icons.
Restart nagios and npcd (the pnp4nagios daemon):
We can see a nice POPUP when the mouse hovers over the graph icon, and we can click on it to go to the PNP4Nagios site. Nice and practical :-)
Check_mk.
Prerequisites.
The Check_mk GUI needs an Apache python module that is currently not included in the distro repositories. It can be installed from the EPEL repositories.
The Check_mk installation is quite easy. It can be even easier if you copy the next text into a file called ".check_mk_setup.conf" in your home directory (the first character of the file name is ".").
# Written by setup of check_mk 1.2.5i2p1 on Wed Apr 16 20:10:20 CEST 2017.
You can install check_mk without this file and answer all the questions about file locations, or use this file and change the default entries during the installation. Check_mk creates and uses this file so it knows the values for the next installation or update.
When the check_mk setup finishes, the installer shows… "Installation completed successfully. Please restart Nagios and Apache in order to update/activate the check_mk web pages."
Configuration: add the next lines at the end of the Nagios configuration file "nagios.cfg" (shown in full just below).
# Load Livestatus Module.
# added by setup.sh of check_mk.
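The two comments are followed in nagios.cfg by the actual directives. Judging by the module path and socket quoted in the log lines further down, on this setup they are:
broker_module=/usr/lib/check_mk/livestatus.o /usr/local/nagios/var/rw/live
event_broker_options=-1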
We can test check_mk at the URI server/check_mk.
If you hit any problem with check_mk livestatus connecting to Nagios, the first thing to look at is whether livestatus loads correctly in nagios.log:
Example of localhost in check_mk…
NagVis.
Prerequisites.
Installation.
The main installation questions are about paths and users, but the Nagvis installer discovers all the important values quite well. We only have to select the "mklivestatus" backend.
| Welcome to the NagVis Installer 1.8b3 |
| This script was created to make the installation and upgrade of NagVis |
| as easy as possible for you. The installer has been tested on the following systems: |
| - SuSE Linux Enterprise Server 10 and 11 |
| Similar distributions to the ones mentioned above should work as well. |
| This (hopefully) includes RedHat, Fedora, CentOS, OpenSuSE |
| If you experience any problems using these or other distributions, please |
| report that to the NagVis team. |
| Do you want to proceed? [y]: y.
| Starting installation of NagVis 1.8b3 |
| OS: CentOS release 6.5 (Final) |
| Using found package manager /bin/rpm |
| Please enter the path to the nagios base directory [/usr/local/nagios]:
| nagios path /usr/local/nagios found |
| Please enter the path to the NagVis base [/usr/local/nagvis]:
| PHP Module: gd php found |
| PHP Module: mbstring php found |
| PHP Module: gettext compiled_in found |
| PHP Module: session compiled_in found |
| PHP Module: xml compiled_in found |
| PHP Module: pdo php found |
| Checking Backends. (Available: mklivestatus, ndo2db, ido2db, merlinmy) |
| Do you want to use the mklivestatus backend? [y]: y.
| Do you want to use the ndo2db backend? [n]:
| Do you want to use the ido2db backend? [n]:
| Do you want to use the merlinmy backend? [n]:
| Livestatus socket (/usr/local/nagios/var/rw/live) found |
| PHP Module: sockets compiled_in found |
| Please enter the web path to NagVis [/nagvis]:
| Please enter the name of the web-server user [apache]:
| Please enter the name of the web-server group [apache]:
| create Apache config file [y]:
| NagVis home will be: /usr/local/nagvis |
| Owner of NagVis files will be: apache |
| Group of NagVis files will be: apache |
| Path to the Apache config dir is: /etc/httpd/conf.d |
| Apache config will be created: yes |
| Installation mode: install |
| Do you really want to continue? [y]:
| You may safely remove this source directory. |
| For later update/upgrade you may use this command to have a faster update: |
| ./install.sh -n /usr/local/nagios -p /usr/local/nagvis -l "unix:/usr/local/nagios/var/rw/live" -b mklivestatus -u apache -g apache -w /etc/httpd/conf.d -a y.
| - Maybe you want to edit the main configuration file? |
| Its location is: /usr/local/nagvis/etc/nagvis.ini.php |
Test your new Nagvis site: server/nagvis/ (admin/admin)
Minimal configuration and test.
Nagvis includes many samples, really a lot of samples. That is good, but… there is so much clutter when starting out with nagvis.
The first step is to configure the default backend to the check_mk livestatus (live_1). Menu Options / Manage Backends / Default Backend → live_1 (save).
Go to an existing map (e.g. "Demo1. Datacenter Hamburg").
Menu Edit Map / Map Options / "backend_id" listbox and select live_1 (save). Menu Edit Map / Add Icon / Host, place the crosshair wherever you want on the map, select your localhost as the hostname in the listbox (save).
Now we can create a new map from our own images and start placing objects on it. Great!
Maybe it is best to delete all the demo maps first.
Example of localhost, hovering the mouse over the host object in a Nagvis MAP:
23 thoughts on "Nagios 4 (core) + Check_mk + pnp4Nagios + Nagvis"
I followed all the steps and I am still having problems with Livestatus.
Livestatus problem: Cannot connect to 'unix:/var/log/nagios/rw/live': [Errno 13] Permission denied.
[1404464586] Event broker module '/usr/lib/check_mk/livestatus.o' initialized successfully.
[1404464586] Warning: failure_prediction_enabled is obsolete and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/check_mk.d/check_mk_templates.cfg', starting at line 88)
[1404464586] Warning: failure_prediction_enabled is obsolete and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/check_mk.d/check_mk_templates.cfg', starting at line 157)
[1404464586] Successfully launched command file worker with pid 35237.
[1404464586] TIMEPERIOD TRANSITION: 24X7;-1;1.
[1404464586] TIMEPERIOD TRANSITION: 24x7;-1;1.
[1404464586] TIMEPERIOD TRANSITION: 24x7_sans_holidays;-1;0.
[1404464586] TIMEPERIOD TRANSITION: none;-1;0.
[1404464586] TIMEPERIOD TRANSITION: us-holidays;-1;0.
[1404464586] TIMEPERIOD TRANSITION: workhours;-1;0.
The Nagios broker seems to be working OK. There is a problem with the "live" file or with the rights on the directory of the "live" file.
Your livestatus socket file is at: /var/log/nagios/rw/live. The CMK install default is not there…
Great article! It worked perfectly.
Sam - in case this helps, I got similar errors after a reboot:
check_mk: Cannot connect to 'unix:/usr/local/nagios/var/rw/live': [Errno 13] Permission denied.
nagvis: Error (Dwoo_Exception): The compile directory must be writable, chmod "/usr/local/nagvis/var/tmpl/compile/" to make it writable.
The problem was that I had disabled SELinux while installing the applications, but only temporarily. I had to update /etc/selinux/config to disable it permanently.
Server or Configuration Problem.
A server problem occurred. You will find details in the Apache error log. One possible reason is that the file /usr/local/nagios/etc/htpasswd.users is missing. You can create this file with htpasswd or htpasswd2. A better solution might be to use the existing htpasswd file from your Nagios installation. Edit /etc/httpd/conf.d/check_mk and change the path there. Restart Apache afterwards.
I am getting the following error after adding this to the nagios.cfg file.
# Load Livestatus Module.
# tail /usr/local/nagios/var/nagios.log [1421900034] Event broker module '/usr/lib/check_mk/livestatus.o' deinitialized successfully.
[1421900034] Error: Failed to load module '/usr/lib/check_mk/livestatus.o'.
[1421900034] livestatus: Livestatus 1.2.6b6 by Mathias Kettner. Socket: '/usr/local/nagios/var/rw/live'
[1421900034] livestatus: Please visit us at mathias-kettner.de/
[1421900034] livestatus: Hint: please try out OMD - the Open Monitoring Distribution.
[1421900034] livestatus: Please visit OMD at omdistro.
[1421900034] livestatus: removed old left over socket file /usr/local/nagios/var/rw/live.
[1421900034] livestatus: finished initialization. Further log messages go to /usr/local/nagios/var/livestatus.log.
[1421900034] Event broker module '/usr/lib/check_mk/livestatus.o' initialized successfully.
[1421900034] Error: Module loading failed. Aborting.
Hi Ashik. That is strange: the module loads but then it is not loaded…
I would look for potential permission problems on the files involved. Did you disable SELINUX? Try disabling SELINUX first.
Hi guys, after the PNP4nagios installation, when trying to connect to nagiosserver/pnp4nagios I got this error msg:
Fatal error: Call to undefined function simplexml_load_file() in /usr/local/pnp4nagios/share/application/models/data.php on line 270.
Do you have any idea how to solve this? Thanks in advance.
Check whether there is an uncommented extension line in your php.ini file:
Instructions for Ubuntu?
I set up Nagios, Check_MK, PNP4Nagios and Nagvis as per your documentation, but I am unable to configure the map using "Demo1. Datacenter Hamburg".
Also, I do not get the "Options" menu button at the top. I checked Nagvis in three different browsers, but all of them show the same result.
Please let me know if I am missing something.
That is because you are logged in as the "nagios" user; you need to log in as the "admin" user.
I am getting an error while compiling mk-livestatus using the "make -j 4" command.
TableLog.cc: In member function 'void TableLog::updateLogfileIndex()':
TableLog.cc:250:21: error: expected primary-expression before 'struct'
int len = offsetof(struct dirent, d_name)
TableLog.cc:250:36: error: 'd_name' was not declared in this scope
int len = offsetof(struct dirent, d_name)
TableLog.cc:250:42: error: 'offsetof' was not declared in this scope
int len = offsetof(struct dirent, d_name)
make[2]: *** [livestatus_so-TableLog.o] Error 1.
make[2]: *** Waiting for unfinished jobs….
mv -f .deps/livestatus_so-OffsetStringMacroColumn.Tpo .deps/livestatus_so-OffsetStringMacroColumn.Po
mv -f .deps/livestatus_so-TableContactgroups.Tpo .deps/livestatus_so-TableContactgroups.Po
mv -f .deps/livestatus_so-OffsetStringServiceMacroColumn.Tpo .deps/livestatus_so-OffsetStringServiceMacroColumn.Po
make[2]: Leaving directory `/home/Nagios_Setup/mk-livestatus-1.1.6p1/src'
make[1]: *** [all-recursive] Error 1.
make[1]: Leaving directory `/home/Nagios_Setup/mk-livestatus-1.1.6p1'
make: *** [all] Error 2.
mk-livestatus-1.1.6 is a very old version… Try a current version.
Hi, in which file do I need to add the text below?
In nagios.cfg there are parameters that include files and/or include dirs. Use either of them (see the example below).
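For example, the snippet can go into any object file already referenced from /usr/local/nagios/etc/nagios.cfg via lines such as:
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_dir=/usr/local/nagios/etc/objects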
One warning: I copied the nagios cfg settings for pnp4nagios from this website instead of from the original files. After that everything worked except the popup graphs. After carefully comparing the differences, I found out that the wrong quote character had been copied somehow! After changing the quotes to the standard quote on the keyboard, the popup graphs worked.
Main Configuration File Options.
When creating and/or editing configuration files, keep the following in mind:
Lines that start with a '#' character are taken to be comments and are not processed. Variable names must begin at the start of the line; no white space is allowed before the name; variable names are case-sensitive.
Tip: a sample main configuration file (/usr/local/nagios/etc/nagios.cfg) is installed for you when you follow the quickstart installation guide.
The main configuration file is usually named nagios.cfg and is located in the /usr/local/nagios/etc/ directory.
Below you will find descriptions of each main Nagios configuration file option.
This variable specifies where Nagios should create its main log file. This should be the first variable that you define in your configuration file, as Nagios will try to write errors that it finds in the rest of your configuration data to this file. If you have log rotation enabled, this file will automatically be rotated every hour, day, week, or month.
This directive is used to specify an object configuration file containing object definitions that Nagios should use for monitoring. Object configuration files contain definitions for hosts, host groups, contacts, contact groups, services, commands, etc. You can separate your configuration information into several files and specify multiple cfg_file= statements to have each of them processed.
This directive is used to specify a directory that contains object configuration files that Nagios should use for monitoring. All files in the directory with a .cfg extension are processed as object config files. Additionally, Nagios will recursively process all config files in subdirectories of the directory you specify here. You can separate your configuration files into different directories and specify multiple cfg_dir= statements to have all config files in each directory processed.
This directive is used to specify a file in which a cached copy of object definitions should be stored. The cache file is (re)created every time Nagios is (re)started and is used by the CGIs. It is intended to speed up config file caching in the CGIs and to allow you to edit the source object config files while Nagios is running without affecting the output displayed in the CGIs.
This directive is used to specify a file in which a pre-processed, pre-cached copy of object definitions should be stored. This file can be used to drastically improve startup times in large/complex Nagios installations. Read more information on how to speed up start times here.
This is used to specify an optional resource file that can contain $USERn$ macro definitions. $USERn$ macros are useful for storing usernames, passwords, and items commonly used in command definitions (like directory paths). The CGIs will not attempt to read resource files, so you can set restrictive permissions (600 or 660) on them to protect sensitive information. You can include multiple resource files by adding multiple resource_file statements to the main config file - Nagios will process them all. See the sample resource.cfg file in the sample-config/ subdirectory of the Nagios distribution for an example of how to define $USERn$ macros.
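For example, the sample resource.cfg defines $USER1$ as the plugin path, which command definitions can then reference (a minimal sketch):
# resource.cfg
$USER1$=/usr/local/nagios/libexec
# a command definition can then call plugins portably, e.g.:
# command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w 100.0,20% -c 500.0,60%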
This is a temporary file that Nagios periodically creates to use when updating comment data, status data, etc. The file is deleted when it is no longer needed.
This is a directory that Nagios can use as scratch space for creating temporary files used during the monitoring process. You should run tmpwatch, or a similar utility, on this directory occasionally to delete files older than 24 hours.
This is the file that Nagios uses to store the current status, comment, and downtime information. This file is used by the CGIs so that current monitoring status can be reported via a web interface. The CGIs must have read access to this file in order to function properly. This file is deleted every time Nagios stops and recreated when it starts.
This setting determines how often (in seconds) Nagios will update status data in the status file. The minimum update interval is 1 second.
This is used to set the effective user that the Nagios process should run as. After initial program startup and before starting to monitor anything, Nagios will drop its effective privileges and run as this user. You may specify either a username or a UID.
This is used to set the effective group that the Nagios process should run as. After initial program startup and before starting to monitor anything, Nagios will drop its effective privileges and run as this group. You may specify either a groupname or a GID.
This option determines whether or not Nagios will send out notifications when it initially (re)starts. If this option is disabled, Nagios will not send out notifications for any host or service. Note: If you have state retention enabled, Nagios will ignore this setting when it (re)starts and use the last known setting for this option (as stored in the state retention file), unless you disable the use_retained_program_state option. If you want to change this option when state retention is active (and the use_retained_program_state is enabled), you'll have to use the appropriate external command or change it via the web interface. Values are as follows:
0 = Disable notifications 1 = Enable notifications (default)
This option determines whether or not Nagios will execute service checks when it initially (re)starts. If this option is disabled, Nagios will not actively execute any service checks and will remain in a sort of "sleep" mode (it can still accept passive checks unless you've disabled them). This option is most often used when configuring backup monitoring servers, as described in the documentation on redundancy, or when setting up a distributed monitoring environment. Note: If you have state retention enabled, Nagios will ignore this setting when it (re)starts and use the last known setting for this option (as stored in the state retention file), unless you disable the use_retained_program_state option. If you want to change this option when state retention is active (and the use_retained_program_state is enabled), you'll have to use the appropriate external command or change it via the web interface. Values are as follows:
0 = Don't execute service checks 1 = Execute service checks (default)
This option determines whether or not Nagios will accept passive service checks when it initially (re)starts. If this option is disabled, Nagios will not accept any passive service checks. Note: If you have state retention enabled, Nagios will ignore this setting when it (re)starts and use the last known setting for this option (as stored in the state retention file), unless you disable the use_retained_program_state option. If you want to change this option when state retention is active (and the use_retained_program_state is enabled), you'll have to use the appropriate external command or change it via the web interface. Values are as follows:
0 = Don't accept passive service checks 1 = Accept passive service checks (default)
This option determines whether or not Nagios will execute on-demand and regularly scheduled host checks when it initially (re)starts. If this option is disabled, Nagios will not actively execute any host checks, although it can still accept passive host checks unless you've disabled them). This option is most often used when configuring backup monitoring servers, as described in the documentation on redundancy, or when setting up a distributed monitoring environment. Note: If you have state retention enabled, Nagios will ignore this setting when it (re)starts and use the last known setting for this option (as stored in the state retention file), unless you disable the use_retained_program_state option. If you want to change this option when state retention is active (and the use_retained_program_state is enabled), you'll have to use the appropriate external command or change it via the web interface. Values are as follows:
0 = Don't execute host checks 1 = Execute host checks (default)
This option determines whether or not Nagios will accept passive host checks when it initially (re)starts. If this option is disabled, Nagios will not accept any passive host checks. Note: If you have state retention enabled, Nagios will ignore this setting when it (re)starts and use the last known setting for this option (as stored in the state retention file), unless you disable the use_retained_program_state option. If you want to change this option when state retention is active (and the use_retained_program_state is enabled), you'll have to use the appropriate external command or change it via the web interface. Values are as follows:
0 = Don't accept passive host checks 1 = Accept passive host checks (default)
This option determines whether or not Nagios will run event handlers when it initially (re)starts. If this option is disabled, Nagios will not run any host or service event handlers. Note: If you have state retention enabled, Nagios will ignore this setting when it (re)starts and use the last known setting for this option (as stored in the state retention file), unless you disable the use_retained_program_state option. If you want to change this option when state retention is active (and the use_retained_program_state is enabled), you'll have to use the appropriate external command or change it via the web interface. Values are as follows:
0 = Disable event handlers 1 = Enable event handlers (default)
This is the rotation method that you would like Nagios to use for your log file. Values are as follows:
n = None (don't rotate the log - this is the default)
h = Hourly (rotate the log at the top of each hour)
d = Daily (rotate the log at midnight each day)
w = Weekly (rotate the log at midnight on Saturday)
m = Monthly (rotate the log at midnight on the last day of the month)
This is the directory where Nagios should place log files that have been rotated. This option is ignored if you choose to not use the log rotation functionality.
This option determines whether or not Nagios will check the command file for commands that should be executed. This option must be enabled if you plan on using the command CGI to issue commands via the web interface. More information on external commands can be found here.
0 = Don't check external commands 1 = Check external commands (default)
If you specify a number with an "s" appended to it (i.e. 30s), this is the number of seconds to wait between external command checks. If you leave off the "s", this is the number of "time units" to wait between external command checks. Unless you've changed the interval_length value (as defined below) from the default value of 60, this number will mean minutes.
Note: By setting this value to -1 , Nagios will check for external commands as often as possible. Each time Nagios checks for external commands it will read and process all commands present in the command file before continuing on with its other duties. More information on external commands can be found here.
This is the file that Nagios will check for external commands to process. The command CGI writes commands to this file. The external command file is implemented as a named pipe (FIFO), which is created when Nagios starts and removed when it shuts down. If the file exists when Nagios starts, the Nagios process will terminate with an error message. More information on external commands can be found here.
Note: This is an advanced feature. This option determines how many buffer slots Nagios will reserve for caching external commands that have been read from the external command file by a worker thread, but have not yet been processed by the main thread of the Nagios daemon. Each slot can hold one external command, so this option essentially determines how many commands can be buffered. For installations where you process a large number of passive checks (e.g. distributed setups), you may need to increase this number. You should consider using MRTG to graph Nagios' usage of external command buffers. You can read more on how to configure graphing here.
This option determines whether Nagios will automatically check to see if new updates (releases) are available. It is recommended that you enable this option to ensure that you stay on top of the latest critical patches to Nagios. Nagios is critical to you - make sure you keep it in good shape. Nagios will check once a day for new updates. Data collected by Nagios Enterprises from the update check is processed in accordance with our privacy policy - see api.nagios for details.
This option determines what data Nagios will send to api.nagios when it checks for updates. By default, Nagios will send information on the current version of Nagios you have installed, as well as an indicator as to whether this was a new installation or not. Nagios Enterprises uses this data to determine the number of users running specific versions of Nagios. Enable this option if you do not wish for this information to be sent.
This option specifies the location of the lock file that Nagios should create when it runs as a daemon (when started with the -d command line argument). This file contains the process id (PID) number of the running Nagios process.
This option determines whether or not Nagios will retain state information for hosts and services between program restarts. If you enable this option, you should supply a value for the state_retention_file variable. When enabled, Nagios will save all state information for hosts and service before it shuts down (or restarts) and will read in previously saved state information when it starts up again.
0 = Don't retain state information 1 = Retain state information (default)
This is the file that Nagios will use for storing status, downtime, and comment information before it shuts down. When Nagios is restarted it will use the information stored in this file for setting the initial states of services and hosts before it starts monitoring anything. In order to make Nagios retain state information between program restarts, you must enable the retain_state_information option.
This setting determines how often (in minutes) that Nagios will automatically save retention data during normal operation. If you set this value to 0, Nagios will not save retention data at regular intervals, but it will still save retention data before shutting down or restarting. If you have disabled state retention (with the retain_state_information option), this option has no effect.
This setting determines whether or not Nagios will set various program-wide state variables based on the values saved in the retention file. Some of these program-wide state variables that are normally saved across program restarts if state retention is enabled include the enable_notifications, enable_flap_detection, enable_event_handlers, execute_service_checks, and accept_passive_service_checks options. If you do not have state retention enabled, this option has no effect.
0 = Don't use retained program state 1 = Use retained program state (default)
This setting determines whether or not Nagios will retain scheduling info (next check times) for hosts and services when it restarts. If you are adding a large number (or percentage) of hosts and services, I would recommend disabling this option when you first restart Nagios, as it can adversely skew the spread of initial checks. Otherwise you will probably want to leave it enabled.
0 = Don't use retained scheduling info 1 = Use retained scheduling info (default)
WARNING: This is an advanced feature. You'll need to read the Nagios source code to use this option effectively.
These options determine which host or service attributes are NOT retained across program restarts. The values for these options are a bitwise AND of values specified by the "MODATTR_" definitions in the include/common.h source code file. By default, all host and service attributes are retained.
WARNING: This is an advanced feature. You'll need to read the Nagios source code to use this option effectively.
These options determine which process attributes are NOT retained across program restarts. There are two masks because there are often separate host and service process attributes that can be changed. For example, host checks can be disabled at the program level, while service checks are still enabled. The values for these options are a bitwise AND of values specified by the "MODATTR_" definitions in the include/common.h source code file. By default, all process attributes are retained.
WARNING: This is an advanced feature. You'll need to read the Nagios source code to use this option effectively.
These options determine which contact attributes are NOT retained across program restarts. There are two masks because there are often separate host and service contact attributes that can be changed. The values for these options are a bitwise AND of values specified by the "MODATTR_" definitions in the include/common.h source code file. By default, all process attributes are retained.
This variable determines whether messages are logged to the syslog facility on your local host. Values are as follows:
0 = Don't use syslog facility 1 = Use syslog facility.
This variable determines whether or not notification messages are logged. If you have a lot of contacts or regular service failures your log file will grow relatively quickly. Use this option to keep contact notifications from being logged.
0 = Don't log notifications 1 = Log notifications.
This variable determines whether or not service check retries are logged. Service check retries occur when a service check results in a non-OK state, but you have configured Nagios to retry the service more than once before responding to the error. Services in this situation are considered to be in "soft" states. Logging service check retries is mostly useful when attempting to debug Nagios or test out service event handlers.
0 = Don't log service check retries 1 = Log service check retries.
This variable determines whether or not host check retries are logged. Logging host check retries is mostly useful when attempting to debug Nagios or test out host event handlers.
0 = Don't log host check retries 1 = Log host check retries.
This variable determines whether or not service and host event handlers are logged. Event handlers are optional commands that can be run whenever a service or hosts changes state. Logging event handlers is most useful when debugging Nagios or first trying out your event handler scripts.
0 = Don't log event handlers 1 = Log event handlers.
This variable determines whether or not Nagios will force all initial host and service states to be logged, even if they result in an OK state. Initial service and host states are normally only logged when there is a problem on the first check. Enabling this option is useful if you are using an application that scans the log file to determine long-term state statistics for services and hosts.
0 = Don't log initial states (default) 1 = Log initial states.
This variable determines whether or not Nagios will log external commands that it receives from the external command file. Note: This option does not control whether or not passive service checks (which are a type of external command) get logged. To enable or disable logging of passive checks, use the log_passive_checks option.
0 = Don't log external commands 1 = Log external commands (default)
This variable determines whether or not Nagios will log passive host and service checks that it receives from the external command file. If you are setting up a distributed monitoring environment or plan on handling a large number of passive checks on a regular basis, you may wish to disable this option so your log file doesn't get too large.
0 = Don't log passive checks 1 = Log passive checks (default)
This option allows you to specify a host event handler command that is to be run for every host state change. The global event handler is executed immediately prior to the event handler that you have optionally specified in each host definition. The command argument is the short name of a command that you define in your object configuration file. The maximum amount of time that this command can run is controlled by the event_handler_timeout option. More information on event handlers can be found here.
This option allows you to specify a service event handler command that is to be run for every service state change. The global event handler is executed immediately prior to the event handler that you have optionally specified in each service definition. The command argument is the short name of a command that you define in your object configuration file. The maximum amount of time that this command can run is controlled by the event_handler_timeout option. More information on event handlers can be found here.
This is the number of seconds that Nagios will sleep before checking to see if the next service or host check in the scheduling queue should be executed. Note that Nagios will only sleep after it "catches up" with queued service checks that have fallen behind.
This option allows you to control how service checks are initially "spread out" in the event queue. Using a "smart" delay calculation (the default) will cause Nagios to calculate an average check interval and spread initial checks of all services out over that interval, thereby helping to eliminate CPU load spikes. Using no delay is generally not recommended, as it will cause all service checks to be scheduled for execution at the same time. This means that you will generally have large CPU spikes when the services are all executed in parallel. More information on how to estimate how the inter-check delay affects service check scheduling can be found here. Values are as follows:
n = Don't use any delay - schedule all service checks to run immediately (i.e. at the same time!)
d = Use a "dumb" delay of 1 second between service checks
s = Use a "smart" delay calculation to spread service checks out evenly (default)
x.xx = Use a user-supplied inter-check delay of x.xx seconds.
This option determines the maximum number of minutes from when Nagios starts that all services (that are scheduled to be regularly checked) are checked. This option will automatically adjust the service inter-check delay method (if necessary) to ensure that the initial checks of all services occur within the timeframe you specify. In general, this option will not have an effect on service check scheduling if scheduling information is being retained using the use_retained_scheduling_info option. Default value is 30 (minutes).
This variable determines how service checks are interleaved. Interleaving allows for a more even distribution of service checks, reduced load on remote hosts, and faster overall detection of host problems. Setting this value to 1 is equivalent to not interleaving the service checks (this is how versions of Nagios previous to 0.0.5 worked). Set this value to s (smart) for automatic calculation of the interleave factor unless you have a specific reason to change it. The best way to understand how interleaving works is to watch the status CGI (detailed view) when Nagios is just starting. You should see that the service check results are spread out as they begin to appear. More information on how interleaving works can be found here. x = A number greater than or equal to 1 that specifies the interleave factor to use. An interleave factor of 1 is equivalent to not interleaving the service checks. s = Use a "smart" interleave factor calculation (default)
This option allows you to specify the maximum number of service checks that can be run in parallel at any given time. Specifying a value of 1 for this variable essentially prevents any service checks from being run in parallel. Specifying a value of 0 (the default) does not place any restrictions on the number of concurrent checks. You'll have to modify this value based on the system resources you have available on the machine that runs Nagios, as it directly affects the maximum load that will be imposed on the system (processor utilization, memory, etc.). More information on how to estimate how many concurrent checks you should allow can be found here.
This option allows you to control the frequency in seconds of check result "reaper" events. "Reaper" events process the results from host and service checks that have finished executing. These events constitute the core of the monitoring logic in Nagios.
This option allows you to control the maximum amount of time in seconds that host and service check result "reaper" events are allowed to run. "Reaper" events process the results from host and service checks that have finished executing. If there are a lot of results to process, reaper events may take a long time to finish, which might delay timely execution of new host and service checks. This variable allows you to limit the amount of time that an individual reaper event will run before it hands control back over to Nagios for other portions of the monitoring logic.
This option determines which directory Nagios will use to temporarily store host and service check results before they are processed. This directory should not be used to store any other files, as Nagios will periodically clean this directory of old files (see the max_check_result_file_age option for more information).
Note: Make sure that only a single instance of Nagios has access to the check result path. If multiple instances of Nagios have their check result path set to the same directory, you will run into problems with check results being processed (incorrectly) by the wrong instance of Nagios!
This option determines the maximum age in seconds that Nagios will consider check result files found in the check_result_path directory to be valid. Check result files that are older than this threshold will be deleted by Nagios and the check results they contain will not be processed. By using a value of zero (0) with this option, Nagios will process all check result files - even if they're older than your hardware :-).
This option allows you to control how host checks that are scheduled to be checked on a regular basis are initially "spread out" in the event queue. Using a "smart" delay calculation (the default) will cause Nagios to calculate an average check interval and spread initial checks of all hosts out over that interval, thereby helping to eliminate CPU load spikes. Using no delay is generally not recommended. Using no delay will cause all host checks to be scheduled for execution at the same time. More information on how to estimate how the inter-check delay affects host check scheduling can be found here. Values are as follows:
n = Don't use any delay - schedule all host checks to run immediately (i.e. at the same time!)
d = Use a "dumb" delay of 1 second between host checks
s = Use a "smart" delay calculation to spread host checks out evenly (default)
x.xx = Use a user-supplied inter-check delay of x.xx seconds.
This option determines the maximum number of minutes from when Nagios starts that all hosts (that are scheduled to be regularly checked) are checked. This option will automatically adjust the host inter-check delay method (if necessary) to ensure that the initial checks of all hosts occur within the timeframe you specify. In general, this option will not have an effect on host check scheduling if scheduling information is being retained using the use_retained_scheduling_info option. Default value is 30 (minutes).
This is the number of seconds per "unit interval" used for timing in the scheduling queue, re-notifications, etc. "Unit intervals" are used in the object configuration file to determine how often to run a service check, how often to re-notify a contact, etc.
Important: The default value for this is set to 60, which means that a "unit value" of 1 in the object configuration file will mean 60 seconds (1 minute). I have not really tested other values for this variable, so proceed at your own risk if you decide to do so!
This option determines whether or not Nagios will attempt to automatically reschedule active host and service checks to "smooth" them out over time. This can help to balance the load on the monitoring server, as it will attempt to keep the time between consecutive checks consistent, at the expense of executing checks on a more rigid schedule.
WARNING: THIS IS AN EXPERIMENTAL FEATURE AND MAY BE REMOVED IN FUTURE VERSIONS. ENABLING THIS OPTION CAN DEGRADE PERFORMANCE - RATHER THAN INCREASE IT - IF USED IMPROPERLY!
This option determines how often (in seconds) Nagios will attempt to automatically reschedule checks. This option only has an effect if the auto_reschedule_checks option is enabled. Default is 30 seconds.
WARNING: THIS IS AN EXPERIMENTAL FEATURE AND MAY BE REMOVED IN FUTURE VERSIONS. ENABLING THE AUTO-RESCHEDULING OPTION CAN DEGRADE PERFORMANCE - RATHER THAN INCREASE IT - IF USED IMPROPERLY!
This option determines the "window" of time (in seconds) that Nagios will look at when automatically rescheduling checks. Only host and service checks that occur in the next X seconds (determined by this variable) will be rescheduled. This option only has an effect if the auto_reschedule_checks option is enabled. Default is 180 seconds (3 minutes).
WARNING: THIS IS AN EXPERIMENTAL FEATURE AND MAY BE REMOVED IN FUTURE VERSIONS. ENABLING THE AUTO-RESCHEDULING OPTION CAN DEGRADE PERFORMANCE - RATHER THAN INCREASE IT - IF USED IMPROPERLY!
Nagios tries to be smart about how and when it checks the status of hosts. In general, disabling this option will allow Nagios to make some smarter decisions and check hosts a bit faster. Enabling this option will increase the amount of time required to check hosts, but may improve reliability a bit. Unless you have problems with Nagios not recognizing that a host recovered, I would suggest not enabling this option.
0 = Don't use aggressive host checking (default) 1 = Use aggressive host checking.
This option determines whether or not Nagios will translate DOWN/UNREACHABLE passive host check results to their "correct" state from the viewpoint of the local Nagios instance. This can be very useful in distributed and failover monitoring installations. More information on passive check state translation can be found here.
0 = Disable check translation (default) 1 = Enable check translation.
This option determines whether or not Nagios will treat passive host checks as HARD states or SOFT states. By default, a passive host check result will put a host into a HARD state type. You can change this behavior by enabling this option.
0 = Passive host checks are HARD (default) 1 = Passive host checks are SOFT.
This option determines whether or not Nagios will execute predictive checks of hosts that are being depended upon (as defined in host dependencies) for a particular host when it changes state. Predictive checks help ensure that the dependency logic is as accurate as possible. More information on how predictive checks work can be found here.
0 = Disable predictive checks 1 = Enable predictive checks (default)
This option determines whether or not Nagios will execute predictive checks of services that are being depended upon (as defined in service dependencies) for a particular service when it changes state. Predictive checks help ensure that the dependency logic is as accurate as possible. More information on how predictive checks work can be found here.
0 = Disable predictive checks 1 = Enable predictive checks (default)
This option determines the maximum amount of time (in seconds) that the state of a previous host check is considered current. Cached host states (from host checks that were performed more recently than the time specified by this value) can improve host check performance immensely. Too high of a value for this option may result in (temporarily) inaccurate host states, while a low value may result in a performance hit for host checks. Use a value of 0 if you want to disable host check caching. More information on cached checks can be found here.
This option determines the maximum amount of time (in seconds) that the state of a previous service check is considered current. Cached service states (from service checks that were performed more recently than the time specified by this value) can improve service check performance when a lot of service dependencies are used. Too high of a value for this option may result in inaccuracies in the service dependency logic. Use a value of 0 if you want to disable service check caching. More information on cached checks can be found here.
This option determines whether or not the Nagios daemon will take several shortcuts to improve performance. These shortcuts result in the loss of a few features, but larger installations will likely see a lot of benefit from doing so. More information on what optimizations are taken when you enable this option can be found here.
This option determines whether or not Nagios will free memory in child processes when they are fork()ed off from the main process. By default, Nagios frees memory. However, if the use_large_installation_tweaks option is enabled, it will not. By defining this option in your configuration file, you are able to override things to get the behavior you want.
This option determines whether or not Nagios will fork() child processes twice when it executes host and service checks. By default, Nagios fork()s twice. However, if the use_large_installation_tweaks option is enabled, it will only fork() once. By defining this option in your configuration file, you are able to override things to get the behavior you want.
This option determines whether or not the Nagios daemon will make all standard macros available as environment variables to your check, notification, event handler, etc. commands. In large Nagios installations this can be problematic because it takes additional memory and (more importantly) CPU to compute the values of all macros and make them available to the environment.
0 = Don't make macros available as environment variables 1 = Make macros available as environment variables (default)
This option determines whether or not Nagios will try and detect hosts and services that are "flapping". Flapping occurs when a host or service changes between states too frequently, resulting in a barrage of notifications being sent out. When Nagios detects that a host or service is flapping, it will temporarily suppress notifications for that host/service until it stops flapping. Flap detection is very experimental at this point, so use this feature with caution! More information on how flap detection and handling works can be found here. Note: If you have state retention enabled, Nagios will ignore this setting when it (re)starts and use the last known setting for this option (as stored in the state retention file), unless you disable the use_retained_program_state option. If you want to change this option when state retention is active (and the use_retained_program_state is enabled), you'll have to use the appropriate external command or change it via the web interface.
0 = Don't enable flap detection (default) 1 = Enable flap detection.
This option is used to set the low threshold for detection of service flapping. For more information on how flap detection and handling works (and how this option affects things) read this.
This option is used to set the high threshold for detection of service flapping. For more information on how flap detection and handling works (and how this option affects things) read this.
This option is used to set the low threshold for detection of host flapping. For more information on how flap detection and handling works (and how this option affects things) read this.
This option is used to set the high threshold for detection of host flapping. For more information on how flap detection and handling works (and how this option affects things) read this.
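In nagios.cfg the four thresholds are typically set with directives along these lines (directive names as used by stock Nagios; the percentages are only an illustration):
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0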
This option determines whether or not Nagios will use soft state information when checking host and service dependencies. Normally Nagios will only use the latest hard host or service state when checking dependencies. If you want it to use the latest state (regardless of whether it's a soft or hard state type), enable this option.
0 = Don't use soft state dependencies (default) 1 = Use soft state dependencies.
This is the maximum number of seconds that Nagios will allow service checks to run. If checks exceed this limit, they are killed and a CRITICAL state is returned. A timeout error will also be logged.
There is often widespread confusion as to what this option really does. It is meant to be used as a last ditch mechanism to kill off plugins which are misbehaving and not exiting in a timely manner. It should be set to something high (like 60 seconds or more), so that each service check normally finishes executing within this time limit. If a service check runs longer than this limit, Nagios will kill it off thinking it is a runaway process.
This setting determines the state Nagios will report when a service check times out - that is, does not respond within service_check_timeout seconds. This can be useful if a machine is running at too high a load and you do not want to consider a failed service check to be critical (the default).
Valid settings are:
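For stock Nagios 4 these should be c (Critical, the default), u (Unknown), w (Warning) and o (OK), so a typical override in nagios.cfg would be something like:
service_check_timeout_state=u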
This is the maximum number of seconds that Nagios will allow host checks to run. If checks exceed this limit, they are killed and a CRITICAL state is returned and the host will be assumed to be DOWN. A timeout error will also be logged.
There is often widespread confusion as to what this option really does. It is meant to be used as a last ditch mechanism to kill off plugins which are misbehaving and not exiting in a timely manner. It should be set to something high (like 60 seconds or more), so that each host check normally finishes executing within this time limit. If a host check runs longer than this limit, Nagios will kill it off thinking it is a runaway process.
This is the maximum number of seconds that Nagios will allow event handlers to be run. If an event handler exceeds this time limit it will be killed and a warning will be logged.
There is often widespread confusion as to what this option really does. It is meant to be used as a last ditch mechanism to kill off commands which are misbehaving and not exiting in a timely manner. It should be set to something high (like 60 seconds or more), so that each event handler command normally finishes executing within this time limit. If an event handler runs longer than this limit, Nagios will kill it off thinking it is a runaway process.
This is the maximum number of seconds that Nagios will allow notification commands to be run. If a notification command exceeds this time limit it will be killed and a warning will be logged.
There is often widespread confusion as to what this option really does. It is meant to be used as a last ditch mechanism to kill off commands which are misbehaving and not exiting in a timely manner. It should be set to something high (like 60 seconds or more), so that each notification command finishes executing within this time limit. If a notification command runs longer than this limit, Nagios will kill it off thinking it is a runaway process.
This is the maximum number of seconds that Nagios will allow an obsessive compulsive service processor command to be run. If a command exceeds this time limit it will be killed and a warning will be logged.
This is the maximum number of seconds that Nagios will allow an obsessive compulsive host processor command to be run. If a command exceeds this time limit it will be killed and a warning will be logged.
This is the maximum number of seconds that Nagios will allow a host performance data processor command or service performance data processor command to be run. If a command exceeds this time limit it will be killed and a warning will be logged.
This value determines whether or not Nagios will "obsess" over service checks results and run the obsessive compulsive service processor command you define. I know - funny name, but it was all I could think of. This option is useful for performing distributed monitoring. If you're not doing distributed monitoring, don't enable this option.
0 = Don't obsess over services (default) 1 = Obsess over services.
This option allows you to specify a command to be run after every service check, which can be useful in distributed monitoring. This command is executed after any event handler or notification commands. The command argument is the short name of a command definition that you define in your object configuration file. The maximum amount of time that this command can run is controlled by the ocsp_timeout option. More information on distributed monitoring can be found here. This command is only executed if the obsess_over_services option is enabled globally and if the obsess_over_service directive in the service definition is enabled.
This value determines whether or not Nagios will "obsess" over host checks results and run the obsessive compulsive host processor command you define. I know - funny name, but it was all I could think of. This option is useful for performing distributed monitoring. If you're not doing distributed monitoring, don't enable this option.
0 = Don't obsess over hosts (default) 1 = Obsess over hosts.
This option allows you to specify a command to be run after every host check, which can be useful in distributed monitoring. This command is executed after any event handler or notification commands. The command argument is the short name of a command definition that you define in your object configuration file. The maximum amount of time that this command can run is controlled by the ochp_timeout option. More information on distributed monitoring can be found here. This command is only executed if the obsess_over_hosts option is enabled globally and if the obsess_over_host directive in the host definition is enabled.
This value determines whether or not Nagios will process host and service check performance data.
0 = Don't process performance data (default) 1 = Process performance data.
This option allows you to specify a command to be run after every host check to process host performance data that may be returned from the check. The command argument is the short name of a command definition that you define in your object configuration file. This command is only executed if the process_performance_data option is enabled globally and if the process_perf_data directive in the host definition is enabled.
This option allows you to specify a command to be run after every service check to process service performance data that may be returned from the check. The command argument is the short name of a command definition that you define in your object configuration file. This command is only executed if the process_performance_data option is enabled globally and if the process_perf_data directive in the service definition is enabled.
This option allows you to specify a file to which host performance data will be written after every host check. Data will be written to the performance file as specified by the host_perfdata_file_template option. Performance data is only written to this file if the process_performance_data option is enabled globally and if the process_perf_data directive in the host definition is enabled.
This option allows you to specify a file to which service performance data will be written after every service check. Data will be written to the performance file as specified by the service_perfdata_file_template option. Performance data is only written to this file if the process_performance_data option is enabled globally and if the process_perf_data directive in the service definition is enabled.
This option determines what (and how) data is written to the host performance data file. The template may contain macros, special characters (\t for tab, \r for carriage return, \n for newline) and plain text. A newline is automatically added after each write to the performance data file.
This option determines what (and how) data is written to the service performance data file. The template may contain macros, special characters (\t for tab, \r for carriage return, \n for newline) and plain text. A newline is automatically added after each write to the performance data file.
This option determines how the host performance data file is opened. Unless the file is a named pipe you'll probably want to use the default mode of append.
a = Open file in append mode (default) w = Open file in write mode p = Open in non-blocking read/write mode (useful when writing to pipes)
This option determines how the service performance data file is opened. Unless the file is a named pipe you'll probably want to use the default mode of append.
a = Open file in append mode (default) w = Open file in write mode p = Open in non-blocking read/write mode (useful when writing to pipes)
This option allows you to specify the interval (in seconds) at which the host performance data file is processed using the host performance data file processing command. A value of 0 indicates that the performance data file should not be processed at regular intervals.
This option allows you to specify the interval (in seconds) at which the service performance data file is processed using the service performance data file processing command. A value of 0 indicates that the performance data file should not be processed at regular intervals.
This option allows you to specify the command that should be executed to process the host performance data file. The command argument is the short name of a command definition that you define in your object configuration file. The interval at which this command is executed is determined by the host_perfdata_file_processing_interval directive.
This option allows you to specify the command that should be executed to process the service performance data file. The command argument is the short name of a command definition that you define in your object configuration file. The interval at which this command is executed is determined by the service_perfdata_file_processing_interval directive.
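Putting the performance data file directives together, a minimal service-side setup in nagios.cfg could look roughly like this (paths, template and command name are placeholders you would define yourself; the *_mode directive name is assumed from the same host_/service_perfdata_file_* naming pattern):
process_performance_data=1
service_perfdata_file=/usr/local/nagios/var/service-perfdata
service_perfdata_file_template=[SERVICEPERFDATA]\t$TIMET$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICEPERFDATA$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file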
This option allows you to enable or disable checks for orphaned service checks. Orphaned service checks are checks which have been executed and have been removed from the event queue, but have not had any results reported in a long time. Since no results have come back in for the service, it is not rescheduled in the event queue. This can cause service checks to stop being executed. Normally it is very rare for this to happen - it might happen if an external user or process killed off the process that was being used to execute a service check. If this option is enabled and Nagios finds that results for a particular service check have not come back, it will log an error message and reschedule the service check. If you start seeing service checks that never seem to get rescheduled, enable this option and see if you notice any log messages about orphaned services.
0 = Don't check for orphaned service checks 1 = Check for orphaned service checks (default)
This option allows you to enable or disable checks for orphaned host checks. Orphaned host checks are checks which have been executed and have been removed from the event queue, but have not had any results reported in a long time. Since no results have come back in for the host, it is not rescheduled in the event queue. This can cause host checks to stop being executed. Normally it is very rare for this to happen - it might happen if an external user or process killed off the process that was being used to execute a host check. If this option is enabled and Nagios finds that results for a particular host check have not come back, it will log an error message and reschedule the host check. If you start seeing host checks that never seem to get rescheduled, enable this option and see if you notice any log messages about orphaned hosts.
0 = Don't check for orphaned host checks 1 = Check for orphaned host checks (default)
This option determines whether or not Nagios will periodically check the "freshness" of service checks. Enabling this option is useful for helping to ensure that passive service checks are received in a timely manner. More information on freshness checking can be found here.
0 = Don't check service freshness 1 = Check service freshness (default)
This setting determines how often (in seconds) Nagios will periodically check the "freshness" of service check results. If you have disabled service freshness checking (with the check_service_freshness option), this option has no effect. More information on freshness checking can be found here.
This option determines whether or not Nagios will periodically check the "freshness" of host checks. Enabling this option is useful for helping to ensure that passive host checks are received in a timely manner. More information on freshness checking can be found here.
0 = Don't check host freshness 1 = Check host freshness (default)
This setting determines how often (in seconds) Nagios will periodically check the "freshness" of host check results. If you have disabled host freshness checking (with the check_host_freshness option), this option has no effect. More information on freshness checking can be found here.
This option determines the number of seconds Nagios will add to any host or service freshness threshold it automatically calculates (e.g. those not specified explicitly by the user). More information on freshness checking can be found here.
This setting determines whether or not the embedded Perl interpreter is enabled on a program-wide basis. Nagios must be compiled with support for embedded Perl for this option to have an effect. More information on the embedded Perl interpreter can be found here.
This setting determines whether or not the embedded Perl interpreter should be used for Perl plugins/scripts that do not explicitly enable/disable it. Nagios must be compiled with support for embedded Perl for this option to have an effect. More information on the embedded Perl interpreter and the effect of this setting can be found here.
This option allows you to specify what kind of date/time format Nagios should use in the web interface and date/time macros. Possible options (along with example output) include:
This option allows you to override the default timezone that this instance of Nagios runs in. Useful if you have multiple instances of Nagios that need to run from the same server, but have different local times associated with them. If not specified, Nagios will use the system configured timezone.
Note: If you use this option to specify a custom timezone, you will also need to alter the Apache configuration directives for the CGIs to specify the timezone you want. Example:
SetEnv TZ "US/Mountain"
This option allows you to specify illegal characters that cannot be used in host names, service descriptions, or names of other object types. Nagios will allow you to use most characters in object definitions, but I recommend not using the characters shown in the example above. Doing so may give you problems in the web interface, notification commands, etc.
This option allows you to specify illegal characters that should be stripped from macros before being used in notifications, event handlers, and other commands. This DOES NOT affect macros used in service or host check commands. You can choose to not strip out the characters shown in the example above, but I recommend you do not do this. Some of these characters are interpreted by the shell (i.e. the backtick) and can lead to security problems. The following macros are stripped of the characters you specify:
$HOSTOUTPUT$ , $HOSTPERFDATA$ , $HOSTACKAUTHOR$ , $HOSTACKCOMMENT$ , $SERVICEOUTPUT$ , $SERVICEPERFDATA$ , $SERVICEACKAUTHOR$ , and $SERVICEACKCOMMENT$
This option determines whether or not various directives in your object definitions will be processed as regular expressions. More information on how this works can be found here.
0 = Don't use regular expression matching (default) 1 = Use regular expression matching.
If you've enabled regular expression matching of various object directives using the use_regexp_matching option, this option will determine when object directives are treated as regular expressions. If this option is disabled (the default), directives will only be treated as regular expressions if they contain *, ?, +, or \. characters. If this option is enabled, all appropriate directives will be treated as regular expressions - be careful when enabling this! More information on how this works can be found here.
0 = Don't use true regular expression matching (default) 1 = Use true regular expression matching.
This is the email address for the administrator of the local machine (i.e. the one that Nagios is running on). This value can be used in notification commands by using the $ADMINEMAIL$ macro.
This is the pager number (or pager email gateway) for the administrator of the local machine (i.e. the one that Nagios is running on). The pager number/address can be used in notification commands by using the $ADMINPAGER$ macro.
This option controls what (if any) data gets sent to the event broker and, in turn, to any loaded event broker modules. This is an advanced option. When in doubt, either broker nothing (if not using event broker modules) or broker everything (if using event broker modules). Possible values are shown below.
0 = Broker nothing
-1 = Broker everything
# = See BROKER_* definitions in source code (include/broker.h) for other values that can be OR'ed together.
This directive is used to specify an event broker module that should be loaded by Nagios at startup. Use multiple directives if you want to load more than one module. Arguments that should be passed to the module at startup are separated from the module path by a space.
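The general form is one broker_module line per module, for example (module path and arguments are placeholders):
broker_module=/usr/local/lib/mymodule.o argument1 argument2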
Do NOT overwrite modules while they are being used by Nagios or Nagios will crash in a fiery display of SEGFAULT glory. This is a bug/limitation either in dlopen(), the kernel, and/or the filesystem. And maybe Nagios.
The correct/safe way of updating a module is by using one of these methods:
Shut down Nagios, replace the module file, and restart Nagios. Or, while Nagios is running: delete the original module file, move the new module file into place, and restart Nagios.
This option determines where Nagios should write debugging information. What (if any) information is written is determined by the debug_level and debug_verbosity options. You can have Nagios automatically rotate the debug file when it reaches a certain size by using the max_debug_file_size option.
This option determines what type of information Nagios should write to the debug_file. This value is a logical OR of the values below.
-1 = Log everything
0 = Log nothing (default)
1 = Function enter/exit information
2 = Config information
4 = Process information
8 = Scheduled event information
16 = Host/service check information
32 = Notification information
64 = Event broker information.
This option determines how much debugging information Nagios should write to the debug_file.
0 = Basic information 1 = More detailed information (default) 2 = Highly detailed information.
This option determines the maximum size (in bytes) of the debug file. If the file grows larger than this size, it will be renamed with a .old extension. If a file already exists with a .old extension it will automatically be deleted. This helps ensure your disk space usage doesn't get out of control when debugging Nagios.
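A typical debugging block in nagios.cfg might therefore look like this (path and values are only an illustration):
debug_file=/usr/local/nagios/var/nagios.debug
debug_level=-1
debug_verbosity=1
max_debug_file_size=1000000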
This boolean option determines whether services, service dependencies, or host dependencies assigned to empty host groups (host groups with no host members) will cause Nagios to exit with error on start up (or during a configuration check) or not. The default behavior if the option is not present in the main configuration file is for Nagios to exit with error if any of these objects are associated with host groups that have no hosts associated with them. Enabling this option can be useful when:
MK Livestatus.
1. How to access Nagios status data.
1.1. Accessing status data today.
The classical way of accessing the current status of your hosts and services is by reading and parsing the file status.dat, which is created by Nagios on a regular basis. The update interval is configured via status_update_interval in nagios.cfg. A typical value is 10 seconds. If your installation is getting larger, you might have to increase this value in order to minimize CPU usage and disk IO. The Nagios web interface uses status.dat for displaying its data.
Parsing status.dat is not very popular amongst developers of addons. So many use another approach: NDO. This is a NEB module that is loaded directly into the Nagios process and sends out all status updates via a UNIX socket to a helper process. The helper process creates SQL statements and updates various tables in a MySQL or PostgreSQL database. This approach has several advantages over status.dat:
The data is updated immediately, not only every 10 or 20 seconds. Applications have easy access to the data via SQL. No parser for status.dat is needed. In large installations the access for the addons to the data is faster than reading status.dat.
Unfortunately, however, NDO has also some severe shortcomings:
It has a complex setup. It needs a (rapidly growing) database to be administered. It eats up a significant portion of your CPU resources, just in order to keep the database up-to-date. Regular housekeeping of the database can hang your Nagios for minutes or even an hour once a day.
1.2. The Future.
Since version 1.1.0, Check_MK offers a completely new approach for accessing status and also historic data: Livestatus. Just as NDO, Livestatus makes use of the Nagios Event Broker API and loads a binary module into your Nagios process. But other than NDO, Livestatus does not actively write out data. Instead, it opens a socket by which data can be retrieved on demand.
The socket allows you to send a request for hosts, services or other pieces of data and get an immediate answer. The data is directly read from Nagios' internal data structures. Livestatus does not create its own copy of that data. Beginning from version 1.1.2 you are also able to retrieve historic data from the Nagios log files via Livestatus.
This is not only a stunningly simple approach, but also an extremely fast one. Some advantages are:
Unlike NDO, using Livestatus imposes no measurable burden on your CPU at all. Just when processing queries a very small amount of CPU is needed. But that will not even block Nagios. Livestatus produces zero disk IO when querying status data. Accessing the data is much faster than parsing status.dat or querying an SQL database. No configuration is needed, no database is needed, no administration is necessary. Livestatus scales well to large installations, even beyond 50,000 services. Livestatus gives you access to Nagios-specific data not available to any other status access method - for example the information whether a host is currently in its notification period.
At the same time, Livestatus provides its own query language that is simple to understand, offers most of the flexibility of SQL and even more in some cases. Its protocol is fast, light-weight and does not need a binary client. You can even get access from the shell without any helper software.
1.3. The Present.
Livestatus is still a young technology, but already many addons support Livestatus as data source or even propose it as their default. Here is an (incomplete) list of addons with Livestatus support: NagVis - nagvis NagiosBP - bp-addon. monitoringexchange Thruk - thruk CoffeeSaint - vanheusden/java/CoffeeSaint/ RealOpInsight - realopinsight and of course: Check_MK Multisite!
Please mail us if you think this list is incomplete.
2. Setting up and using Livestatus.
2.1. Automatic setup.
The typical way to set up Livestatus is just to answer yes when asked by the Check_MK setup. You need to have all tools installed that are needed to compile C++ programs. These are at least:
The GNU C++ compiler (packaged as g++ in Debian)
The utility make (packaged as make)
The development files for the libc (libc6-dev)
The development files for the C++ standard library (libstdc++6-dev)
The script setup.sh compiles a module called livestatus.o and copies it into /usr/lib/check_mk (if you didn't change that path). It also adds two lines to your nagios.cfg, which are needed for loading the module. After that you just need to restart Nagios and a Unix socket with the name live should appear in the same directory as your Nagios command pipe.
2.2. Manual setup.
There are several situations in which a manual setup is preferable, for example:
If you do not want to use Check_MK, but just Livestatus.
If the automatic setup does not work correctly (which is unlikely but not impossible).
If you want to make changes to the source code of Livestatus.
For manually setting up Livestatus, you can download the source code independent of Check_MK at the download page. Unpack the tarball at a convenient place and change to the newly created directory:
Now let's compile the module. Livestatus uses a standard configure script and is thus compiled with ./configure && make.
If you are running on a multicore CPU you can speed up compilation by adding -j 4 or -j 8 to make:
...and so on. After successful compilation, a make install will install a single file named livestatus.o into /usr/local/lib/mk-livestatus and the small program unixcat into /usr/local/bin (as usual, you can change paths with standard options to configure):
Your last task is to load livestatus.o into Nagios. Nagios is told to load that module and send all status update events to the module by the following two lines in nagios.cfg:
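The two lines presumably look like this, using the module path from the manual installation above and the socket path from this example:
event_broker_options=-1
broker_module=/usr/local/lib/mk-livestatus/livestatus.o /var/lib/nagios/rw/live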
The only mandatory argument is the complete path to the UNIX socket that Livestatus shall create (/var/lib/nagios/rw/live in our example). Please change that if needed. It is probably best to put it into the same directory as the Nagios pipe. Just as Nagios does with its pipe, Livestatus creates the socket with the permissions 0660. If the directory that the socket is located in has the SGID bit set for the group (chmod g+s), then the socket will be owned by the same group as the directory.
After setting up Livestatus - either by setup.sh or manually - restart Nagios. Two things should now happen: The socket file is created. The logfile of Nagios shows that the module has been loaded:
2.3. Options for nagios.cfg.
Livestatus understands several options, which can be added to the line beginning with broker_module :
Here is an example of how to add parameters:
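Such a line could, for instance, enable the module's debug output (assuming the option name debug, passed as a key=value argument after the socket path):
broker_module=/usr/local/lib/mk-livestatus/livestatus.o /var/lib/nagios/rw/live debug=1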
3. Using Livestatus.
Once your Livestatus module is set up and running, you can use its unix socket for retrieving live status data. Every relevant programming language on Linux has a way to open such a socket. We will show how to access the socket with the shell and with Python. Other programming languages are left as an exercise to the reader.
3.1. Accessing Livestatus with the shell.
A unix socket is very similar to a named pipe, but has two important differences: You can both read and write to and from it (while a pipe is unidirectional). You cannot access it with echo or cat .
Livestatus ships with a small utility called unixcat which can communicate over a unix socket. It sends all data it reads from stdin into the socket and writes all data coming from the socket to stdout.
The following command shows how to send a command to the socket and retrieve the answer - in this case a table of all of your hosts:
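A minimal version of that command could look like this (adjust the socket path to your installation):
echo 'GET hosts' | unixcat /var/lib/nagios/rw/live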
If you get that output, everything is working fine and you might want to continue reading with the chapter The Livestatus Query Language .
3.2. Accessing Livestatus with Python.
Access from within Python does not need an external tool. The following example shows how to send a query, retrieve the answer and parse it into a Python table. After installing check_mk you find this program in the directory /usr/share/doc/check_mk :
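If you do not have that shipped program at hand, a rough sketch along the same lines - socket path and column selection are only an illustration - could look like this (Python 3):

import json
import socket

socket_path = "/var/lib/nagios/rw/live"   # adjust to your installation

# Ask for two columns of the hosts table, JSON-encoded for easy parsing.
query = "GET hosts\nColumns: name state\nOutputFormat: json\n\n"

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.connect(socket_path)
s.sendall(query.encode("utf-8"))
s.shutdown(socket.SHUT_WR)        # tell Livestatus the request is complete

answer = b""
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    answer += chunk
s.close()

# With Columns: the answer is a JSON list of rows, one [name, state] pair per host.
table = json.loads(answer.decode("utf-8"))
for name, state in table:
    print(name, state)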
4. LQL - The Livestatus Query Language.
LQL - pronounced "Liquel" as in "liquid" - is a simple language for telling Livestatus what data you want and how it should be formatted. It does much the same as SQL but does it in another, simpler way. Its syntax reflects (but is not compatible to) HTTP.
Each query consists of:
A command consisting of the word GET and the name of a table.
An arbitrary number of header lines, each consisting of a keyword, a colon and arguments.
An empty line or the end of transmission (i.e. the client closes the sending direction of the socket).
All keywords including GET are case sensitive. Lines are terminated by single linefeeds (no <CR>). The current version of Livestatus implements the following tables:
hosts - your Nagios hosts
services - your Nagios services, joined with all data from hosts
hostgroups - your Nagios hostgroups
servicegroups - your Nagios servicegroups
contactgroups - your Nagios contact groups
servicesbygroup - all services grouped by service groups
servicesbyhostgroup - all services grouped by host groups
hostsbygroup - all hosts grouped by host groups
contacts - your Nagios contacts
commands - your defined Nagios commands
timeperiods - time period definitions (currently only name and alias)
downtimes - all scheduled host and service downtimes, joined with data from hosts and services
comments - all host and service comments
log - transparent access to the Nagios logfiles (including archived ones)
status - general performance and status information; this table contains exactly one dataset
columns - a complete list of all tables and columns available via Livestatus, including descriptions!
statehist - (since 1.2.1i2) SLA statistics for hosts and services, joined with data from hosts, services and log.
Like in an SQL database all tables consist of a number of columns. If you query the table without any parameters, you retrieve all available columns in alphabetical order. The first line of the answer contains the names of the columns. Please note that the available columns will change from version to version. Thus you should not depend on a certain order of the columns!
Example: Retrieve all contacts:
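Such a query consists of nothing but the GET command:
GET contacts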
4.1. Selecting which columns to retrieve.
When you write an application using Livestatus, you probably need the information just from selected columns. Add the header Columns to select which columns to retrieve. This also defines the order of the columns in the answer. The following example retrieves just the columns name and alias :
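In LQL this reads:
GET contacts
Columns: name alias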
If you want to test this with unixcat , a simple way is to put your query into a text file query and read that in using < :
As you might have noticed in this example: if you use Columns: then no column headers will be output. You do not need them - as you have specified them yourself. That makes parsing simpler.
4.2. Filters.
An important concept of Livestatus is its ability to filter data for you. This is not only more convenient than just retrieving all data and selecting the relevant lines yourself. It is also much faster. Remember that Livestatus has direct access to all of Nagios' internal data structures and can access them with the speed of native C.
Filters are added by using Filter: headers. Such a header has three arguments: a column name, an operator and a reference value - all separated by spaces. The reference value - being the last one in the line - may contain spaces. Example:
This query gets all services with the current state 2 (critical). If you add more Filter: headers, you will see only data passing all of your filters. The next example outputs all critical services which are currently within their notification period:
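Written out, the two queries described here would look like this (state and in_notification_period are standard columns of the services table):
GET services
Filter: state = 2

GET services
Filter: state = 2
Filter: in_notification_period = 1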
The following eight operators are available:
A few notes: The operators.
4.3. Regular expression matching.
4.4. Matching lists.
Some columns do not contain numbers or texts, but lists of objects. An example for that is the column contacts of hosts or services which contains all contacts assigned to the data object. The available operators on list-valued columns are:
Example: Return some information about services where "harri" is one of the assigned contacts:
Another example: Return the name of all hosts that do not have parents:
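For example (an equality filter with an empty reference value matches an empty list):
GET hosts
Columns: name
Filter: parents =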
There is a special case when filtering is done on the members or members_with_state columns of the servicegroups table: The value to match must have the form hostname | servicedescription .
4.5. Matching attribute lists.
Version 1.1.4 of Livestatus gives you access to the list of modified attributes of hosts, services, and contacts. This way you can query which attributes have been changed dynamically by the user and thus differ from the attributes configured in the Nagios object files.
These new columns come in two variants: modified_attributes and modified_attributes_list . The first variant outputs an integer representing a bitwise combination of Nagios' internal numbers. The second variant outputs a list of attribute names, such as notifications_enabled or active_checks_enabled . When you define a Filter , both column variants are handled in exactly the same way, and both allow using the number or the comma-separated list representation.
Example 1: Find all hosts with modified attributes:
Example 2: Find hosts where notifications have been actively disabled:
Example 3: Find hosts where active or passive checks have been tweaked:
4.6. Combining Filters with And , Or and Negate.
Per default a dataset must pass all filters to be displayed. Alternatively, you can combine a number of filters with a logical "or" operation by using the header Or: . This header takes an integer number X as argument and combines the last X filters into a new filter using an "or" operation. The following example selects all services which are in state 1 or in state 3:
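In LQL this reads:
GET services
Filter: state = 1
Filter: state = 3
Or: 2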
The next example shows all non-OK services which are within a scheduled downtime or which are on a host with a scheduled downtime:
It is also possible to combine filters with an And operation. This is only neccessary if you want to group filters together before "or"-ing them. If, for example, you want to get all services that are either critical and acknowledged or OK, this is how to do it:
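A sketch of that query:
GET services
Filter: state = 2
Filter: acknowledged = 1
And: 2
Filter: state = 0
Or: 2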
The And: 2 - header combines the first two filters to one new filter, which is then "or"ed with the third filter.
In version 1.1.11i2 the new header Negate: has been introduced. This logically negates the most recent filter. The following example displays all hosts that have neither an a nor an o in their name:
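For example:
GET hosts
Columns: name
Filter: name ~ a
Filter: name ~ o
Or: 2
Negate: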
5. Stats and Counts.
5.1. Why counting?
SQL has a statement "SELECT COUNT(*) FROM ..." which counts the number of rows matching certain criteria. LQL's Stats: header allows something similar. In addition it can retrieve several counts at once.
The Stats: header has the same syntax as Filter: but another meaning: instead of filtering the objects it counts them. As soon as at least one Stats: header is used, no data is displayed anymore. Instead, one single row of data is output with one column for each Stats: header, showing the number of rows matching its criteria.
The following example outputs the numbers of services which are OK, WARN, CRIT or UNKNOWN:
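The query might look like this:
GET services
Stats: state = 0
Stats: state = 1
Stats: state = 2
Stats: state = 3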
An example output looks like this:
You want to restrict the output to services to which the contact harri is assigned? No problem, just add a Filter: header:
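The combined query could read:
GET services
Filter: contacts >= harri
Stats: state = 0
Stats: state = 1
Stats: state = 2
Stats: state = 3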
5.2. Combining with and/or.
Just as the Filter: headers, the Stats: headers can be combined with logical "and" and "or" operations. It is important to know that they form their own stack. You combine them with StatsAnd and StatsOr. Here is a somewhat more complex query that scans all services of the service group windows which are within their notification period and are not within a host or service downtime. It computes seven counts:
The number of services with the hard state OK
The number of unacknowledged services in hard state WARNING
The number of acknowledged services in hard state WARNING
The number of unacknowledged services in hard state CRITICAL
The number of acknowledged services in hard state CRITICAL
The number of unacknowledged services in hard state UNKNOWN
The number of acknowledged services in hard state UNKNOWN.
In version 1.1.11i2 the new header StatsNegate: has been introduced. It takes no arguments and logically negates the most recent stats-Filter.
5.3. Grouping.
Letting Livestatus count items is nice and fast. But in our examples so far the answer was restricted to one line of numbers for a predefined set of filters. In some situations you want to get statistics for each object from a certain set. You might want to display a list of hosts, and for each of these hosts the number of services which are OK, WARN, CRIT or UNKNOWN.
In such situations you can add the Columns: header to your query. There is a simple and yet mighty notion behind it: You specify a list of columns of your table. The stats are computed and displayed separately for each different combination of values of these columns.
The following query counts the number of services in the various states for each host in the host group windows :
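A sketch of that query (host_groups is the list of host groups a service's host belongs to):
GET services
Filter: host_groups >= windows
Stats: state = 0
Stats: state = 1
Stats: state = 2
Stats: state = 3
Columns: host_name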
The output looks like this:
As you can see, an additional column was prepended to the output holding the value of the group column. Here is another example that counts the total number of services grouped by the check command (the dummy filter expression is always true, so each service is counted).
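One way to write it (the comparison state != 9999 serves as the always-true dummy filter):
GET services
Columns: check_command
Stats: state != 9999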
Here is an example output of that query:
A third example shows another way for counting the total number of services grouped by their states without an explicit Stats: header for each state:
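For example:
GET services
Columns: state
Stats: state != 9999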
In that example none of the services was in the state UNKNOWN. Hence no count for that state was displayed.
One last note about grouping: the current implementation allows only columns of the types string or int to be used for grouping. Also you are limited to one group column.
Note: prior to version 1.1.10 there was the header StatsGroupBy: instead of Columns: . That header is deprecated, though still working.
6. Sum, Minimum, Maximum, Average, Standard Deviation.
Starting from version 1.1.2 Livestatus supports some basic statistical operations. They allow you, for example, to query the average check execution time or the standard deviation of the check latency of all checks.
These operations are using one of the keywords sum , min , max , avg , std , suminv or avginv . The following query displays the minimum, maximum and average check execution time of all service checks in state OK:
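Such a query might read (execution_time is the relevant column of the services table):
GET services
Filter: state = 0
Stats: min execution_time
Stats: max execution_time
Stats: avg execution_time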
As with the "normal" stats-headers, the output can be grouped by one column, for example by the host_name :
In version 1.1.13i1 we introduced the aggregation functions suminv and avginv . They compute the sum or the average of the inverse of the values. For example the inverse of the check_interval of a service is the number of times it is checked per minute. The suminv over all services is the total number of checks that should be executed per minute, if no checks are being delayed.
6.1. Performance data.
As of version 1.1.11i2 , MK Livestatus now supports aggregation of Nagios performance data. Performance data is additional information output by checks, formatted as a string like user=6.934;;;; system=6.244;;;; wait=0.890;;;; . If you create a Stats-query using sum , min , max , avg or std on several services with compatible performance data, Livestatus will now aggregate these values into a new performance data string. Look at the following examples. First, a query of two services without aggregation:
Let's assume it produces the following output:
Here is the same query, but aggregating the data using the average:
This is the result:
7. Output formatting and character encoding.
Livestatus supports the output formats CSV, JSON and Python, with CSV being the default.
7.1. CSV output.
CSV output comes in two flavors: csv (lowercase) and CSV (uppercase). For backwards compatibility reasons, the lowercase variant is the default, but it has quite a few quirks. The recommendation is to use the uppercase variant, and when you really need more structure in your data, you are much better off with JSON or Python.
csv output (broken)
Datasets are separated by Linefeeds (ASCII 10), fields are separated by semicolons (ASCII 59), list elements (such as in contacts) are separated by commas (ASCII 44) and combinations of host name and service description are separated by a pipe symbol (ASCII 124).
In order to avoid problems with the default field separator semicolon appearing in values (such as performance data), it is possible to replace the separator characters with other symbols. This is done by specifying four integer numbers after the Separators: header. Each of those is the ASCII code of a separator in decimal. The four numbers mean:
The dataset separator (default is 10: linefeed)
The column separator (default is 59: semicolon)
The separator for lists such as contacts in the hosts table (default is 44: comma)
The separator for hosts and services in service lists (default is 124: vertical bar)
It is even possible to use non-printable characters as separators. The following example uses bytes with the values 0, 1, 2 and 3 as separators:
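For example:
GET hosts
Separators: 0 1 2 3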
CSV output.
This is the "real" CSV format (see RFC 4180) which is similar to the lowercase variant above, but with correct quoting and CR/LF as the dataset separator. Because of the quoting, there is no need for the Separators: header, so it is ignored for this format.
7.2. JSON output.
You can get your output in JSON format if you add the header OutputFormat: json , as in the following example:
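For example:
GET hosts
Columns: name state
OutputFormat: json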
Like CSV, JSON is a text based format, and it is valid JavaScript code. In order to avoid redundancy and keep the overhead as low as possible, the output is not formatted as a list of objects (with key/value pairs), but as a list of lists (arrays in JSON speak). This is the recommended format in general, as it makes it extremely easy to handle structured data, and JSON parsers are available for basically every programming language out there.
7.3. Python output.
The Python format is very similar to the JSON format, but not 100% compatible. There are tiny differences in string prefixes and how characters are escaped, and this is even different in Python 2 and Python 3. Therefore, two Pythonic formats are offered: python for Python 2 and python3 for, well, Python 3. You can directly eval() the Python output, but be aware of the potential security issues then. When in doubt, use JSON and json.loads from the standard json module.
7.4. Character encoding.
Livestatus output in most cases originates from configuration files of Nagios (the object configuration). Nagios does not impose any restrictions on how these files have to be encoded (UTF-8, Latin-1, etc). If you select CSV output, then Livestatus simply returns the data as it is contained in the configuration files - with the same encoding.
When using JSON or Python - however - non-ASCII characters need to be escaped and properly encoded. Up to version 1.1.11i1, Livestatus automatically detects 2-byte UTF-8 sequences and assumes all other non-ASCII characters to be Latin-1 encoded. While this works well for western languages and to a certain degree "auto-detects" the encoding, it does not support languages using other characters than those used in Latin-1. Even the € symbol does not work.
As of version 1.1.11i2 , Livestatus' behaviour is configurable with the option data_encoding and defaults now to UTF-8 encoding. Three different settings are valid:
7.5. Column headers.
Per default, if there is no Columns: header in your query, MK Livestatus displays the names of all columns as the first line of the output. With the header ColumnHeaders: you can explicitly switch column headers on or off. The output of the following query will include column headers:
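For example:
GET hosts
Columns: name state
ColumnHeaders: on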
7.6. Limiting the number of datasets.
The Limit: header allows you to limit the number of datasets being displayed. Since MK Livestatus currently does not support sorting, you'll have to live with the Nagios-internal natural sorting of objects. Hosts, for example, are sorted according to their host names - just as in the standard CGIs. The following example will output just the first 10 hosts:
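For example:
GET hosts
Columns: name
Limit: 10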
Please note that the Limit: header is also applied when doing Stats. I'm not sure if there is any use for that, but that's the way MK Livestatus behaves. The following example will count how many of the 10 first hosts are up:
If using filters, the Limit: header limits the number of datasets actually being output. The following query outputs the first 10 hosts which are down:
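For example:
GET hosts
Columns: name
Filter: state = 1
Limit: 10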
8. Authorization.
Since version 1.1.3, Livestatus supports addon developers by helping to implement authorization. You can let Livestatus decide whether a certain contact may see data or not. This is very simple to use. All you need to do is to add an AuthUser header to your query with the name of a Nagios contact as the single argument. If you do that, Livestatus will only display data for which that user is a contact - either directly or via a contact group. Example:
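For example (columns chosen only for illustration):
GET services
Columns: host_name description state
AuthUser: harri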
In certain cases it would be possible to replace AuthUser with a Filter header. But that does not work (precisely) in all situations.
8.1. Configuration.
If your addon uses AuthUser, the administrator has a way to configure authentication details via nagios.cfg - and thus can do this uniformly across all addons using Livestatus. Currently two configuration options are available. Both can be set either to strict or loose:
Please note that Nagios makes all services that do not have any contact at all inherit all contacts of the host - regardless whether this option is set to strict or loose .
8.2. Tables supporting AuthUser.
The following tables support the AuthUser header (others simply ignore it): hosts, services, hostgroups, servicegroups and log. The log table applies AuthUser only to entries of the log classes 1 (host and service alerts), 3 (notifications) and 4 (passive checks). All other classes are not affected.
8.3. Limitations.
Currently the AuthUser header only controls which rows of data are output and has no impact on list columns, such as the groups column in the table services. This means that this column also lists service groups the contact might not be a contact for. This might be changed in a future version of Livestatus.
9. Waiting.
Starting with version 1.1.3 Livestatus has a new and still experimental feature: Waiting. Waiting allows developers of addons to delay the execution of a query until a certain condition becomes true or a Nagios event happens. This allows the implementation of a new class of features in addons, for example:
An immediate update of a status display as soon as the status of any or one specific Nagios object changes. A logfile ticker showing new log messages immediately. An action button for rescheduling the next check of a service which displays the service not sooner than after it has been checked.
All that can be implemented without polling - and in a very simple way. All you have to do is to make up some new query headers:
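The relevant headers are - as far as I know, please verify against your version - WaitObject: (the object to watch, i.e. a host name or a host plus service description), WaitCondition: (a condition in Filter: syntax that must become true), WaitTrigger: (the kind of event to wake up on) and WaitTimeout: (a maximum waiting time in milliseconds).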
Specifying multiple condition headers is allowed: All conditions are combined with a boolean and (just as with the Filter header).
The following triggers are available for the WaitTrigger header:
check - a service or host check has been executed
state - the state of a host or service has changed
log - a new message has been logged into nagios.log
downtime - a downtime has been set or removed
comment - a comment has been set or removed
command - an external command has been executed
program - a change in a global program setting, like enable_notifications
all - any of the above events (this is the default)
9.1. Examples.
Retrieve log messages since a certain timestamp, but wait until at least one new log message appears:
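A sketch (the timestamp is a placeholder):

GET log
Filter: time >= 1265062900
WaitTrigger: log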
The same, but do not wait longer than 2 seconds:
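The same sketch with a timeout, assuming WaitTimeout takes milliseconds:

GET log
Filter: time >= 1265062900
WaitTrigger: log
WaitTimeout: 2000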
Retrieve the complete data about the host xabc123, but wait until its state is critical:
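Presumably something along these lines (for hosts the non-OK hard states are 1 = DOWN and 2 = UNREACHABLE, so the exact wait condition of the original example may differ):

GET hosts
Filter: name = xabc123
WaitObject: xabc123
WaitCondition: state = 1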
Get data about the service Memory used on host xabc123 as soon as it has been checked some time after 1265062900:
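A sketch of how such a query might look (the separator between host and service in WaitObject is assumed to be a blank; this may differ between versions):

GET services
Filter: host_name = xabc123
Filter: description = Memory used
WaitObject: xabc123 Memory used
WaitCondition: last_check > 1265062900
WaitTrigger: check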
10. Compensating timezone differences.
When doing multi-national distributed monitoring with Livestatus you might have to deal with situations where your monitoring servers are running in different time zones. In an ordinary setup all servers will have the same system time but different configured time zones. You can check this by calling on each monitoring server:
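date +%s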
This command should output the same value on all servers. If not, you have probably set your system to a wrong time zone. MK Livestatus can help to compensate the time difference in such situations. If you add the appropriate header to your query with your current local time (the output of date +%s) as an argument, Livestatus will compare its local time against that of the caller and convert all timestamps accordingly.
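If I recall the header name correctly (please verify against your version), it is Localtime. A sketch:

GET hosts
Columns: name last_check
Localtime: 1256750788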
Please note that Livestatus assumes that a difference in time is not due to clock inaccuracy but due to timezone differences. The delta time computed for compensating will be rounded to the nearest half hour .
11. Response Header.
If your request is not valid or some other error occurs, a message is printed to the logfile of Nagios. If you want to write an API that displays error messages to the user, you need information about errors as part of the response.
You can get such behaviour by using the header ResponseHeader. It can be set to off (default) or to fixed16:
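ResponseHeader: fixed16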
Other types of response headers might be implemented in future versions. The fixed16 header has the advantage that it is exactly 16 bytes long. This makes it easy to program an API: you can simply read 16 bytes and do not need to scan for a newline or anything like that. Here is a complete example session with response headers being activated:
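A sketch of such a session (host names, socket path and the exact byte count are illustrative only; the default CSV output format is assumed):

echo -e 'GET hosts\nColumns: name\nResponseHeader: fixed16\n' | unixcat /var/lib/nagios/rw/live
200          11
alpha
beta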
The fixed16 response header has the following format:
Bytes 1-3: the status code
Byte 4: a single space
Bytes 5-15: the length of the response as an ASCII coded integer number, padded with spaces
Byte 16: a linefeed character (ASCII 10)
These are the possible values of the status code:
The response contains the queried data only if the status code is 200. In all other cases the response contains the error message. In that case the length field gives the length of the error message including the trailing linefeed. The error message is not JSON-encoded, even if you requested that in the OutputFormat header.
12. Keep alive (persistent connections)
MK Livestatus allows you to keep a connection open and reuse it for several requests. In order to do that you need to add the following header:
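KeepAlive: on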
Livestatus will then keep the connection open after sending its response and wait for a new query. You will probably also want to activate a response header in that case, since only that allows you to exactly determine the length of the response (without KeepAlive you can simply read until end of file).
Please note that keeping a connection open permanently occupies resources within the Nagios process. In the current version Livestatus is limited to ten parallel persistent connections. This is different from the way persistent database connections are handled.
The proposed way to use persistent connections in web applications is to keep the connection open only during the current request and close it after the complete result page has been rendered. The reason is that bringing up a database connection is a much more costly operation than connecting to MK Livestatus.
13. Access to Logfiles.
Since version 1.1.1 Livestatus provides transparent access to your Nagios logfiles, i.e. nagios.log and the rotated files in archives (you might have defined an alternative directory in nagios.cfg). Livestatus keeps an index over all log files and remembers which period of time is kept in which log file. Please note that Livestatus does not depend on the name of the log files (while Nagios does). This way Livestatus has no problem if the log file rotation interval is changed.
The Livestatus table log is your access to the logfiles. Every log message is represented by one row in that table.
13.1. Performance issues.
If your monitoring system has been running for a couple of years, the number of log files and entries can get very large. Each Livestatus query to the table log has the potential of scanning all historic files (although an in-memory cache tries to avoid reading files again and again). It is thus crucial that you use Filter: headers in order to restrict:
The time interval
The log classes in question.
If you set no filter on the column time , then all logfiles will be loaded - regardless of other filters you might have set.
Setting a filter on the column class restricts the types of messages loaded from disk. The following classes are available:
0 - All messages not in any other class
1 - host and service alerts
2 - important program events (program start, etc.)
3 - notifications
4 - passive checks
5 - external commands
6 - initial or current state entries
7 - program state change
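Putting both restrictions together, a query to the log table might look like this (the timestamp and the selected columns are placeholders):

GET log
Filter: time >= 1265062900
Filter: class = 1
Columns: time host_name service_description state plugin_output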
14. RRD Files of PNP4Nagios.
New in 1.1.9i3: In order to improve the integration between Multisite and PNP4Nagios, Livestatus introduces the new column pnpgraph_present in the tables hosts and services (and all other tables containing host_ or service_ columns). That column can have three possible values:
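To the best of my recollection (please verify against your version), the values are: -1 if Livestatus cannot tell (e.g. because pnp_path is not configured), 0 if no PNP graph exists for the object, and 1 if a graph is present.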
Livestatus cannot detect the base directory of your RRD files automatically, so you need to configure it with the module option pnp_path:
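A sketch of the broker_module line in nagios.cfg (the module, socket and perfdata paths are placeholders - adapt them to your installation):

broker_module=/usr/local/lib/mk-livestatus/livestatus.o /var/lib/nagios/rw/live pnp_path=/usr/local/pnp4nagios/var/perfdata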
In order to determine whether a PNP graph is available, Livestatus checks for the existence of the corresponding PNP XML file.
A note for OMD users: OMD automatically configures this option correctly in etc/mk-livestatus/nagios.cfg. You need at least a daily snapshot of 2018-12-17 or later for using the new feature.
15. Expansion of macros.
Nagios allows you to embed macros within your configuration. For example, it is common to embed $HOSTNAME$ and $SERVICEDESC$ into your action_url or notes_url when configuring links to a graphing tool.
As of version 1.1.1 Livestatus supports expansion of macros in several columns of the tables hosts and services. Those columns - for example notes_url_expanded - bear the same name as the unexpanded columns but with _expanded suffixed.
Macro expansion is very complex in Nagios, and unfortunately the Nagios code for it is not thread safe. Livestatus therefore has its own implementation of macro expansion, which does not support all features of Nagios, but (nearly) all that are needed by visualization addons. Livestatus supports the following macros:
for hosts and services: HOSTNAME, HOSTDISPLAYNAME, HOSTALIAS, HOSTADDRESS, HOSTOUTPUT, LONGHOSTOUTPUT, HOSTPERFDATA, HOSTCHECKCOMMAND
for services: SERVICEDESC, SERVICEDISPLAYNAME, SERVICEOUTPUT, LONGSERVICEOUTPUT, SERVICEPERFDATA, SERVICECHECKCOMMAND
all custom macros on hosts and services (beginning with _HOST or _SERVICE)
all $USER...$ macros.
16. Remote access to Livestatus via SSH or xinetd.
16.1. Livestatus via SSH.
Livestatus currently does not provide a TCP socket. Another (and more secure) way of remotely accessing the unix socket is using SSH. The following example sends a query via SSH. The only privilege the remote user needs is write access to the unix socket:
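A sketch (user, host name and socket path are placeholders; unixcat is the small helper tool shipped with Livestatus):

echo -e 'GET hosts\nColumns: name state\n' | ssh nagios@monitoringhost unixcat /var/lib/nagios/rw/live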
16.2. Livestatus via xinetd.
Using xinetd and unixcat you can bind the socket of Livestatus to a TCP socket. Here is an example configuration for xinetd:
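A sketch of such a configuration (port, paths and access restrictions are placeholders - adapt them to your setup):

service livestatus
{
        type            = UNLISTED
        port            = 6557
        socket_type     = stream
        protocol        = tcp
        wait            = no
        user            = nagios
        server          = /usr/bin/unixcat
        server_args     = /var/lib/nagios/rw/live
        only_from       = 127.0.0.1
        disable         = no
}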
You can access your socket for example with netcat:
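For example (assuming the port 6557 from the xinetd sketch above):

echo -e 'GET hosts\nColumns: name state\n' | netcat localhost 6557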
17. Timeouts.
In version 1.1.7i3 the handling of timeouts has changed. There are now two configurable timeouts which protect Livestatus from broken clients hanging on the line forever (remember that the maximum number of parallel connections is configurable but limited):
idle_timeout - limits the time Livestatus waits for a (the next) query
query_timeout - limits the time a query needs to be read.
A Livestatus connection has two states. In the first state Livestatus is waiting for a query. This is the case just after the client has connected, but also in KeepAlive mode after the response has been sent. The client now has at most idle_timeout ms to start the next query. The default is 300000 (300 seconds, i.e. 5 minutes). If a client is idle for more than that, Livestatus simply closes the connection.
As soon as the first byte of a query has been read, Livestatus enters the state "reading query" and uses a much shorter timeout: the query_timeout. Its default value is 10000 (10 seconds). If the client does not complete the query within this time, the client is regarded as dead and the connection is closed.
Both timeout values can be configured via Nagios module options in nagios.cfg. A timeout can be disabled by setting its value to 0. But be warned: broken clients can then hang connections forever and thus block Livestatus threads.
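A sketch of such a broker_module line (the module and socket paths are placeholders):

broker_module=/usr/local/lib/mk-livestatus/livestatus.o /var/lib/nagios/rw/live idle_timeout=300000 query_timeout=10000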
18. Sending commands via Livestatus.
MK Livestatus supports sending Nagios commands. This is very similar to the Nagios command pipe, but very useful for accessing a Nagios instance via a remote connection.
You send commands via the basic request COMMAND followed by a space and the command line in exactly the same syntax as needed for the Nagios pipe. No further header fields are required - nor are any allowed.
Livestatus keeps the connection open after a command and waits for further commands or GET requests. It behaves like GET with KeepAlive: set to on. That way you can send a bunch of commands in one connection - just as with the pipe. Here is an example of sending a command from the shell via unixcat:
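A sketch (timestamp, host name and socket path are placeholders; the command syntax is the same as for the Nagios command pipe):

echo 'COMMAND [1265062900] DISABLE_HOST_NOTIFICATIONS;xabc123' | unixcat /var/lib/nagios/rw/live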
Just as with GET, a query is terminated either by closing the connection or by sending a newline. COMMAND automatically implies keep alive and behaves like GET when KeepAlive is set to on. That way you can mix GET and COMMAND queries in one connection.
19. Stability and Performance.
19.1. Stability.
While early versions of MK Livestatus experienced some stability issues - not unusual for evolving software - nowadays it can be considered rock solid. There are no known problems with performance, crashes or a hanging Nagios, as long as two important requirements are fulfilled. Environment macros have to be disabled in nagios.cfg. This is done with:
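enable_environment_macros=0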
19.2. Performance.
Livestatus is well-behaved with respect to your CPU and disk resources. In fact, it doesn't do any disk IO at all - as long as the table log is not accessed, which needs read access to the Nagios log files. CPU is only consumed during actual queries, and even for large queries we speak of microseconds rather than milliseconds of CPU usage. Furthermore, Livestatus does not block Nagios during the execution of a query but runs completely in parallel - and scales to all available CPU cores if necessary.
20. 1.1.9i3 Timeperiod transitions.
Version 1.1.9i3 introduces a new little feature that does not really have anything to do with status queries, but is very helpful for creating availability reports and was easy to implement in Livestatus (due to its timeperiod cache).
Each time a timeperiod changes from active to not active or vice versa, an entry in the Nagios logfile is created. At the start of Nagios the initial states of all timeperiods are also logged. This looks like this:
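Presumably something like the following lines at startup (timestamps and timeperiod names are placeholders; the last field is the new state, 1 for active and 0 for inactive, and the initial logging uses -1 as the previous state):

[1256141517] TIMEPERIOD TRANSITION: 24X7;-1;1
[1256141517] TIMEPERIOD TRANSITION: workhours;-1;0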
When a transition occurs, one line is logged (here the state changed from 1 (in) to 0 (out)):
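For example (again with placeholder timestamp and timeperiod name):

[1256151300] TIMEPERIOD TRANSITION: workhours;1;0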
With that information, it is later possible to determine, which timeperiods were active when an alert happened. That way you can make availability reports reflect only certain time periods.
21. Host and Service Availability.
21.1. Introduction.
Version 1.2.1i2 introduces the new table statehist which supports availability queries - providing statistical information for hosts and services. Besides the state information, this table returns duration information regarding the length of each state. In addition, the duration percentage with respect to the query timeframe can be returned.
Each state change creates an output line with the respective duration. Additional columns show the part (percentage) of this duration in comparison to the queried timeframe. To get the overall percentage of a specific state you can use the Stats: header to accumulate the percentage fields of multiple lines.
21.2. Absence of hosts and services.
To identify the absence of hosts and services within the queried timeperiod correctly, it is necessary to set the following parameter:
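In nagios.cfg:

log_initial_states=1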
Setting this parameter to 1 causes the initial state of each host and service to be logged at program startup. By evaluating each startup it is possible to detect whether a host or service is no longer monitored by the system. It is even possible to detect whether this host or service was temporarily removed from monitoring for a specific time. The absence of a host or service is reflected in the output line within the state column as -1 (UNMONITORED).
The setting log_initial_states=1 is the default parameter as of version 1.2.1i2.
Disabling this parameter leads to fewer logfile entries at program startup, but limits the correct detection of the UNMONITORED state.
21.3. Table statehist.
Querying the table statehist results in output which shows a host's/service's states as mentioned above and, in addition, how long the host/service remained in each state.
A query always requires a filter for the start time. Otherwise Livestatus would parse all available logfiles from the beginning, which might add up to several hundred megabytes.
The following query outputs a list of the state changes of one service, joined with the duration information:
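A sketch of such a query (timestamp, host and service names are placeholders):

GET statehist
Filter: time >= 1348657500
Filter: host_name = xabc123
Filter: service_description = Memory used
Columns: time state duration duration_part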
By using the Stats: header these lines can be accumulated, which allows outputting the distinct states with their total duration and duration_part with respect to the queried timeframe.
Using the columns duration_part_ok, duration_part_warning and duration_part_critical makes it possible to output the entire state information within a single line.
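Roughly like this (filters are placeholders as above; the three Stats: lines sum up the respective parts):

GET statehist
Filter: time >= 1348657500
Filter: host_name = xabc123
Filter: service_description = Memory used
Stats: sum duration_part_ok
Stats: sum duration_part_warning
Stats: sum duration_part_critical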
Converting the part values into percentages, the SLA information for this service is 70.1% OK, 3.8% WARNING and 26.1% CRITICAL.