
Closed
Open
Opened Jul 06, 2018 by julian.gethmann @gethmann (Owner)

Icinga

Host: las126.las.kit.edu, las100, las101, +Opt-In

OS: Fedora, CentOS

Software name:

Icinga2 or other monitoring software

Software installation instruction if not in repos:

As far as I know, checks for the following are very simple to install:

  • Temperatures
  • HDD life
  • Load
  • Network connectivity
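To illustrate how small such a check can be, here is a minimal sketch of a load check written as a Nagios-compatible plugin, which is the plugin convention Icinga 2 also uses (exit code 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The function names and the thresholds are assumptions made for this example; in practice the packaged monitoring plugins would probably be used instead of a self-written script.

```python
import os

# Exit codes of Nagios-compatible plugins; Icinga 2 follows the same convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_load(load1, warn, crit):
    """Map a 1-minute load average to a plugin exit code and a status line."""
    if load1 >= crit:
        return CRITICAL, f"LOAD CRITICAL - load1={load1:.2f}"
    if load1 >= warn:
        return WARNING, f"LOAD WARNING - load1={load1:.2f}"
    return OK, f"LOAD OK - load1={load1:.2f}"


def check_load(warn=4.0, crit=8.0):
    """Read the current load and evaluate it (thresholds are illustrative only)."""
    try:
        load1, _, _ = os.getloadavg()
    except OSError:
        # getloadavg() raises OSError if the load average cannot be obtained.
        return UNKNOWN, "LOAD UNKNOWN - load average not available"
    return evaluate_load(load1, warn, crit)
```

A wrapper script would print the status line and exit with the returned code, which is how the monitoring software learns the state of the check.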

Status of our services

  • DHCPd
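A service status check such as the one for DHCPd could be sketched roughly as follows. It relies on `systemctl is-active --quiet`, which exits with 0 only when the unit is active. The unit name `dhcpd.service` and the helper name `unit_is_active` are assumptions for this example, and the `run` parameter exists only so that the function can be tried without a real systemd.

```python
import subprocess


def unit_is_active(unit, run=subprocess.run):
    """Return True if systemd reports the given unit as active.

    `systemctl is-active --quiet UNIT` prints nothing and exits with 0
    only when the unit is active; any other exit code means "not active".
    """
    result = run(["systemctl", "is-active", "--quiet", unit], check=False)
    return result.returncode == 0


def service_status_line(unit, run=subprocess.run):
    """Build a plugin-style (exit code, message) pair for one unit."""
    if unit_is_active(unit, run):
        return 0, f"OK - {unit} is active"
    return 2, f"CRITICAL - {unit} is not active"
```

On one of the servers, `service_status_line("dhcpd.service")` would then be the whole check, assuming that is the unit name in use there.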

More difficult/not implemented yet, but basic features might be detectable with other modules:

  • IPA functionality

There are probably already suitable roles on Ansible Galaxy.

Possibly also interesting for:

Clients as opt-in, because monitoring them raises privacy issues (admins can see how long the computer was turned on and how long a user was logged in, to name just two).

User stories (kind of):

Clients:

  • A user starts a job on their computer and cannot log in the next morning. Is the computer gone for good? Is it just still too busy to take care of things like the log-in manager? Are the hard drives gone because the room heated up? -> Get hints about the cause of the problem.
  • A user cannot log in. Maybe the network is down and therefore she cannot log in, maybe IPA is down, maybe she just typed a wrong password.
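The second client story boils down to telling three causes apart: network down, IPA down, or wrong password. A rough sketch of that triage, using plain TCP connection attempts, could look like this. The function names are made up for this example, and which hosts and ports to probe is an assumption that would have to be adapted (for IPA, the Kerberos port 88 would be one candidate).

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def diagnose_login_failure(network_ok, ipa_ok):
    """Turn two reachability results into a hint about the likely cause."""
    if not network_ok:
        return "network down"
    if not ipa_ok:
        return "IPA unreachable"
    return "network and IPA reachable; check the password"
```

With hypothetical host names, the two inputs could come from `tcp_reachable("gateway.example.org", 22)` and `tcp_reachable("ipa.example.org", 88)`.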

Server:

  • IPA went down and nobody noticed, because sssd cached the credentials and no log-in errors occurred until half a year later. Then one can find out since when IPA was not working and whether an update might have triggered it. Or one can prevent it in the first place by regularly checking the monitoring software.
  • DHCPd went down and nobody noticed, because the workstations work with fixed IPs.
  • The Docker GitLab runners do not work, and jobs have to fail before anyone notices. Maybe a system update caused this rather than a reboot without autostart.
  • ShareLaTeX is down and one gets a mail/call from CN, because they want to collaborate on a paper that needs to be submitted the next day.
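The "since when was it not working" question from the IPA story can be answered from a plain history of check results. The following is only a sketch under the assumption that the results are available as chronologically ordered (timestamp, ok) pairs; the function name `down_since` and that data layout are invented for this example and are not how Icinga stores its history.

```python
from datetime import datetime


def down_since(history):
    """Return the timestamp at which the current outage began, or None.

    `history` is a chronologically ordered list of (timestamp, ok) pairs.
    Only the trailing run of failed checks counts as the current outage.
    """
    start = None
    for timestamp, ok in reversed(history):
        if ok:
            break  # the last successful check ends the search
        start = timestamp
    return start


# Illustrative data only, not real check results.
example_history = [
    (datetime(2018, 1, 1), True),
    (datetime(2018, 1, 2), False),
    (datetime(2018, 1, 3), True),
    (datetime(2018, 1, 4), False),
    (datetime(2018, 1, 5), False),
]
```

Comparing the returned timestamp with the dates of system updates would then show whether an update is a plausible trigger.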

/cc @project-manager

Reference: las-it-organisation/32-0-IT-InstructionsAndRules/ansible#35