Section 23 – Alerting and Monitoring
Alerting:
True Positives – maximize
False Negatives – minimize
Monitoring can be automated or manual
often include alerting mechanisms
will cover:
Monitoring Resources
Alerting and Monitoring Activities
SNMP Simple Network Management Protocol – used to manage devices
SIEM Security Information and Event Management systems
agent-based monitoring
agentless monitoring
Data created by Security Tools
SCAP Security Content Automation Protocol
Network Traffic Flows, or Flow
Single Pane of Glass concept (SPOG) – consolidated unified dashboard display of data
Monitoring Resources:
observing performance of a system – CPU, processor overhead, RAM etc
we need a BASELINE reference point for “normal” first: for expected performance under normal standard operating conditions
to compare over time with actual
any deviations may indicate problems
we can then investigate why it isn't within the baseline
eg Windows Performance Monitor tool
Application Monitoring or App Perf Monitoring
eg AppDynamics or New Relic
these can monitor apps in real time, if app is slow, could be that additional server resources are needed or that the app might have a code problem
Infra Monitoring – eg SolarWinds – info about network traffic, status of devices etc
can indicate if extra network or server resources are needed
can ID issues, ensure perf efficiency
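The baseline comparison described above can be sketched in a few lines of Python. This is only an illustration, not how any particular monitoring tool works: the function names, the CPU figures, and the 3-standard-deviation cutoff are all assumptions made for the example.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarise readings taken under normal operating conditions."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def find_deviations(baseline, readings, max_sigma=3.0):
    """Return (index, value) pairs that lie more than max_sigma
    standard deviations away from the baseline mean."""
    limit = max_sigma * baseline["stdev"]
    return [
        (i, value)
        for i, value in enumerate(readings)
        if abs(value - baseline["mean"]) > limit
    ]


# Hypothetical CPU utilisation percentages recorded during normal operation.
normal_cpu = [20, 22, 19, 21, 23, 20, 22, 21]
baseline = build_baseline(normal_cpu)

# Hypothetical later readings, compared against the baseline.
current_cpu = [21, 20, 85, 22]
print(find_deviations(baseline, current_cpu))  # the 85% reading is flagged
```

Any flagged reading is only a prompt to investigate; it does not say why the system left the baseline.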
Log Aggregation:
collecting all log data into one location
for perf analysis, security issues, investigation of breaches, and to use as evidence of criminal activity
also for compliance with govt regulations
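A minimal sketch of log aggregation, assuming each log line starts with an ISO 8601 timestamp followed by a space and the message. The host names, log lines, and function names are hypothetical; real aggregation tools also handle differing formats, time zones, and transport.

```python
from datetime import datetime


def parse_line(source, line):
    """Split 'ISO-timestamp message' into a record tagged with its source."""
    stamp, message = line.split(" ", 1)
    return {
        "time": datetime.fromisoformat(stamp),
        "source": source,
        "message": message,
    }


def aggregate(logs_by_source):
    """Merge the lines from every source into one list ordered by time."""
    records = [
        parse_line(source, line)
        for source, lines in logs_by_source.items()
        for line in lines
    ]
    return sorted(records, key=lambda record: record["time"])


# Hypothetical logs from a web server and a firewall.
logs = {
    "web01": [
        "2024-01-01T10:00:05 GET /login 200",
        "2024-01-01T10:00:09 GET /admin 403",
    ],
    "fw01": ["2024-01-01T10:00:07 DENY tcp 10.0.0.5:4444"],
}

for record in aggregate(logs):
    print(record["time"].isoformat(), record["source"], record["message"])
```

With everything in one time-ordered list, events from different devices can be read as a single timeline during an investigation.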
Alerting:
setting up notifications to inform stakeholders – sys admins etc of events
eg low RAM, CPU overload, or failed logins
some regs require govt orgs to be immediately informed in cases of data breach.
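The notification idea above can be sketched as a set of threshold rules checked against current metrics. The metric names, limits, and the `notify` stand-in are assumptions for illustration; a real setup would send email, SMS, or pager messages instead of printing.

```python
# Hypothetical alert rules: metric name, direction, limit, and message.
RULES = [
    {"metric": "free_ram_mb", "op": "below", "limit": 512, "message": "Low RAM"},
    {"metric": "cpu_percent", "op": "above", "limit": 90, "message": "CPU overload"},
    {"metric": "failed_logins", "op": "above", "limit": 5,
     "message": "Repeated failed logins"},
]


def evaluate(metrics, rules=RULES):
    """Return one alert string for every rule the metrics breach."""
    alerts = []
    for rule in rules:
        value = metrics.get(rule["metric"])
        if value is None:
            continue  # metric not reported, nothing to check
        if rule["op"] == "below":
            breached = value < rule["limit"]
        else:
            breached = value > rule["limit"]
        if breached:
            alerts.append(f'{rule["message"]}: {rule["metric"]}={value}')
    return alerts


def notify(alerts, recipients):
    """Stand-in for a real notification channel: prints each alert."""
    for alert in alerts:
        for person in recipients:
            print(f"To {person}: {alert}")


# Hypothetical snapshot of a server's metrics.
snapshot = {"free_ram_mb": 256, "cpu_percent": 40, "failed_logins": 9}
notify(evaluate(snapshot), ["sysadmin@example.com"])
```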
Scanning:
analyze a system using a vulnerability scanner – eg Nessus, OpenVAS, Qualys
checks the system for known vulnerabilities by comparing findings against the CVE database
code scans: of the source code of apps, eg SonarQube
Reporting:
generating either summaries or detailed reports based on the data collected.
needed for internal audit as well as compliance requirements
Archiving:
storing data for long term retention
can also use cloud storage, eg Amazon S3 or Google Cloud Storage
compliance may require all data retained for several years – hot or cold storage
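A rough sketch of a retention policy that decides between hot and cold storage by age. The 90-day hot window and the 7-year retention period are assumptions chosen for the example; actual values depend on the regulation that applies.

```python
from datetime import date

HOT_DAYS = 90              # assumed: recent data stays on fast storage
RETENTION_DAYS = 7 * 365   # assumed: keep everything for roughly 7 years


def storage_tier(created, today):
    """Return 'hot', 'cold', or 'expired' for data created on a given date."""
    age_days = (today - created).days
    if age_days <= HOT_DAYS:
        return "hot"
    if age_days <= RETENTION_DAYS:
        return "cold"
    return "expired"


# Hypothetical log archives and a fixed "today" so the result is repeatable.
today = date(2024, 6, 1)
for created in (date(2024, 5, 1), date(2022, 1, 1), date(2010, 1, 1)):
    print(created, storage_tier(created, today))
```

Data marked "expired" is past the assumed retention period; whether it may actually be deleted is a compliance decision, not something the code decides.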
Alert Response and Remediation or Validation:
taking actions in response to issues and alerts: investigating, escalating to another team
Remediation: patching, reconfiguring, modifying source code to rectify identified issues.
2 other actions:
quarantining: isolating a system from network to prevent a threat spreading
Alert tuning: adjusting the alert parameters to avoid false positives and reduce alert volume. eg if we get too many alerts about minor CPU spikes, we might retune the alert so that fewer are generated when the info isn't crucial
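One way to tune the CPU alert in the example above is to alert only when the threshold is exceeded for several consecutive samples, so brief spikes are ignored. This is a sketch of that single technique; the sample values, the 90% threshold, and the `sustained` parameter are assumptions for illustration.

```python
def count_alerts(samples, threshold, sustained=1):
    """Count alerts, raising one each time `sustained` consecutive
    samples exceed the threshold (one alert per continuous run)."""
    alerts = 0
    streak = 0
    for value in samples:
        if value > threshold:
            streak += 1
            if streak == sustained:
                alerts += 1
        else:
            streak = 0  # the run ended, start counting again
    return alerts


# Hypothetical CPU percentages: two brief spikes, then a sustained overload.
cpu = [40, 95, 42, 96, 41, 93, 94, 95, 97, 50]

print(count_alerts(cpu, threshold=90, sustained=1))  # untuned: every spike alerts
print(count_alerts(cpu, threshold=90, sustained=3))  # tuned: only the sustained load
```

The trade-off is that a stricter rule also delays or hides some real problems, so tuning should reduce false positives without creating false negatives.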