What is anomaly detection
Anomaly detection, also called outlier detection, finds unusual data points that deviate from normal patterns.
Intuition: learn what normal looks like from unlabeled data —> score how strange each new event is —> sort by score for analyst review
Why security teams use it
- Surface rare threats when labels are scarce or delayed
- Reduce noise by focusing analyst attention on the most unusual events
- Hunt at scale across logs, DNS, email, auth, processes
How it works
- Collect normal-ish history: past data assumed mostly benign
- Learn normal: fit a model of typical behavior or density
- Score new items: higher score = more unusual
- Threshold for action: pick a cutoff that matches analyst capacity
Output is usually a continuous score
score —> threshold —> action
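A minimal sketch of that loop, assuming scikit-learn and synthetic data standing in for real featurized telemetry:

```python
# Learn normal -> score -> threshold, assuming scikit-learn.
# X_hist / X_new are stand-ins for real featurized telemetry.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X_hist = rng.normal(size=(5000, 8))        # mostly-benign history
X_new = rng.normal(size=(1000, 8))         # next period to score
X_new[:5] += 6                             # inject a few obvious oddities

model = IsolationForest(n_estimators=200, random_state=0).fit(X_hist)
scores = -model.score_samples(X_new)       # higher = more unusual

k = 20                                     # analyst capacity for this batch
threshold = np.sort(scores)[-k]            # keep only the top k
alerts = np.flatnonzero(scores >= threshold)
print(f"{len(alerts)} events queued for review")
```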
Key algorithms
Isolation based
- Isolation Forest: randomly splits features and measures how quickly a point gets isolated. Rare points isolate fast —> high anomaly score.
- HBOS: histogram-based outlier score. Unlikely bins in independent histograms raise the score.
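HBOS is simple enough to sketch directly. A toy NumPy version of the idea (a library such as PyOD ships a full implementation):

```python
# Toy HBOS-style scoring: one histogram per feature on training data,
# then sum negative log bin probabilities at score time.
import numpy as np

def hbos_fit(X, bins=20):
    hists = []
    for j in range(X.shape[1]):
        counts, edges = np.histogram(X[:, j], bins=bins)
        probs = (counts + 1) / (counts.sum() + bins)     # Laplace smoothing
        hists.append((edges, probs))
    return hists

def hbos_score(X, hists):
    score = np.zeros(len(X))
    for j, (edges, probs) in enumerate(hists):
        # map each value to its bin; clip catches values past the edges
        idx = np.clip(np.digitize(X[:, j], edges[1:-1]), 0, len(probs) - 1)
        score += -np.log(probs[idx])                     # rare bin -> higher score
    return score
```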
Density based
- Local Outlier Factor: compares local density to that of neighbors. Much sparser than neighbors —> anomalous.
- Gaussian Mixture: models normal as a mix of Gaussians. Low likelihood under the model —> anomalous.
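Both density-based scores drop out of scikit-learn in a few lines (reusing X_hist and X_new from the first sketch):

```python
# Density-based scoring sketches, assuming scikit-learn and the
# X_hist / X_new arrays from the earlier example.
from sklearn.neighbors import LocalOutlierFactor
from sklearn.mixture import GaussianMixture

lof = LocalOutlierFactor(n_neighbors=35, novelty=True).fit(X_hist)
lof_scores = -lof.score_samples(X_new)      # locally sparser -> higher

gmm = GaussianMixture(n_components=4, random_state=0).fit(X_hist)
gmm_scores = -gmm.score_samples(X_new)      # low log-likelihood -> higher
```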
Boundary based
- One-Class SVM: learns a frontier that encloses normal. Points outside the frontier —> anomalous.
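A matching sketch for the frontier approach:

```python
# One-Class SVM frontier sketch (scikit-learn); nu upper-bounds the
# fraction of training points allowed to fall outside the frontier.
from sklearn.svm import OneClassSVM

oc = OneClassSVM(kernel="rbf", nu=0.01, gamma="scale").fit(X_hist)
oc_scores = -oc.score_samples(X_new)        # outside the frontier -> higher
```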
Reconstruction based
- PCA residual: project onto principal components and measure reconstruction error.
- Autoencoder: compress then reconstruct. Large error —> anomalous.
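The PCA residual variant is short enough to sketch; the autoencoder version follows the same fit, reconstruct, measure-error shape:

```python
# PCA residual sketch (scikit-learn): reconstruct each point from the
# top components learned on history; squared error is the anomaly score.
import numpy as np
from sklearn.decomposition import PCA

pca = PCA(n_components=0.95).fit(X_hist)    # keep 95% of the variance
recon = pca.inverse_transform(pca.transform(X_new))
pca_scores = np.square(X_new - recon).sum(axis=1)   # large error -> anomalous
```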
Time series quick wins
- Rolling z-score: seasonality-aware baselines per hour or weekday.
- Change point detection: catch sudden level or variance shifts.
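A pandas sketch of the seasonality-aware z-score, assuming hourly event counts:

```python
# Baseline each hour-of-day bin separately, so 3am traffic is judged
# against 3am history rather than the whole day.
import pandas as pd

def seasonal_zscore(counts: pd.Series) -> pd.Series:
    """counts: hourly event counts with a DatetimeIndex."""
    by_hour = counts.groupby(counts.index.hour)
    # shift(1) keeps the current point out of its own baseline
    mu = by_hour.transform(lambda s: s.shift(1).rolling(28, min_periods=7).mean())
    sd = by_hour.transform(lambda s: s.shift(1).rolling(28, min_periods=7).std())
    return (counts - mu) / sd               # large |z| -> unusual for that hour
```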
Security examples that click
- Login anomalies: geo velocity, device novelty, hour bins, failed attempts —> flag unusual combinations
- DNS exfil: request size ratio, domain length, n-gram scores, NXDOMAIN rate —> flag high-scoring flows (featurization sketched below)
- Process tree anomalies: rare parent-child pairs, command token rarity —> surface odd chains
- Email intake: character n-grams and URL structure features —> score unusual messages for deeper inspection
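As referenced above, a sketch of DNS featurization; the field names are hypothetical:

```python
# Illustrative DNS featurization (field names are assumptions): turn one
# query record into the kinds of numeric features listed above.
import math
from collections import Counter

def dns_features(qname: str, req_bytes: int, resp_bytes: int, nxdomain: bool) -> dict:
    label = qname.split(".")[0]
    counts = Counter(label)
    entropy = -sum((c / len(label)) * math.log2(c / len(label))
                   for c in counts.values()) if label else 0.0
    return {
        "domain_len": len(qname),
        "label_entropy": entropy,                      # high for encoded payloads
        "size_ratio": req_bytes / max(resp_bytes, 1),  # exfil skews upstream
        "nxdomain": int(nxdomain),
    }
```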
Feature engineering and scaling
- Scale numeric features: standardize so one field does not dominate distances
- Use ratios and rates per user or asset to normalize volume effects
- Bucket rare categories: keep the top frequent values and map the tail to "other"
- Time awareness: include hour of day, day of week, and rolling baselines
- Dimensionality reduction: PCA or TruncatedSVD before distance-based detectors (pipeline sketched below)
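One way to wire this advice together with scikit-learn (column names are assumptions; OneHotEncoder's min_frequency needs scikit-learn 1.1+):

```python
# Standardize numerics, fold rare category values into an infrequent
# bucket, then reduce dimensionality before a distance-based detector.
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric = ["bytes_per_req", "fail_rate", "hour_sin", "hour_cos"]  # assumed names
categorical = ["country", "device_type"]

pre = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(min_frequency=50,                 # tail -> "other"
                          handle_unknown="infrequent_if_exist"), categorical),
])
# n_components must stay below the encoded feature count
features = Pipeline([("pre", pre), ("svd", TruncatedSVD(n_components=20))])
# features.fit(train_df); Z = features.transform(new_df)
```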
Thresholds and operations
- Set a daily alert budget first, for example the top 500 events per day
- Choose a threshold that hits that budget on validation data (see the sketch below)
- Provide context with each alert: why it scored high, top contributing features
- Maintain a feedback loop: analysts mark alerts useful or not useful to recalibrate
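A sketch of budget-first thresholding: if analysts can triage roughly 500 alerts per day and the validation window covers 30 days, keep only the top 500 * 30 scores.

```python
# Derive the score cutoff from analyst capacity, not from the model.
import numpy as np

def budget_threshold(val_scores: np.ndarray, days: int, daily_budget: int) -> float:
    keep = min(days * daily_budget, len(val_scores))
    return float(np.sort(val_scores)[-keep])

# threshold = budget_threshold(scores_on_validation, days=30, daily_budget=500)
```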
Evaluation that matches reality
- Use time-aware splits: fit on the past —> score the next period
- Label a small sample of top-scored items weekly for truth checks
- Track precision at k: how many of the top k are truly interesting (see the sketch below)
- Plot PR curves once you have enough spot labels
- Measure time to detect and analyst effort saved
- Slice results by user, tenant, sender, and asset class to find blind spots
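Precision at k is a few lines once you have spot labels (sketch):

```python
# Of the k highest-scored events, what fraction did analysts mark as
# truly interesting? labels: 1 = interesting, 0 = not.
import numpy as np

def precision_at_k(scores: np.ndarray, labels: np.ndarray, k: int) -> float:
    top = np.argsort(scores)[-k:]           # indices of the k highest scores
    return float(labels[top].mean())
```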
Practical workflow
- Define scope: which stream, and what volume you can triage
- Assemble features: human-readable first, then enrich
- Split by time: a training window for learning normal, the next window for tuning (sketched below)
- Run a small grid: Isolation Forest, LOF, One-Class SVM, PCA residual, Autoencoder
- Pick the threshold that meets the alert budget with the best precision at k
- Ship to trial: include top reasons and raw evidence
- Monitor weekly: drift in features, score distribution, and precision
- Retrain on a cadence: a rolling window to track changing normal
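A sketch of the split-and-grid step, assuming X is a feature matrix sorted by event time and fit_end / tune_end are chosen cut-off indices:

```python
# Fit each candidate on the past window, score the next window, and keep
# whichever detector/threshold pair gives the best precision@k at budget.
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

X_fit, X_tune = X[:fit_end], X[fit_end:tune_end]   # assumed cut-offs
detectors = {
    "iforest": IsolationForest(n_estimators=300, random_state=0),
    "lof": LocalOutlierFactor(n_neighbors=35, novelty=True),
    "ocsvm": OneClassSVM(nu=0.01, gamma="scale"),
}
tune_scores = {name: -det.fit(X_fit).score_samples(X_tune)
               for name, det in detectors.items()}
```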
Pitfalls and fixes
- Concept drift: normal changes over time
  fix —> rolling retrains, windowed baselines, monitor precision weekly
- High-dimensional distance failure: everything looks equally distant
  fix —> PCA or feature selection before distance-based methods
- Scale sensitivity: a single large-ranged feature dominates
  fix —> standardize and sanity-check ranges at inference
- Too many alerts: threshold set too low
  fix —> start with a strict top k, widen only if precision stays high
- Contamination in training: many hidden anomalies in the training window
  fix —> robust methods (Isolation Forest), robust scalers, clean periods where possible
- Black-box scores: analysts need reasons
  fix —> include top contributing features or nearest neighbors for context (see the sketch below)
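A simple way to generate those reasons (sketch): rank features by how many training standard deviations the event sits from the training mean.

```python
# Per-alert context: which features pushed this event away from normal?
import numpy as np

def top_features(x, train_mu, train_sd, names, n=3):
    z = (x - train_mu) / np.where(train_sd > 0, train_sd, 1.0)
    order = np.argsort(-np.abs(z))[:n]      # largest absolute deviations first
    return [(names[i], round(float(z[i]), 2)) for i in order]
```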
Common hyperparameters
- Isolation Forest: n_estimators, max_samples, max_features, contamination (expected anomaly rate)
- Local Outlier Factor: n_neighbors, contamination
- One-Class SVM: kernel, nu (expected outlier fraction), gamma (for RBF)
- PCA residual: n_components (retained variance target)
- Autoencoder: depth and width, dropout, training epochs, reconstruction loss
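An illustrative starting grid over these knobs (values are assumptions, not recommendations; tune against precision at k on your own data):

```python
# Starting points only; real values depend on data volume and dimension.
param_grid = {
    "iforest": {"n_estimators": [100, 300], "max_samples": [256, 1024]},
    "lof":     {"n_neighbors": [20, 35, 50]},
    "ocsvm":   {"nu": [0.005, 0.01, 0.05], "gamma": ["scale", 0.1]},
    "pca":     {"n_components": [0.90, 0.95, 0.99]},
}
```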
Security focused testing checklist
- Lock scalers and compressors to the training window; reuse them for scoring
- Verify ranges and units at inference to guard against bad inputs
- Run a top k review and record precision each week
- Check drift: feature histograms, score quantiles, cluster composition (see the sketch below)
- Add context fields per alert: top features, nearest normal neighbors, example peers
- Backstop with simple rules to catch obvious misses and cap blast radius
- Keep a holdout period for periodic unbiased checks
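A sketch of the score-quantile drift check:

```python
# Compare this week's score quantiles to the training window's; a large
# shift suggests the baseline has gone stale and a retrain is due.
import numpy as np

def score_drift(train_scores, week_scores, qs=(0.5, 0.9, 0.99)):
    shift = np.quantile(week_scores, qs) - np.quantile(train_scores, qs)
    return {f"q{int(q * 100)}": float(d) for q, d in zip(qs, shift)}
```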
Threats and mitigations
- Adversarial blending: attacker mimics common patterns to hide in the crowd
  - mitigate —> richer features that are hard to fake, per-cohort baselines, supervised backstops
- Poisoning normal: injected artifacts shift what looks normal
  - mitigate —> change control on labels and training windows, robust estimators, outlier filtering
- Feature manipulation: attacker changes the fields they control
  - mitigate —> combine independent signals; include environment-side features (domain age, ASN, cert fingerprints)
Takeaways
- Anomaly detection turns unlabeled streams into ranked leads
- Start simple: Isolation Forest or PCA residual with time-aware thresholds
- Measure precision at k, keep alerts explainable, and retrain on a rolling window
- Treat anomalies as investigation cues, not verdicts, backed by supervised or rule-based checks