Data Retention Policy

30-day retention window · Two-layer architecture

Overview

SikkerAPI processes two distinct categories of threat intelligence data, each with its own retention mechanism. Both operate on a 30-day window, but they work differently:

Layer	Retention	Mechanism
Raw event telemetry	30 days from event time	Automatic chunk expiry
IP reputation data	30 days of inactivity	Daily scheduled purge

The key distinction: raw telemetry drops by age (30 days after it was recorded), while reputation data drops by inactivity (30 days since the IP was last seen by any source). An IP that remains active will keep its reputation indefinitely, even as the underlying raw events expire.

Data Processed

IP addresses + timestamps + attack patterns

No usernames, passwords, payloads, or PII beyond IP

Why two layers?

Raw telemetry is granular per-event data used internally for pattern matching and analysis. It's expensive to store and not directly useful to API consumers.

Reputation data is the pre-aggregated result — what the API actually returns. It's compact, fast to query, and contains everything needed for confidence level calculation.

Raw Event Telemetry

Raw telemetry is the per-event data captured by our honeypot sensor network. This includes individual sessions, commands, authentication attempts, queries, file operations, and behavioral pattern matches across all 16 monitored protocols.

This data is internal — customers never see it directly through the API. It's stored in time-partitioned tables (one partition per day) and automatically dropped after 30 days by partition age.

Data Type	Examples
Sessions	SSH connections, HTTP requests, database logins
Commands	Shell commands executed, SQL queries run
Auth attempts	Credential pairs tried against each protocol
File operations	Files written, transferred, or accessed
Pattern matches	Behavioral primitives and attack patterns identified
Community reports	Individual reports submitted by contributors

Dropping raw telemetry has no effect on the data returned by the API. All relevant information has already been aggregated into the IP reputation layer at the time it was recorded.

Retention Rule

Dropped by partition age: 30 days after event timestamp

Runs daily via database background job

Example: Telemetry lifecycle

Jan 15: SSH session recorded (raw event stored)
Jan 15: Reputation row updated with aggregated data

Feb 14: Raw event partition expires and is dropped
Feb 14: Reputation row unchanged (data was pre-aggregated at write time)
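
The lifecycle above can be sketched in Python. This is a minimal illustration, assuming one partition per UTC calendar day identified by its date, and assuming a partition is dropped once its date is 30 or more days before the run date (which matches the Jan 15 to Feb 14 example). The function name `partitions_to_drop` and the year used are hypothetical; the real job runs inside the database.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # retention window stated in this policy


def partitions_to_drop(partition_days: list[date], today: date) -> list[date]:
    """Return the daily partitions that are at least 30 days old as of `today`."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return sorted(day for day in partition_days if day <= cutoff)


# Partitions for Jan 14 through Jan 16, evaluated by the daily job on Feb 14.
partitions = [date(2025, 1, 14), date(2025, 1, 15), date(2025, 1, 16)]
expired = partitions_to_drop(partitions, today=date(2025, 2, 14))
print(expired)  # the Jan 14 and Jan 15 partitions
```

The Jan 16 partition survives this run and would be dropped by the next day's run.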

IP Reputation Data

IP reputation is the customer-facing data layer. Each IP address has a single row containing pre-aggregated counters, behavioral profiles, protocol breakdowns, geolocation, and a computed confidence level. This is what the API returns when you query an IP.

All data is aggregated at write time. When a sensor observes a session or a contributor submits a report, the reputation row is updated immediately — counters are incremented, behavior maps are merged, and the confidence level is recalculated. The raw events can then expire without any loss to the reputation data.
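
Write-time aggregation might look roughly like the following Python sketch. The `ReputationRow` class, the `record_session` function, the in-memory dictionary standing in for the reputation table, and the pattern names are all hypothetical; only the idea of incrementing counters and merging behavior maps on every event comes from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ReputationRow:
    """Hypothetical in-memory stand-in for one reputation row."""
    ip: str
    first_seen: datetime
    last_seen: datetime
    protocols: dict[str, int] = field(default_factory=dict)  # sessions per protocol
    behaviors: dict[str, int] = field(default_factory=dict)  # matches per pattern


def record_session(rows, ip, protocol, patterns, seen_at):
    """Fold one observed session into the IP's row at write time."""
    row = rows.get(ip)
    if row is None:
        row = rows[ip] = ReputationRow(ip=ip, first_seen=seen_at, last_seen=seen_at)
    row.last_seen = max(row.last_seen, seen_at)
    row.protocols[protocol] = row.protocols.get(protocol, 0) + 1
    for pattern in patterns:
        row.behaviors[pattern] = row.behaviors.get(pattern, 0) + 1
    # The confidence level would be recalculated here; that algorithm is
    # documented separately and is deliberately not reproduced in this sketch.
    return row


rows: dict[str, ReputationRow] = {}
t1 = datetime(2025, 1, 15, 9, 0, tzinfo=timezone.utc)
t2 = datetime(2025, 1, 15, 9, 5, tzinfo=timezone.utc)
record_session(rows, "203.0.113.42", "ssh", ["brute_force"], t1)
row = record_session(rows, "203.0.113.42", "ssh", ["brute_force", "recon"], t2)
print(row.protocols, row.behaviors)  # {'ssh': 2} {'brute_force': 2, 'recon': 1}
```

Because the row already holds the totals, nothing in this flow needs to read the raw events again after they are written.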

What's stored per IP

Field	Description
confidenceLevel	Computed 0-100 score (see algorithm)
protocols	Per-protocol session and event counts
behaviors	Matched attack patterns with severity and count
firstSeen / lastSeen	Timestamp range of sensor activity
reports	Total reports, unique reporters, category breakdown
geo	Country, city, ASN, Tor/proxy flags

Purge condition

A reputation row is deleted only when both of the following are true:

• No sensor has observed the IP in the last 30 days
• No contributor has reported the IP in the last 30 days

If either source remains active, the row survives. An IP that is still being attacked or still being reported will keep its full reputation history.

Purge Rule

DELETE WHERE sensor_last_seen < cutoff

AND contributor_last_reported < cutoff

cutoff = now - 30 days

Example: Active IP

IP 203.0.113.42 has been attacking for 60 days.

sensor_last_seen = today (still active)
contributor_last_reported = 45 days ago

Sensor timestamp is within 30 days
Row survives — full history preserved

Example: Inactive IP

IP 198.51.100.7 went silent 35 days ago.

sensor_last_seen = 35 days ago
contributor_last_reported = 40 days ago

Both timestamps exceed 30-day cutoff
Row purged at next daily run (03:00 UTC)

If the IP returns later, a fresh row is created from the first new event.
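
The purge condition and the two examples above can be checked with a short Python sketch. The function name `should_purge` and the treatment of a missing timestamp (a source that has never seen the IP counts as inactive) are assumptions made for illustration; the policy text does not say how missing values are handled.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=30)  # retention window stated in this policy


def should_purge(sensor_last_seen: Optional[datetime],
                 contributor_last_reported: Optional[datetime],
                 now: datetime) -> bool:
    """True only when every source has been silent for more than 30 days."""
    cutoff = now - RETENTION

    def stale(ts: Optional[datetime]) -> bool:
        # Assumption: a source that never saw the IP (None) counts as inactive.
        return ts is None or ts < cutoff

    return stale(sensor_last_seen) and stale(contributor_last_reported)


now = datetime(2025, 3, 1, 3, 0, tzinfo=timezone.utc)


def days_ago(n: int) -> datetime:
    return now - timedelta(days=n)


active = should_purge(days_ago(0), days_ago(45), now)     # 203.0.113.42 example
inactive = should_purge(days_ago(35), days_ago(40), now)  # 198.51.100.7 example
print(active, inactive)  # False True
```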

Effect on Confidence Levels

The confidence level algorithm calculates scores entirely from the pre-aggregated fields on the reputation row. It does not read from the raw event tables.

This means:

• Scores do not degrade when raw telemetry expires. The aggregated behavior counts, protocol breakdowns, and session totals remain on the reputation row.
• Scores only disappear when the entire reputation row is purged (30 days of inactivity from all sources).
• Scores are recalculated on every new event or report, incorporating the latest evidence alongside all previously aggregated data.

Recency filtering

The confidence level represents total evidence strength without time decay. For recency-sensitive use cases, the API provides firstSeen and lastSeen timestamps so you can apply your own filtering logic.

See Confidence Level Algorithm §05 for details on recency via timestamps.
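
As one way to apply recency filtering on the client side, the Python sketch below combines the score with `lastSeen`. The field names `confidenceLevel` and `lastSeen` come from the table of stored fields; the dictionary shape, the ISO 8601 timestamp format, the function name `is_recent_threat`, and the thresholds (75 and 7 days) are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_recent_threat(record: dict, now: datetime,
                     min_confidence: int = 75,
                     max_age: timedelta = timedelta(days=7)) -> bool:
    """Apply a caller-chosen recency window on top of the undecayed score."""
    # Assumption: lastSeen is an ISO 8601 string with an explicit UTC offset.
    last_seen = datetime.fromisoformat(record["lastSeen"])
    return (record["confidenceLevel"] >= min_confidence
            and now - last_seen <= max_age)


now = datetime(2025, 3, 1, tzinfo=timezone.utc)
fresh = {"confidenceLevel": 85, "lastSeen": "2025-02-27T10:00:00+00:00"}
stale = {"confidenceLevel": 85, "lastSeen": "2025-02-05T10:00:00+00:00"}

print(is_recent_threat(fresh, now))  # True
print(is_recent_threat(stale, now))  # False
```

Both records carry the same score; only the caller's own recency window separates them, which is the behavior the section describes.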

Score stability timeline

Day 1: IP attacks via SSH and HTTP
→ Reputation row created, score = 72

Day 15: More brute-force activity
→ Score recalculated to 85

Day 31: Raw telemetry from Day 1 expires
→ Score unchanged at 85
(aggregated data persists on row)

Day 45: No activity since Day 15 (30 days)
→ Score still 85, lastSeen = Day 15

Day 46: No activity for 31 days
→ Row purged by daily retention job
→ API returns "not found" for this IP
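
To see why the purge in this timeline lands one day after the 30-day mark, the following Python sketch computes the first daily run that would remove a row. It assumes the job runs once per day at 03:00 UTC (the time given in the inactive-IP example) and uses the strict comparison from the purge rule; the function name `first_purge_run` and the sample date are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)
PURGE_HOUR_UTC = 3  # daily run time given in the inactive-IP example


def first_purge_run(last_activity: datetime) -> datetime:
    """Earliest daily run at which last_activity is older than the cutoff."""
    threshold = last_activity.astimezone(timezone.utc) + RETENTION
    run = threshold.replace(hour=PURGE_HOUR_UTC, minute=0, second=0, microsecond=0)
    if run <= threshold:  # purge needs last_activity < run - 30 days
        run += timedelta(days=1)
    return run


last_activity = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
print(first_purge_run(last_activity))  # 2025-02-15 03:00:00+00:00
```

Activity at noon on one day is not yet older than the cutoff at 03:00 thirty days later, so the row is removed by the run on the following day.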

After Purge

When an IP's reputation row is purged, the IP is no longer present in the system. API lookups return no data, and the IP is excluded from blacklists and threat feeds.

If the IP resumes suspicious activity after purge, a fresh reputation row is created from the first new event. The previous history is not recovered — the IP starts with a clean slate and its confidence level rebuilds from new evidence only.

This is by design. An IP that went silent for 30+ days and returns may have changed hands, been remediated, or shifted to a different actor. Fresh evidence is more accurate than stale history.

Why not preserve history?

IP addresses are not permanent identities. Dynamic allocation, NAT, VPNs, and hosting provider churn mean the entity behind an IP can change at any time.

Carrying forward stale reputation would penalize new occupants of an address for the actions of previous ones. The 30-day window ensures that reputation reflects current activity, not historical artifacts.

IP Removal Requests

If your IP address appears in SikkerAPI and you believe it has been listed in error, or you have remediated the underlying issue, you can request removal.

Submit a request at sikkerapi.com/removeme. Approved requests delist the IP immediately and apply a grace period during which new sensor activity does not re-list the IP.

Delisted IPs are still subject to the standard 30-day retention purge. If the IP has no new activity from any source for 30 days, the row is permanently deleted regardless of delist status.

What delisting does

An approved request removes the IP from public API results, blacklists, and threat feeds immediately.

A grace period prevents automatic re-listing from new sensor activity, giving you time to remediate.

After 30 days of inactivity the row is permanently deleted regardless of delist status.
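
One possible reading of the delisting rules is sketched below in Python. Everything here is an assumption made for illustration: the `Row` class, the `delisted_at` field, the `is_publicly_listed` function, the 7-day grace value (this policy does not state the length of the grace period), and the interpretation that sensor activity after the grace period re-lists the IP.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Row:
    sensor_last_seen: datetime
    delisted_at: Optional[datetime] = None  # hypothetical field


def is_publicly_listed(row: Row, grace: timedelta) -> bool:
    """A delisted IP stays hidden unless sensors see it after the grace period."""
    if row.delisted_at is None:
        return True
    return row.sensor_last_seen > row.delisted_at + grace


GRACE = timedelta(days=7)  # placeholder value, not taken from the policy
delisted = datetime(2025, 3, 1, tzinfo=timezone.utc)

during_grace = Row(sensor_last_seen=delisted + timedelta(days=2), delisted_at=delisted)
after_grace = Row(sensor_last_seen=delisted + timedelta(days=19), delisted_at=delisted)

print(is_publicly_listed(during_grace, GRACE))  # False
print(is_publicly_listed(after_grace, GRACE))   # True
```

The 30-day inactivity purge is independent of this check and would still delete the row once all sources have been silent for 30 days.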

Legal Basis

Processing of IP addresses for threat intelligence is conducted under Legitimate Interest (GDPR Article 6(1)(f)), supported by Recital 49 of the GDPR, which recognizes processing that is “strictly necessary and proportionate for the purposes of ensuring network and information security” as a legitimate interest of the data controller.

Recital 49 explicitly names preventing unauthorized access to networks, preventing the distribution of malicious code, and stopping denial-of-service attacks as examples, all of which are core to what our honeypot sensors detect and report.

Data minimization is enforced through the 30-day retention window. Only the minimum data necessary for threat assessment is stored: IP addresses, timestamps, and attack pattern metadata. No payloads, no log contents, and no user account information.

For full details, see the Privacy Policy.

GDPR Recital 49

“The processing of personal data to the extent strictly necessary and proportionate for the purposes of ensuring network and information security [...] constitutes a legitimate interest of the data controller concerned.”

“This could, for example, include preventing unauthorised access to electronic communications networks and malicious code distribution and stopping ‘denial of service’ attacks.”