I started by enabling Azure Activity Logs at the subscription level and routing them through Diagnostic Settings into an Event Hub. This immediately highlighted how different control plane logs are from application logs, both in structure and in intent.
For application visibility, I configured App Service logging and sent those logs through a separate Event Hub. This separation was intentional. I wanted to be able to reason about control plane activity independently from web traffic noise.
To move data into Splunk, I built an Azure Function that consumes events from the Event Hub and forwards them to Splunk using the HTTP Event Collector. I intentionally kept the function simple so failures would be obvious.
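For anyone curious what that shape looks like, here is a minimal sketch in Python, assuming the Azure Functions v2 programming model and app settings named EventHubConnection, SPLUNK_HEC_URL, and SPLUNK_HEC_TOKEN (those names, the hub name, and the sourcetype are illustrative, not my exact configuration):

```python
import json
import os
import urllib.request

import azure.functions as func

app = func.FunctionApp()

@app.event_hub_message_trigger(
    arg_name="event",
    event_hub_name="activity-logs",    # illustrative hub name
    connection="EventHubConnection",   # app setting holding the Event Hub connection string
)
def forward_to_splunk(event: func.EventHubEvent) -> None:
    # Decode the raw Event Hub message and wrap it in the HEC envelope.
    body = event.get_body().decode("utf-8")
    payload = json.dumps({"event": json.loads(body), "sourcetype": "azure:activity"})

    req = urllib.request.Request(
        os.environ["SPLUNK_HEC_URL"],  # e.g. https://<splunk-host>:8088/services/collector/event
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Splunk {os.environ['SPLUNK_HEC_TOKEN']}",
            "Content-Type": "application/json",
        },
    )
    # Let failures raise so they surface in the Function's logs rather than being swallowed.
    urllib.request.urlopen(req, timeout=10)
```

Letting the HTTP call raise on failure, rather than catching and retrying, was part of keeping problems obvious.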
During testing, I used ngrok to expose the function endpoint and inspect raw payloads. This step caught several issues early, including unexpected nesting, missing fields, and incorrect assumptions I had made about event structure.
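A throwaway receiver along these lines, tunnelled through ngrok, is one way to capture payloads exactly as they arrive (this is a generic sketch rather than the exact tooling I ran):

```python
import http.server

class DumpHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # Print the raw body exactly as it arrives, before any parsing.
        length = int(self.headers.get("Content-Length", 0))
        print(self.rfile.read(length).decode("utf-8", errors="replace"))
        self.send_response(200)
        self.end_headers()

# Run locally, then expose it with: ngrok http 8080
if __name__ == "__main__":
    http.server.HTTPServer(("", 8080), DumpHandler).serve_forever()
```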
One of the biggest adjustments was realizing that not all Azure logs arrive as single events. Some arrive as arrays of records, which meant my function and my searches had to explicitly handle that structure.
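In the function, that meant unwrapping the records array when it is present instead of assuming one event per message. A small helper along these lines captures the idea (the helper name is mine, but the records key is what Azure diagnostic exports use):

```python
import json

def split_events(raw: str) -> list[dict]:
    """Return individual events whether the payload is a single event or a records batch."""
    parsed = json.loads(raw)
    if isinstance(parsed, dict) and isinstance(parsed.get("records"), list):
        return parsed["records"]   # diagnostic-style batch: {"records": [...]}
    if isinstance(parsed, list):
        return parsed              # already a bare array of events
    return [parsed]                # a single event

# Each item can then be sent to HEC as its own {"event": ...} payload,
# so Splunk sees one event per record rather than one per batch.
```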
Once data was reliably landing in Splunk, I focused on search-time normalization rather than perfect ingestion-time parsing. This made it easier to iterate and prevented early schema decisions from blocking progress.
From there, I built dashboards that answered very specific questions. Is data flowing? Who is making control plane changes? Which operations are failing? Are there patterns that look like misuse rather than normal administration?
Only after the dashboards made sense did I create alerts. The alerts were not the goal. They were the proof that the pipeline supported real detection logic.