Automating Custom Log Ingestion into Microsoft Sentinel with Azure DevOps (Part 2)

In my last blog post, I covered how to set up the Data Collection Endpoint (DCE), create a custom table, and parse Apache logs into JSON format so they can be uploaded to Sentinel via the Logs Ingestion API. In this post, we’ll build a pipeline in Azure DevOps that automates the entire process.

Retrieving Required API Information

Before we can make API calls to upload data, we need to gather some important information from the resources we created in Part 1.

Getting the DCR Immutable ID

The Data Collection Rule (DCR) Immutable ID is a unique identifier that never changes, even if you rename the DCR. This is what the API uses to route your data.

To find it:

  • Navigate to the Azure portal
  • Go to your Log Analytics Workspace
  • Click Tables in the left pane under Settings
  • Find your custom table (the one you created in Part 1)
  • Click the three dots next to the table name and select ‘Manage Table’
  • In the overview pane, you’ll see the associated Data Collection Rule
  • Click the DCR name to open it
  • In the DCR overview, look for the immutableId field and copy its value

The immutable ID will look something like: dcr-a1b2c3d4************************
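
If you prefer the command line, the same value can be read with the Azure CLI (run from PowerShell). This is a sketch, not the only way to do it: it assumes you are signed in with az login, may prompt to install the monitor-control-service extension, and uses placeholder resource group and DCR names:

# Optional: read the DCR immutable ID with the Azure CLI instead of the portal
az monitor data-collection rule show `
    --resource-group "<your-resource-group>" `
    --name "<your-dcr-name>" `
    --query "immutableId" `
    --output tsv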

Confirming the DCE Endpoint

You should already have this from Part 1, but to verify:

  • Navigate to Data Collection Endpoints in Azure portal
  • Click on your DCE
  • Go to JSON View
  • Look for logsIngestion under the properties section
  • Copy the endpoint URL

It will look like: https://your-dce-name.australiasoutheast-1.ingest.monitor.azure.com
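
The same value is also available from the CLI; a sketch with placeholder names, analogous to the DCR query above:

# Optional: read the logs ingestion endpoint from the DCE resource
az monitor data-collection endpoint show `
    --resource-group "<your-resource-group>" `
    --name "<your-dce-name>" `
    --query "logsIngestion.endpoint" `
    --output tsv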

Getting the Stream Name

The stream name is derived from your custom table name. If your table is called CustomApacheLogs_CL, then your stream name is simply CustomApacheLogs_CL.

You can verify this:

  • Go to your DCR in Azure portal
  • Click on JSON View
  • Look under dataFlows > streams
  • You’ll see something like Custom-CustomApacheLogs_CL

Note: When making API calls, you’ll use the Custom- prefix with your stream name.
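
If you want to confirm the exact stream name (including the Custom- prefix) without opening JSON View, a quick CLI query against the DCR’s dataFlows also works; a sketch using the same placeholder names as above:

# Optional: list the stream names defined in the DCR's dataFlows section
az monitor data-collection rule show `
    --resource-group "<your-resource-group>" `
    --name "<your-dcr-name>" `
    --query "dataFlows[].streams[]" `
    --output tsv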

Putting it all together

The ingestion URL is built by combining the base endpoint with the DCR immutable ID and stream name, then appending the API version. This creates the full path your client uses to send logs into Azure Monitor. Let me break down how the URL is constructed using the values from our setup:

{DCE_ENDPOINT}/dataCollectionRules/{DCR_IMMUTABLE_ID}/streams/{STREAM_NAME}?api-version={VERSION}

Example breakdown:

  • Base DCE endpoint: https://your-dce-name.australiasoutheast-1.ingest.monitor.azure.com
  • DCR immutable ID: dcr-a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6
  • Stream name: Custom-CustomApacheLogs_CL
  • API version: api-version=2023-01-01
https://your-dce-name.australiasoutheast-1.ingest.monitor.azure.com/dataCollectionRules/dcr-a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6/streams/Custom-CustomApacheLogs_CL?api-version=2023-01-01
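
In PowerShell, assembling the URL from these pieces looks like this; a minimal sketch using the example values above (substitute your own):

# Build the full ingestion URL from its parts
$DceEndpoint    = "https://your-dce-name.australiasoutheast-1.ingest.monitor.azure.com"
$DcrImmutableId = "dcr-a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6"
$StreamName     = "Custom-CustomApacheLogs_CL"
$ApiVersion     = "2023-01-01"

$IngestionUrl = "$DceEndpoint/dataCollectionRules/$DcrImmutableId/streams/$($StreamName)?api-version=$ApiVersion"
Write-Host $IngestionUrl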

Setting Up Service Principal Authentication

The Logs Ingestion API requires authentication. We’ll use a service principal (App Registration) for this.

Create an App Registration

  • In Azure portal, navigate to Microsoft Entra ID
  • Click App registrations in the left pane under Manage
  • Click New registration
  • Give it a name (e.g., “SentinelLogIngestion”)
  • Leave the other options as default and click Register

Once created, note down:

  • Application (client) ID - you’ll see this on the overview page
  • Directory (tenant) ID - also on the overview page

Create a Client Secret

  • In your app registration, click Certificates & secrets
  • Click New client secret
  • Add a description and select expiration period
  • Click Add
  • Important: Copy the secret value immediately. You won’t be able to see it again.
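
If you prefer scripting, the app registration, service principal, and client secret can also be created in a single Azure CLI call. This is an optional sketch: in the output, password is your client secret, and appId/tenant map to the client ID and tenant ID noted above.

# Optional: create the app registration, service principal, and secret in one step
az ad sp create-for-rbac --name "SentinelLogIngestion"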

Assign Permissions to the DCR

The service principal needs permission to push data to your DCR.

  • Navigate to your Data Collection Rule
  • Click Access control (IAM) in the left pane
  • Click Add > Add role assignment
  • Search for and select Monitoring Metrics Publisher
  • Click Next
  • Select User, group, or service principal
  • Click Select members
  • Search for your app registration name
  • Select it and click Select
  • Click Review + assign
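
The same role assignment can be scripted if you prefer. A sketch with placeholder IDs, where <app-client-id> is the Application (client) ID noted earlier and the scope is the full resource ID of your DCR:

# Optional: assign Monitoring Metrics Publisher on the DCR from the CLI
az role assignment create `
    --assignee "<app-client-id>" `
    --role "Monitoring Metrics Publisher" `
    --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Insights/dataCollectionRules/<your-dcr-name>"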

Testing the API Manually with PowerShell

Before building the full pipeline, it’s good practice to test the API call manually. Here’s a PowerShell script to upload your JSON data to Sentinel:

# Configuration Variables - Update these with your values
$TenantId = "your-tenant-id"
$ClientId = "your-client-id"
$ClientSecret = "your-client-secret"
$DcrImmutableId = "dcr-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
$DceEndpoint = "https://your-dce-name.eastus-1.ingest.monitor.azure.com"
$StreamName = "CustomApacheLogs_CL"
$BatchSize = 100
$InputJsonPath = "C:\path\to\parsed_apache_logs.json"

# Function to get OAuth token
function Get-AuthToken {
    try {
        $body = @{
            grant_type    = "client_credentials"
            client_id     = $ClientId
            client_secret = $ClientSecret
            scope         = "https://monitor.azure.com/.default"
        }
        
        $headers = @{
            'Content-Type' = 'application/x-www-form-urlencoded'
        }
        
        $uri = "https://login.microsoftonline.com/$TenantId/oauth2/v2.0/token"
        
        Write-Host "Getting authentication token..." -ForegroundColor Yellow
        $response = Invoke-RestMethod -Uri $uri -Method Post -Body $body -Headers $headers
        
        Write-Host "Authentication successful" -ForegroundColor Green
        return $response.access_token
    }
    catch {
        Write-Error "Failed to get authentication token: $($_.Exception.Message)"
        throw
    }
}

# Function to send data to Sentinel
function Send-DataToSentinel {
    param(
        [array]$Data,
        [string]$AccessToken
    )
    
    try {
        $headers = @{
            'Authorization' = "Bearer $AccessToken"
            'Content-Type'  = 'application/json'
        }
        
        $uri = "$DceEndpoint/dataCollectionRules/$DcrImmutableId/streams/Custom-$StreamName" + "?api-version=2023-01-01"
        
        $jsonPayload = $Data | ConvertTo-Json -Depth 10 -Compress
        
        $response = Invoke-RestMethod -Uri $uri -Method Post -Body $jsonPayload -Headers $headers
        
        return $response
    }
    catch {
        Write-Error "Failed to send data batch: $($_.Exception.Message)"
        if ($_.Exception.Response) {
            $reader = New-Object System.IO.StreamReader($_.Exception.Response.GetResponseStream())
            $responseBody = $reader.ReadToEnd()
            Write-Error "Response: $responseBody"
        }
        throw
    }
}

# Function to upload JSON to Sentinel
function Upload-JsonToSentinel {
    Write-Host "Starting upload to Microsoft Sentinel..." -ForegroundColor Cyan
    
    # Check if JSON file exists
    if (-not (Test-Path $InputJsonPath)) {
        throw "JSON file not found: $InputJsonPath"
    }
    
    # Get authentication token
    $accessToken = Get-AuthToken
    
    # Read JSON file
    Write-Host "Reading JSON file: $InputJsonPath" -ForegroundColor Yellow
    $jsonContent = Get-Content $InputJsonPath -Raw
    $parsedLogs = $jsonContent | ConvertFrom-Json
    
    Write-Host "Loaded $($parsedLogs.Count) log entries from JSON" -ForegroundColor Green
    
    # Process logs in batches
    $totalProcessed = 0
    $totalBatches = [Math]::Ceiling($parsedLogs.Count / $BatchSize)
    $currentBatch = 0
    
    for ($i = 0; $i -lt $parsedLogs.Count; $i += $BatchSize) {
        $currentBatch++
        $endIndex = [Math]::Min($i + $BatchSize - 1, $parsedLogs.Count - 1)
        $batch = $parsedLogs[$i..$endIndex]
        
        Write-Host "Sending batch $currentBatch of $totalBatches ($($batch.Count) entries)..." -ForegroundColor Yellow
        
        # Send batch to Sentinel
        $response = Send-DataToSentinel -Data $batch -AccessToken $accessToken
        
        $totalProcessed += $batch.Count
        Write-Host "Batch $currentBatch sent successfully" -ForegroundColor Green
        
        # Small delay between batches to avoid throttling
        if ($currentBatch -lt $totalBatches) {
            Start-Sleep -Milliseconds 500
        }
    }
    
    Write-Host "Upload completed successfully!" -ForegroundColor Green
    Write-Host "Total entries uploaded: $totalProcessed" -ForegroundColor Green
    Write-Host "Total batches sent: $currentBatch" -ForegroundColor Green
}

# Main execution
try {
    Write-Host "JSON Upload to Sentinel Script" -ForegroundColor Cyan
    Write-Host "==============================" -ForegroundColor Cyan
    Write-Host ""
    
    Upload-JsonToSentinel
    
    Write-Host ""
    Write-Host "Upload Summary:" -ForegroundColor Green
    Write-Host "- JSON file: $InputJsonPath" -ForegroundColor White
    Write-Host "- Batch size: $BatchSize entries" -ForegroundColor White
    Write-Host "- Target: Microsoft Sentinel" -ForegroundColor White
}
catch {
    Write-Host ""
    Write-Host "Error occurred:" -ForegroundColor Red
    Write-Host $_.Exception.Message -ForegroundColor Red
    exit 1
}

Run this script after replacing the configuration values with your own. If successful, you should see your data appear in the custom table within a few minutes.

You can verify by running this KQL query in your Log Analytics Workspace:

CustomApacheLogs_CL
| take 10

Building the Azure DevOps Pipeline

Now that we’ve tested the API manually, let’s automate the entire process with Azure DevOps. The pipeline will have two jobs: one to convert the logs to JSON, and another to upload them to Sentinel.

Prerequisites

Before creating the pipeline, make sure you have:

  • An Azure DevOps organization and project
  • Your Apache log files ready to be added to the repository
  • The Python script from Part 1 (saved as convert-to-json.py)

Setting Up Variable Groups

Variable groups allow us to securely store sensitive information like client secrets.

  • In Azure DevOps, go to your project
  • Click Pipelines > Library
  • Click + Variable group
  • Name it: Custom Log Ingestion Variable Group
  • Add the following variables:
| Variable Name  | Value                                       | Secure                    |
| -------------- | ------------------------------------------- | ------------------------- |
| TenantId       | Your tenant ID                              | No                        |
| ClientId       | Your app registration client ID             | No                        |
| ClientSecret   | Your client secret                          | Yes (click the lock icon) |
| DcrImmutableId | Your DCR immutable ID                       | No                        |
| DceEndpoint    | Your DCE endpoint URL                       | No                        |
| StreamName     | Your table name (e.g., CustomApacheLogs_CL) | No                        |

  • Click Save
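
If you manage Azure DevOps from the command line, the same variable group can be created with the azure-devops CLI extension. This is a sketch with placeholder values; the secret is added in a second step so it is stored as a secret variable:

# Optional: create the variable group via the Azure DevOps CLI (azure-devops extension)
az pipelines variable-group create `
    --name "Custom Log Ingestion Variable Group" `
    --variables TenantId="<tenant-id>" ClientId="<client-id>" DcrImmutableId="<dcr-immutable-id>" DceEndpoint="<dce-endpoint>" StreamName="CustomApacheLogs_CL" `
    --organization "https://dev.azure.com/<your-org>" `
    --project "<your-project>"

# Add the client secret separately, marked as secret (replace <group-id> with the ID returned above;
# pass --organization/--project here too unless defaults are configured with az devops configure)
az pipelines variable-group variable create `
    --group-id <group-id> `
    --name ClientSecret `
    --value "<client-secret>" `
    --secret true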

Creating the Repository Structure

In your Azure DevOps repository, create the following structure:

├── azure-pipelines.yml
├── convert-to-json.py
├── apache.log
└── README.md

  • azure-pipelines.yml - the pipeline definition (we’ll create this next)
  • convert-to-json.py - the Python script from Part 1
  • apache.log - your Apache log file to be ingested

Creating the Pipeline

  • In Azure DevOps, navigate to your project and click Pipelines in the left pane.
  • Click New Pipeline and select Azure Repos Git
  • Select your repository
  • Select Starter Pipeline
  • On the following screen, you should see a prepopulated pipeline YAML file. Select and delete all content from the code pane.
  • Paste the contents of the following YAML file into the empty code pane
  • Click Save

This should create a file named azure-pipelines.yml in the root of your repository.

trigger:
- main

variables:
- group: Custom Log Ingestion Variable Group

jobs:
- job: ConvertApacheLog
  displayName: 'Convert Apache Log to JSON'
  pool:
    vmImage: ubuntu-latest
  steps:
  - task: PythonScript@0
    displayName: 'Run Apache log parser'
    inputs:
      scriptSource: 'filePath'
      scriptPath: 'convert-to-json.py'

  - task: PublishBuildArtifacts@1
    displayName: 'Publish parsed logs'
    inputs:
      PathtoPublish: '$(Build.ArtifactStagingDirectory)'
      ArtifactName: 'parsed_logs'
      publishLocation: 'Container'

- job: Upload_to_Sentinel
  displayName: 'Upload to Sentinel'
  dependsOn: 'ConvertApacheLog'
  pool:
    vmImage: windows-latest
  steps:
  - task: DownloadBuildArtifacts@1
    displayName: 'Download build artifact'    
    inputs:
      buildType: 'current'
      downloadType: 'single'
      artifactName: 'parsed_logs'
      downloadPath: '$(System.ArtifactsDirectory)'

  - task: PowerShell@2
    displayName: 'Upload JSON to Sentinel'
    inputs:
      targetType: 'inline'
      script: |
        # Configuration - Get from environment variables (set in pipeline)
        $TenantId = $env:TENANT_ID
        $ClientId = $env:CLIENT_ID
        $ClientSecret = $env:CLIENT_SECRET
        $DcrImmutableId = $env:DCR_IMMUTABLE_ID
        $DceEndpoint = $env:DCE_ENDPOINT
        $StreamName = $env:STREAM_NAME
        $BatchSize = 100

        # Set working directory to downloaded artifact
        Set-Location "$(System.ArtifactsDirectory)/parsed_logs"

        # Validate required variables
        $requiredVars = @{
            'TENANT_ID' = $TenantId
            'CLIENT_ID' = $ClientId
            'CLIENT_SECRET' = $ClientSecret
            'DCR_IMMUTABLE_ID' = $DcrImmutableId
            'DCE_ENDPOINT' = $DceEndpoint
            'STREAM_NAME' = $StreamName
        }

        $missingVars = @()
        foreach ($var in $requiredVars.Keys) {
            if ([string]::IsNullOrEmpty($requiredVars[$var])) {
                $missingVars += $var
            }
        }

        if ($missingVars.Count -gt 0) {
            Write-Error "Missing required environment variables: $($missingVars -join ', ')"
            exit 1
        }

        # Find JSON file
        $InputJsonPath = Get-ChildItem -Filter "*.json" | Select-Object -First 1 -ExpandProperty Name

        if (-not $InputJsonPath) {
            throw "No JSON file found in artifact directory"
        }

        # Function to get OAuth token
        function Get-AuthToken {
            try {
                $body = @{
                    grant_type    = "client_credentials"
                    client_id     = $ClientId
                    client_secret = $ClientSecret
                    scope         = "https://monitor.azure.com/.default"
                }
                
                $uri = "https://login.microsoftonline.com/$TenantId/oauth2/v2.0/token"
                $response = Invoke-RestMethod -Uri $uri -Method Post -Body $body
                
                return $response.access_token
            }
            catch {
                Write-Error "Failed to get authentication token: $($_.Exception.Message)"
                throw
            }
        }

        # Function to send data to Sentinel
        function Send-DataToSentinel {
            param([array]$Data, [string]$AccessToken)
            
            try {
                $headers = @{
                    'Authorization' = "Bearer $AccessToken"
                    'Content-Type'  = 'application/json'
                }
                
                $uri = "$DceEndpoint/dataCollectionRules/$DcrImmutableId/streams/Custom-$StreamName" + "?api-version=2023-01-01"
                $jsonPayload = $Data | ConvertTo-Json -Depth 10 -Compress
                
                $response = Invoke-RestMethod -Uri $uri -Method Post -Body $jsonPayload -Headers $headers
                return $response
            }
            catch {
                Write-Error "Failed to send data batch: $($_.Exception.Message)"
                throw
            }
        }

        # Main execution
        Write-Host "Starting upload to Microsoft Sentinel..." -ForegroundColor Cyan
        
        # Get auth token
        $accessToken = Get-AuthToken
        
        # Read and parse JSON
        Write-Host "Reading JSON file: $InputJsonPath" -ForegroundColor Yellow
        $jsonContent = Get-Content $InputJsonPath -Raw
        $parsedLogs = $jsonContent | ConvertFrom-Json
        
        Write-Host "Loaded $($parsedLogs.Count) log entries" -ForegroundColor Green
        
        # Process in batches
        $totalProcessed = 0
        $totalBatches = [Math]::Ceiling($parsedLogs.Count / $BatchSize)
        
        for ($i = 0; $i -lt $parsedLogs.Count; $i += $BatchSize) {
            $currentBatch = [Math]::Floor($i / $BatchSize) + 1
            $endIndex = [Math]::Min($i + $BatchSize - 1, $parsedLogs.Count - 1)
            $batch = $parsedLogs[$i..$endIndex]
            
            Write-Host "Sending batch $currentBatch of $totalBatches ($($batch.Count) entries)..." -ForegroundColor Yellow
            
            Send-DataToSentinel -Data $batch -AccessToken $accessToken
            
            $totalProcessed += $batch.Count
            Write-Host "Batch $currentBatch sent successfully" -ForegroundColor Green
            
            # Small delay to avoid throttling
            if ($currentBatch -lt $totalBatches) {
                Start-Sleep -Milliseconds 500
            }
        }
        
        Write-Host "Upload completed! Total entries: $totalProcessed" -ForegroundColor Green
    env:
      TENANT_ID: $(TenantId)
      CLIENT_ID: $(ClientId) 
      CLIENT_SECRET: $(ClientSecret)
      DCR_IMMUTABLE_ID: $(DcrImmutableId)
      DCE_ENDPOINT: $(DceEndpoint)
      STREAM_NAME: $(StreamName)

Understanding the Pipeline

Let me break down what’s happening:

Job 1: ConvertApacheLog

  • Runs on Ubuntu
  • Executes the Python script to parse Apache logs
  • Publishes the resulting JSON file as a build artifact

Job 2: Upload_to_Sentinel

  • Depends on Job 1 completing successfully
  • Runs on Windows (for PowerShell compatibility)
  • Downloads the JSON artifact from Job 1
  • Authenticates with Azure using the service principal
  • Uploads the data to Sentinel in batches of 100 entries
  • Includes validation and error handling

The batching approach is important because sending thousands of log entries in a single API call can hit size limits or cause timeouts. By processing in batches of 100, we ensure reliable uploads even for large log files.
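
If your log entries are unusually large, it can also be worth sanity-checking the payload size inside the loop before sending. A rough sketch, assuming the $batch variable from the script above and a per-request payload limit in the region of 1 MB:

# Inside the batching loop: warn if a batch approaches the per-request payload limit
$payload      = $batch | ConvertTo-Json -Depth 10 -Compress
$payloadBytes = [System.Text.Encoding]::UTF8.GetByteCount($payload)
if ($payloadBytes -gt 900KB) {
    Write-Warning "Batch is $([Math]::Round($payloadBytes / 1KB)) KB; consider lowering `$BatchSize."
}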

Running the Pipeline for the First Time

  • In Azure DevOps, go to Pipelines
  • Click New pipeline (or Create pipeline)
  • Select Azure Repos Git
  • Choose your repository
  • Select Existing Azure Pipelines YAML file
  • Choose /azure-pipelines.yml
  • Click Continue, then Run

The pipeline will execute both jobs sequentially. You can monitor progress in the pipeline run view. If something fails, click on the failed step to see detailed logs.

Verifying the Data in Sentinel

Once the pipeline completes successfully, give it a few minutes for the data to be indexed in Sentinel. Then:

  • Navigate to your Log Analytics Workspace
  • Click Logs in the left pane
  • Run this query:
CustomApacheLogs_CL
| order by TimeGenerated desc
| take 10

You should see your Apache log entries with all the parsed fields.
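
The same check can be scripted from PowerShell with the Az.OperationalInsights module, which is handy if you want to automate verification. A sketch, assuming the module is installed, you are signed in with Connect-AzAccount, and <workspace-id> is the Workspace ID (GUID) from the workspace overview page:

# Optional: query the custom table from PowerShell instead of the portal
$result = Invoke-AzOperationalInsightsQuery `
    -WorkspaceId "<workspace-id>" `
    -Query "CustomApacheLogs_CL | order by TimeGenerated desc | take 10"

$result.Results | Format-Table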

Subsequent Data Uploads

Once the pipeline is set up, uploading additional data becomes straightforward. Simply update the apache.log file in your repository by committing a new version. This will automatically trigger the pipeline, which will convert the logs to JSON format in Job 1 and upload them to Sentinel in Job 2.
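
For example, a typical upload during an investigation looks like this (the source path and branch name are assumptions based on the repository layout above):

# Replace the log file and push to main to trigger the pipeline
Copy-Item "C:\incident\apache_access.log" ".\apache.log" -Force
git add apache.log
git commit -m "Ingest Apache logs for incident analysis"
git push origin main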

This automation is particularly valuable during incident response scenarios. It allows any team member with repository access to upload logs for analysis, even if they don’t have permissions to create Azure resources. Instead of building custom scripts or requesting access to Azure, analysts can simply commit their log files and let the pipeline handle the rest. This streamlines ad hoc data ingestion and ensures consistent processing across your team.

This post is licensed under CC BY 4.0 by the author.