Let’s discuss Azure Application Insights and a simple optimization we can make to avoid surprise bills.

Azure Application Insights can be used to monitor our applications’ health, analyze logs and other application metrics, and view other resources included in our Azure subscription. Because it offers a variety of features, different teams may have different objectives when using it. Although we might not use every available feature, at a minimum we use it for logging.

If you ask me, “Logging is an art.”

How much logged information do application teams need in order to understand issues that happen in Production? There is no single formula or concrete definition, but at a minimum, a log should identify the operation, the action, details of the client consuming the operation, and the request and response information, with critical information masked. With this information, application teams should be able to replicate the issues in their test environments.
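As a rough illustration of that minimum, here is a small Python sketch of a log entry that carries the operation, action, client details, and masked request and response data. The field names, the `SENSITIVE_KEYS` list, and the sample order are all hypothetical; a real application would choose its own fields and masking rules.

```python
import json
import logging

# Hypothetical list of fields that must never appear in logs unmasked.
SENSITIVE_KEYS = {"password", "card_number", "ssn"}


def mask(payload: dict) -> dict:
    """Return a copy of payload with sensitive values replaced by '***'."""
    return {k: ("***" if k in SENSITIVE_KEYS else v) for k, v in payload.items()}


def build_log_entry(operation, action, client_id, request, response) -> dict:
    """Collect the minimum context needed to replicate an issue later."""
    return {
        "operation": operation,
        "action": action,
        "client_id": client_id,
        "request": mask(request),
        "response": mask(response),
    }


logger = logging.getLogger("orders")

entry = build_log_entry(
    operation="CreateOrder",
    action="POST /orders",
    client_id="client-42",
    request={"item": "book", "card_number": "4111111111111111"},
    response={"status": 201},
)
logger.info(json.dumps(entry))
```

Keeping the entry as a single structured object makes it easier to search by operation or client later, whichever logging backend ends up receiving it.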

Often, the logs show what happened in the lead-up to a failed execution. There is no point in logging exceptions with huge stack traces if there is no supporting information about what happened before the exception.

In general, it is good to use different logging strategies based on the type of log.

Are we logging to capture crucial states of a transaction?
Are we logging to capture transient errors such as connectivity or authentication issues?
Are we logging information to capture application workflow steps?

Logging is often overlooked when we migrate applications to the cloud. The next time we plan an application migration, it is worth reviewing the logging strategy that is in place.

For example, consider the following scenario. Let us say we have a job which processes information when records appear in the database. The job polls the database every 5 minutes, looking for data which satisfies a specific condition. When new records are present, the job pulls the new data and starts processing. While processing, the job emits information logs, which are stored as text files on the filesystem or in the database. When processing is complete, the job updates the status of the records in the database so that they are not picked up in the next iteration. As long as nothing fails, this approach works well.

Now when we shift this application to the cloud and start using Azure Application Insights, these logs are captured in Application Insights. Consider a scenario where the job fails to process the data. For example, let us assume there was bad data. The job fails in every run and cannot update the status of the records to complete, so every run tries to process the same bad data again. As a result, the job emits logs with the same error message on every run.

We know that the job polls for new data every 5 minutes, and these errors eventually end up in our Application Insights. So let us say there are around 10 records with bad data. Every 5 minutes, the job emits error logs for those records, along with its other information logs, and all of them are sent to Application Insights through the AppInsights SDK telemetry option.
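To get a feel for the volume, here is a back-of-the-envelope calculation in Python. It assumes each bad record produces exactly one error entry per run, which is a simplification; a real job would likely emit more once stack traces and information logs are counted.

```python
POLL_INTERVAL_MINUTES = 5
BAD_RECORDS = 10
# Assumption for illustration: one error entry per bad record per run.
ERROR_LOGS_PER_RECORD_PER_RUN = 1

runs_per_day = 24 * 60 // POLL_INTERVAL_MINUTES
error_logs_per_day = runs_per_day * BAD_RECORDS * ERROR_LOGS_PER_RECORD_PER_RUN
error_logs_per_30_days = error_logs_per_day * 30

print(runs_per_day, error_logs_per_day, error_logs_per_30_days)
# prints: 288 2880 86400
```

Even under this conservative assumption, 10 bad records turn into tens of thousands of identical error entries within a month if nobody notices.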

As per Azure Monitor Pricing “Log Analytics and Application Insights charge for data they ingest.”

Eventually our Application Insights will be ingesting a large number of exception messages, most of which are noise. If this goes unnoticed, these logs keep filling Application Insights, and we end up paying for the noise.

So how do we avoid situations like these?

To avoid situations like these, it is always good to implement the Circuit Breaker pattern along with a retry option. Let us say we set the retry threshold to 3. If the job fails to process a record 3 times in a row, it marks the record as failed so that subsequent runs skip it. This way we stop sending the same error data to Application Insights over and over, and the avoided ingestion can add up to significant savings over time.
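The retry threshold described above can be sketched in Python as follows. This is a minimal per-record failure counter rather than a full circuit breaker implementation, and the `Record` class, the `process` function, and the `log` callable are hypothetical stand-ins; in a real job, `log` would be whatever logger is wired to Application Insights, and the status and attempt count would live in the database.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # retry threshold from the example above


@dataclass
class Record:
    id: int
    payload: str
    status: str = "pending"  # pending | complete | failed
    attempts: int = 0


def process(record: Record) -> None:
    """Hypothetical processing step that raises on bad data."""
    if record.payload == "bad":
        raise ValueError(f"bad data in record {record.id}")


def run_job(records, log) -> None:
    """One polling run: only pending records are picked up."""
    for record in records:
        if record.status != "pending":
            continue
        try:
            process(record)
            record.status = "complete"
        except ValueError as exc:
            record.attempts += 1
            log(f"record {record.id} attempt {record.attempts} failed: {exc}")
            if record.attempts >= MAX_ATTEMPTS:
                # Stop retrying so later runs emit no further errors.
                record.status = "failed"


records = [Record(1, "ok"), Record(2, "bad")]
emitted = []  # stands in for the telemetry sink
for _ in range(10):  # simulate 10 polling runs
    run_job(records, emitted.append)

print(len(emitted), records[1].status)
```

In this simulation the bad record produces error logs only for its first 3 attempts; the remaining 7 runs skip it entirely, and the failed status gives the team something concrete to investigate.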

With small optimizations like these, we may not see the impact immediately, but over time we will realize the benefits.

What are a few optimizations you have tried to save costs with your applications in Azure? Feel free to comment.