How We Helped a FinTech Client Leverage Azure Storage for Long-Term SQL Server Backup, Reducing Costs Threefold and Improving Security

Case Study Overview

Our Company: SuperAdmins
Client Industry: FinTech, Money Transfer & Transactions
Goal: Cloud Storage Cost Reduction & Improved Data Security
Challenges: A nomenclature problem with backup file extensions
Services Provided: Microsoft Azure, PowerShell Script Coding, Azure Functions
Project Duration: 3 days (2 of which were spent on project-specific custom coding)
Results: Data storage costs reduced to a minimum (monthly costs decreased threefold); higher data security

Introduction

Many applications that reside in on-premises infrastructure, such as Microsoft SQL Server, can write their backups directly to Azure Storage. Azure Blob Storage is an object storage solution for massive amounts of unstructured data that doesn't adhere to a particular data model or definition. At the same time, Azure Blob Storage gives us the ability to optimize expenses by choosing an appropriate storage tier.
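For illustration, a SQL Server backup can be written straight to a blob container from PowerShell. The sketch below assumes the SqlServer module and a SQL Server credential already created for the storage account; the instance, database, container URL, and credential name are placeholders rather than the client's actual values.

    # Requires the SqlServer module (Install-Module SqlServer)
    Import-Module SqlServer

    $backupParams = @{
        ServerInstance    = "SQL01"                     # placeholder instance
        Database          = "PaymentsDb"                # placeholder database
        BackupContainer   = "https://mystorageaccount.blob.core.windows.net/sql-backups"
        SqlCredential     = "AzureBackupCredential"     # SQL credential holding the storage account key
        CompressionOption = "On"
    }

    # Backup to URL: the backup file is written directly to Azure Blob Storage
    Backup-SqlDatabase @backupParams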

Most of our clients, especially those with on-premises infrastructure, have already adopted cloud storage for their backups. Some of them must keep offsite, long-term backups for industry-specific compliance reasons, whereas others simply want their data to be safe in case of ransomware. In both cases, Azure Blob Storage is the natural choice.

Our client, a company operating in the FinTech industry, reached out to us to see whether there was any room for improvement in terms of data resiliency and storage costs. The client runs dozens of Microsoft SQL Servers handling millions of transactions daily, and all of these servers must be adequately backed up due to legal regulations. Moreover, all of those SQL Server backups, as well as the transaction log backups, must be stored and available for at least a year because of the nature of their industry.

They had already implemented a procedure for shipping backups to Azure Blob Storage, but they were not happy with the costs.

Goals

Optimize the client’s Azure storage costs and improve data security, without jeopardizing data accessibility.

Our Process of Azure Storage Cost Optimization

Step 1: Evaluation

We first analyzed their system, and the initial assessment of their backup data flow showed that their storage account was quite heavy with data: approximately 6 TB was being shipped to Azure Blob Storage every month.

The biggest issue was that all of their data was stored in the Hot tier, a storage level optimized for data that is accessed frequently. The Hot tier is also the second most expensive Azure storage tier, at roughly $18 per terabyte per month.

After a year of this approach, the client had approximately 40 TB of data in the cloud, which put their monthly storage costs at roughly 230% above the optimal figure. And because the amount of stored data grew every month, so did the storage bill.

These numbers were unnecessarily high: not all of their data needed to sit in Azure's Hot tier, since certain data types don't have to be immediately accessible.

Step 2: Implementation

Because the client uses cloud storage for offsite, long-term backup and compliance purposes, while relying on local backups for quick restores when necessary, we decided to create a retention policy and strategy that would significantly reduce their monthly storage costs.

We agreed to keep their data in the Hot tier for 14 days, after which it would be moved to the Archive tier and retained for a year. In most cases they don't need to retrieve backups older than 7-10 days, and those they already keep on-premises.

Further analysis showed that we could not use the standard, built-in Lifecycle Policy, which is where the custom PowerShell scripts came in to keep the process fast and consistent. This was a relatively unusual situation, as the file nomenclature is normally handled correctly (more on this in the Challenges section below), but it had to be dealt with. So we tackled the problem with custom scripts that work around the extension mismatch and, to automate the process, wrapped them in an Azure Function that performs the task every morning.
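We can't publish the client's scripts, but the daily run is conceptually close to the sketch below: a timer-triggered PowerShell Azure Function (the schedule itself lives in the function's binding configuration) that walks the backup container and archives everything older than 14 days. The Az.Storage module, the account and container names, and the storage key taken from an app setting are assumptions made for the sake of the example.

    # run.ps1 - PowerShell Azure Function bound to a daily timer trigger
    param($Timer)

    # Placeholder account and container; the key is read from an app setting
    $ctx = New-AzStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey $env:BACKUP_STORAGE_KEY
    $cutoff = (Get-Date).AddDays(-14)

    # Move every backup blob older than 14 days from the Hot tier to Archive
    Get-AzStorageBlob -Container "sql-backups" -Context $ctx | ForEach-Object {
        if ($_.LastModified.DateTime -lt $cutoff -and
            $_.ICloudBlob.Properties.StandardBlobTier -ne "Archive") {
            $_.ICloudBlob.SetStandardBlobTier("Archive")
            Write-Host "Archived $($_.Name)"
        }
    }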

As for the Archive tier's limitations: the requirement that data remain in the tier for at least 180 days was not a problem, because our client needs to keep the data for a year. The other limitation is access time, since archived files are stored offline and need extra time (and rehydration costs) to retrieve. This wasn't a problem for the client's workflow and backup needs either, which is why a lifecycle policy that moves all data older than 14 days to the Archive tier, implemented through our scripts, was the optimal solution.
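For comparison, when backup names and extensions are consistent, the same 14-day rule can be expressed as a built-in Lifecycle Management rule instead of a script. A rough sketch with the Az PowerShell cmdlets, using placeholder resource names:

    # Built-in Lifecycle Management rule: move block blobs to the Archive tier
    # 14 days after their last modification
    $action = Add-AzStorageAccountManagementPolicyAction -BaseBlobAction TierToArchive `
                  -daysAfterModificationGreaterThan 14

    # Limit the rule to the backup container (placeholder prefix)
    $filter = New-AzStorageAccountManagementPolicyFilter -PrefixMatch "sql-backups/" -BlobType blockBlob

    $rule = New-AzStorageAccountManagementPolicyRule -Name "archive-sql-backups" -Action $action -Filter $filter

    # Placeholder resource group and storage account names
    Set-AzStorageAccountManagementPolicy -ResourceGroupName "rg-backups" `
        -StorageAccountName "mystorageaccount" -Rule $rule

A rule like this applies to everything under the prefix regardless of backup type, which is exactly why it couldn't cover the per-extension retention this client needed.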

Step 3: Verification

As the project was somewhat unique due to the custom scripts, we had to make sure all the moving parts were working properly and that the client's new long-term SQL Server backup system was indeed optimized.

Results

Our client now stores a portion of their data in the Hot tier for fast accessibility, while files that don't need to be accessed instantly are kept in the Archive tier. This cost optimization reduced the client's Microsoft Azure storage costs threefold.

There are also long-term cost benefits: the client's monthly storage bill now grows by roughly $1 per added terabyte instead of $18, simply because each data type is kept in the storage tier that matches how it is used.

This approach is beneficial in terms of data security as well: the files are encrypted and stored in the cloud, and depending on the redundancy option, a locally redundant storage account keeps them in three separate locations within the same region, while a geo-redundant storage account spreads them over six locations (three in the primary region, three in the secondary).
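As an illustration of those redundancy options, this is roughly how a geo-redundant, tier-capable backup account can be created with the Az module; the resource group, account name, and region below are placeholders rather than the client's actual setup.

    # Standard_GRS keeps 3 copies in the primary region and 3 more in the paired
    # secondary region; Kind StorageV2 is what enables the Hot/Cool/Archive tiers
    $accountParams = @{
        ResourceGroupName = "rg-backups"         # placeholder
        Name              = "mybackupstorage"    # placeholder
        Location          = "westeurope"         # placeholder
        SkuName           = "Standard_GRS"
        Kind              = "StorageV2"
    }
    New-AzStorageAccount @accountParams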

So, aside from the dramatic reduction in Azure costs, the long-term SQL Server backup approach also provides added protection against both malware and physical server damage. For reference, we once had a client that didn't opt for this storage strategy and went with physical servers instead; they lost crucial data, and subsequently two major clients of their own, because they had no off-site long-term backup (their retention went back only 12 months, so every file older than that was lost for good).

This is why we recommend long-term SQL Server backup and the right data storage strategy to most of our clients with similar data access, usage, and storage patterns. This level of cost optimization and data security simply wouldn't be possible without leveraging Azure's storage features and plans.

Challenges

CHALLENGE 1

The client has several backup types: databases that need a full backup daily, databases that are backed up monthly, differential backups on 'regular' days, and transaction log backups taken anywhere from every 5 to every 60 minutes, which adds up to a very diverse backup workflow.

The main challenge involved the relationship between the Lifecycle Policy and blob extensions. The policy treats all files equally, with the timestamp being the only relevant parameter in the majority of cases. The backup process, however, uses different file extensions:

  • .bak for a full backup
  • .dif for differential
  • .trn for transaction log

…and since certain files had been assigned the wrong extension, it became impossible to set up a rule that deletes .trn files after 14 days while keeping .bak and .dif files for an indefinite amount of time. For example, their differential backups were written with the .bak extension, which doesn't prevent that particular database from working properly, but it does create problems later on: an extension-based Lifecycle Policy would end up deleting either all of the files or none of them.

This is why we had to create custom PowerShell scripts that run via Azure Functions, a process that is rather time-consuming and intricate by nature.
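To give an idea of how those scripts approach the problem, the classification step looks conceptually like the fragment below. The name patterns and retention decisions are illustrative only, not the client's real rules, and the $ctx storage context is assumed to be set up as in the earlier sketch.

    # Classify each backup blob by a naming pattern rather than trusting the
    # extension alone, since some backups carry the wrong extension
    $cutoff = (Get-Date).AddDays(-14)

    Get-AzStorageBlob -Container "sql-backups" -Context $ctx |
        Where-Object { $_.LastModified.DateTime -lt $cutoff } |
        ForEach-Object {
            if ($_.Name -like "*_LOG_*" -or $_.Name -like "*.trn") {
                # Transaction log backups are no longer needed after 14 days
                Remove-AzStorageBlob -Blob $_.Name -Container "sql-backups" -Context $ctx -Force
            }
            else {
                # Full and differential backups, whatever extension they ended up
                # with, are moved to the Archive tier and kept there
                $_.ICloudBlob.SetStandardBlobTier("Archive")
            }
        }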