Azure SQL Database and Blockchain ledger...whaaaaat?
Updated: May 25, 2022
This is pretty amazing, but before I go into why, let’s just start with the basics - what is blockchain, what is Azure SQL Database, and why is the latter leveraging some of the concepts of the former such a compelling story.
I will also later in this article look at some of the advantages the Azure SQL Database ledger functionality could bring,any alternative ledger databases, and conclude with a simple step by step test drive.
Update 25 May 2022 - Azure SQL Database ledger functionality is not generally available - https://techcommunity.microsoft.com/t5/azure-sql-blog/ledger-now-generally-available-in-azure-sql-database/ba-p/3379658
Blockchain is basically a system that allows for the recording of information in a way that makes it very difficult to change. The blockchain consists of many computers (nodes) and each new transaction is distributed and then duplicated in blocks in copies of a ledger across each of these nodes.
Each transaction forms part of a block and before a new block can be added, its authenticity must be verified through a validation or consensus process which requires a majority of nodes in the network to agree that the new block's hash has been calculated correctly. A hash is essentially an alphanumeric string of the block's timestamp. After the first block has been created, each subsequent block's hash is calculated based on the preceding block's hash.
Once a block is added, it can be referenced in subsequent blocks, but it cannot be changed. This is because if there is an attempt to change a block, the hashes for upstream and downstream blocks will change and disrupt the ledger's shared state as consensus will no longer be possible, and no new blocks will be added to the chain until the problem is solved. This can be done by discarding the offending block and the consensus process will be repeated.
Azure SQL Database is of course simply the relational database we all know and love in a platform as a service configuration. The difference between Blockchain and a Database is primarily one of centralisation. In a database, records are centralised whereas in a Blockchain each node has a secured copy of all records and all changes, and where these is inconsistency, this will immediately be identified and corrected any unreliable information.
Imagine a scenario where two businesses work together, think of supply chain. They will almost never share a single database and a single set of records. But having two databases lead to compromised trust in the data and both databases constitute points of failure. With Blockchain a single point of failure is removed, insofar as if one participant makes a change it is immediately validated and visible by the other participant. If Qantas transports a parcel between the USA and Australia on behalf of DHL and if Qantas and DHL took part in a Blockchain, then Qantas can trust that DHL has made payment for the transport and DHL can trust that Qantas transported it.
So, is Azure SQL Database becoming a Blockchain?
Not quite, but it now has a ledger feature* that provides tamper-evidence capabilities similar to Blockchain. And this will make some people very happy :)
Security of the data in your database is obviously often very important, especially if you store sensitive data. So, it is a VERY good idea to put frameworks and controls in place to protect this data. But I have always wondered, how do you protect your data from your own high privileged users, such as DBA's, system administrators, Azure Service Administrators, etc. Even the best policies, role-based access control, and scaffolding cannot protect you from nefarious activity.
Yet with a ledger, you can create trust around the integrity of data stored in database by cryptographically attesting that your data hasn't been tampered with. The new ledger feature preserves historical data, so that if a row is updated in the database, its previous value is maintained and protected in a history table and therefore chronicles all the changes made to the database over time.
Also, a ledger and the historical data are managed transparently which provides the security but in a way that is abstracted from any application changes. Historical data is maintained in relational form which means it can also support SQL queries for auditing, forensics, and other purposes.
How is this similar to Blockchain?
Similar to Blockchain, each transaction in a ledger enabled Azure SQL Database is cryptographically hashed (a Secure Hash Algorithm 256)**. The hash function uses the value of the transaction, along with the hash of the previous transaction as input to the hash function links all transactions together, like a blockchain. The hash of the latest block in the database ledger is called the database digest and represents the state of all ledger tables in the database at the time that the block was generated. The database digest is outside Azure SQL Database in either Azure Confidential Ledger or in Azure Blob Storage in a configurable Write Once, Read Many/ WORM state.
In a nutshell, in a ledger database, a layer of digital signatures for each transaction is added so authorised persons can audit the list and see that it was constructed correctly.
There are two types of ledger functionality for Azure SQL Database:
Updatable ledger tables, which allow you to update and delete rows in your tables and is ideal for applications that issue updates and deletions to tables in your database, such as system of record (SOR) applications.
Append-only ledger tables, which only allow insertions to your tables are ideal for application that issue inserts only, such as security information and event management (SIEM) applications.
Why is Azure SQL Database ledger functionality useful?
It can streamline auditing and associated costs - maintaining trust in your data requires a combination of activities, including security controls, backup, disaster recovery and audits by external parties to ensure that these practices are appropriate. These audits are time-intensive and costly and includes reviewing audit logs, inspecting authentication, and inspecting access controls. They are manual and can therefore not 100% accurate. Ledger functionality on the other hand provides proof of data integrity because the retained historical data supports SQL queries for auditing, forensics, and other purposes.
It supports multiple-party networks centrally, for example in our Qantas and DHL supply chain example - Blockchain is of course what businesses turn to where trust is required between multiple parties even though many parts of these networks are fundamentally centralized so a Blockchain fully decentralized infrastructure feels like overkill. Azure SQL Database with ledger functionality can provide a lighter weight alternative by allowing parties to verify the integrity of the centrally housed data, without the complexity and performance implications of a Blockchain network.
And my personal favourite, data versioning, that old business intelligence chestnut - in a ledger database an UPDATE does not update, but rather creates a new version of the record, leaving the old version unchanged, thereby combining the traditional database model with the GIT versioning model. Imagine a scenario where you need to track how the customer changes over time, the ledger database will track address changes and you won’t need to perform extensive table joins like you would have in a non-ledger deployment.
Are there alternatives?
Yup, there is also Amazon Quantum Ledger Database (QLDB)**** which supports PartiQL which is an open source, SQL-compatible query language designed to easily work with all data types and structures. This may not concern many, but for those who want to remain within the traditional SQL realm, Azure SQL Database may be a more suitable option - but I must reiterate that this will depend on personal preferences.
Simple test drive
Let's look at a simple example whereby we provision Azure SQL Database with the ledger feature enabled. Then write a few records to the database. Then change a couple of them. And lastly look at the result.
A) I firstly provisioned a new Azure SQL Database as per normal (note that I opted for vCore Serverless***** simply as a cost savings measure and because I wanted to see if Azure SQL Database's ledger feature is available in the vCore Serverless deployment mode). This simply required a few easy settings in the Security tab and in this test, I opted for Azure Storage rather than Azure Confidential Ledger for the same of simplicity.
B) I then created a simple table called Customers, with the required system versioning and ledger settings enabled:
CREATE TABLE CUSTOMERS( ID INT NOT NULL PRIMARY KEY CLUSTERED, NAME VARCHAR (20) NOT NULL, AGE INT NOT NULL ) WITH ( SYSTEM_VERSIONING = ON, LEDGER = ON ); GO
IMPORTANT: This query created the table CUSTOMERS, but also automatically two additional objects: a ledger table called MSSQL_LedgerHistoryFor_1845581613 and a ledger view called CUSTOMERS_Ledger.
C) I then populated the table with a few simple records:
INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (1, 'JOHN', 28), (2, 'SARAH', 32), (3, 'PAUL', 54)
Navigating to the new Azure Storage shows that a Container, sqldbledgerdigests was created automatically. The expected Storage Account > Container > Folder structure is shown below:
D) Let's assume Paul's age was entered incorrectly as 54, in stead of 45, I used an UPDATE statement to correct the record:
UPDATE CUSTOMERS SET AGE = 45 WHERE ID = 3;
D) I then inserted a new record:
INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (4, 'DANIEL', 60)
E) Let's assume Daniels age too was entered incorrectly as 60, in stead of 50, I used a UPDATE statement to correct the record:
UPDATE CUSTOMERS SET AGE = 50 WHERE ID = 4;
F) The following three queries shows very nicely how the changes I described above (I made a few additional ones) manifest itself across the two tables and one view.
Customer table (note how the 4 x GENERATED ALWAYS columns are now part of the Customer table even thought it was not part of the actual table create structure):
Customer Ledger Table:
SELECT * FROM [dbo].[MSSQL_LedgerHistoryFor_1845581613]
Customer ledger view:
SELECT * FROM [dbo].[CUSTOMERS_Ledger]
ORDER BY ledger_transaction_id
G) Result - the four original records, the ledger records and the audit of all the changes is shown below.
H) In addition, the underlying ledger block, hash and date time data can be seen in the File stored in the Storage Account.
The simplicity with which, what was previously quite complicated to achieve, is very compelling. The business benefit are numerous, especially for me, who often look at the word through a data analytics lens. Not only do you enhance the trust of the data due to the preservation of historical data, but you, at the same time, achieve data versioning which is a very useful tool when time travelling, i.e. Query data in the past that has since been updated or deleted. The fact that I could achieve this with vCore Serverless also adds potential cost savings benefit.
* as at the date of first publish of this blog post, the ledger feature of Azure SQL Database was still in preview and only available in some regions.
** SHA256 algorithm generates an almost-unique, fixed size 256-bit (32-byte) hash - see SHA-256 hash calculator | Xorbin
*** general references - Azure SQL Database ledger overview - Azure SQL Database | Microsoft Docs, Overview of immutable storage for blob data - Azure Storage | Microsoft Docs, Limitations for Azure SQL Database ledger - Azure SQL Database | Microsoft Docs
**** Amazon QLDB,
The views expressed on this post are mine and do not necessarily reflect the views of any organisation I am associated with.