Using Blob Snapshots to Back Up Azure Virtual Machines

Now that Windows Azure Virtual Machines are out of preview, it seems like a good time to look at how we can safeguard against the inevitable disasters that will befall those VMs.

Azure Virtual Machines use blob storage for both OS and data disks, so we can get some basic backups going with nothing more than the blob API and a simple PowerShell script.

Setting Up the API

Before we can start writing anything to make use of the blob API we need to make sure that we have it downloaded and configured. How, you ask?

  • Firstly, install Windows Azure PowerShell, either from its download page or through the Web Platform Installer
  • Next, create a new PowerShell script and load the storage client assembly
  Add-Type -Path "C:\Program Files\Microsoft SDKs\Windows Azure\.NET SDK\2012-10\bin\Microsoft.WindowsAzure.StorageClient.dll"

Setting Up Your Subscription

You will also need to configure PowerShell to use your Windows Azure account. Here’s how (as described by the excellent Michael Washam):

  • Download your Azure publish profile through the portal, or by running the Get-AzurePublishSettingsFile cmdlet, which opens the download page in your browser
  • Import the downloaded settings file using the Import-AzurePublishSettingsFile cmdlet
  Import-AzurePublishSettingsFile "c:\downloads\subscription.publishsettings"
  • Finally, select your subscription so that it becomes the active one
  Select-AzureSubscription -SubscriptionName "[subscription name]"

At this point you are able to start using the API with your Azure account.

The CloudBlob Class

Most of the operations we want to invoke for our backup are going to be called on the CloudBlob class, which represents both the original blob and any snapshots that are acquired from Azure.

Finding the Right Blob

The first thing that we need to do is get a reference to the blob that contains the virtual hard disk (vhd) for our virtual machine, and to do that we need to dig through the portal.

Browse to the Dashboard for your virtual machine and you will see something like:

[Screenshot: the virtual machine dashboard in the Windows Azure portal, with the disk name highlighted]

Make a note of the disk name (highlighted) then browse to Virtual Machines > Disks. This will display a table listing all the disks that are being used by virtual machines, to which VMs they are attached and their location. From the table, locate the disk with the matching name and grab the location URL, which will look something like this:

http://storageaccountname.blob.core.windows.net/vhds/diskblobname.vhd

Make a note of both the storage account and disk blob names, and we have enough information to identify the correct blob.
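
If you would rather not pick the names out by hand, the location URL can be split apart in PowerShell. This is only a sketch: it assumes the URL has the shape shown above (account name as the first part of the host, then container/blob as the path), and it uses the placeholder URL rather than a real one.

```powershell
# Location URL copied from the Disks table (placeholder values)
$location = [Uri]"http://storageaccountname.blob.core.windows.net/vhds/diskblobname.vhd"

# The storage account name is the first label of the host name
$storageAccount = $location.Host.Split('.')[0]

# The blob path is the URL path without its leading slash: container/blob
$blobPath = $location.AbsolutePath.TrimStart('/')

# The base address of the blob service is everything up to the path
$blobUri = $location.GetLeftPart([System.UriPartial]::Authority)
```

Note that AbsolutePath is URL-encoded, so a blob name containing spaces or other special characters would need unescaping first.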

Getting a Blob Reference

We now need to grab an instance of CloudBlob that represents our VM disk. We do this using the CloudBlobClient.GetBlobReference method, building up the required credentials from the storage account details.

$storageAccount = "[storage account name]"
$blobPath = "vhds/[disk blob name].vhd"

#base address of the blob service for this storage account
$blobUri = "http://$storageAccount.blob.core.windows.net/"

#get the storage account key
$key = (Get-AzureStorageKey -StorageAccountName $storageAccount).Primary

#generate credentials based on the key
$creds = New-Object Microsoft.WindowsAzure.StorageCredentialsAccountAndKey($storageAccount, $key)

#create an instance of the CloudBlobClient class
$blobClient = New-Object Microsoft.WindowsAzure.StorageClient.CloudBlobClient($blobUri, $creds)

#and grab a reference to our target blob
$blob = $blobClient.GetBlobReference($blobPath)

We now have a $blob variable that we can start using to manipulate the blob.

Creating Snapshots

Creating a new snapshot is an incredibly simple step - we can just call CreateSnapshot!

$snapshot = $blob.CreateSnapshot()

The $snapshot variable now contains another instance of CloudBlob that represents the snapshot, and we can use this to download the snapshot content at any point in the future.

That’s useful, but it’s not that useful as we’re unlikely to keep a reference to that object until the next time we need it. So how do we find snapshots that have been made in the past?

Finding Snapshots

The API includes a method on the CloudBlobContainer class that will list all blobs within a particular container (in this case, vhds). Unfortunately it does not do much in the way of filtering, so we need to add some code of our own.

#assume we can get our blob as before
$blob = Get-Blob

#we need to create an options object to pass to
#the ListBlobs method so that it includes snapshots
$options = New-Object Microsoft.WindowsAzure.StorageClient.BlobRequestOptions
$options.BlobListingDetails = "Snapshots"
$options.UseFlatBlobListing = $true
$snapshots = $blob.Container.ListBlobs($options)

#once we have the results we need to manually filter
#the ones we aren't interested in just now
foreach ($snapshot in $snapshots)
{
	#make sure that the blob URI ends in our blobPath variable from earlier
	#(escaped, so that characters such as "." are matched literally)
	if ($snapshot.Uri -notmatch ([regex]::Escape($blobPath) + '$')) { continue }

	#and make sure that the SnapshotTime is set - this filters out
	#the current live version of the blob
	if ($snapshot.SnapshotTime -eq $null) { continue }

	Write-Output $snapshot
}

If we wrap this up in a Get-Snapshots function then it will return each snapshot of our blob in date order.
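
One way that wrapper might look is sketched below. The split into two functions, the Select-Snapshot helper and the -Blob and -BlobPath parameters are my own assumptions for illustration, not part of the storage API. The explicit Sort-Object is also an addition, so that the date ordering does not depend on the order in which the listing comes back.

```powershell
# Hypothetical helper: keeps only the snapshots of one blob, oldest first.
# $Blobs is whatever ListBlobs returned; $BlobPath is e.g. "vhds/disk.vhd".
function Select-Snapshot {
    param([object[]] $Blobs, [string] $BlobPath)

    # Escape the path so that "." and similar characters match literally
    $pattern = [regex]::Escape($BlobPath) + '$'

    $Blobs |
        Where-Object { "$($_.Uri)" -match $pattern -and $null -ne $_.SnapshotTime } |
        Sort-Object SnapshotTime
}

# Lists the container with snapshots included, then filters the results
function Get-Snapshots {
    param($Blob, [string] $BlobPath)

    $options = New-Object Microsoft.WindowsAzure.StorageClient.BlobRequestOptions
    $options.BlobListingDetails = "Snapshots"
    $options.UseFlatBlobListing = $true

    Select-Snapshot -Blobs @($Blob.Container.ListBlobs($options)) -BlobPath $BlobPath
}
```

With this version the call would be Get-Snapshots -Blob $blob -BlobPath $blobPath. The examples that follow call Get-Snapshots without arguments for brevity.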

Now that we can get a list of snapshots associated with a blob, we want to be able to restore the “live” blob to a point in the past.

Restoring the Blob

Once we have a reference to both the current blob and the snapshot that we want to restore, it’s trivial to overwrite the current blob with the older version:

#grab the snapshots for the blob as described above
$allSnapshots = @(Get-Snapshots)

#restore to the oldest snapshot
$blob.CopyFromBlob($allSnapshots[0])

#restore to most recent snapshot
$blob.CopyFromBlob($allSnapshots[$allSnapshots.Length-1])

We probably want to be a little more careful with this though, as we could inadvertently overwrite the current version and lose our data. To avoid this we can take another snapshot as a backup prior to restoring, so that the state we are about to overwrite can still be recovered.

#grab the snapshots for the blob as described above
$allSnapshots = @(Get-Snapshots)

#take a backup snapshot before restoring
$backupSnapshot = $blob.CreateSnapshot()

#and then restore safely
$blob.CopyFromBlob($allSnapshots[$allSnapshots.Length-1])
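
The same steps can be bundled into a small helper so that the backup snapshot is never forgotten. This is a sketch only: Restore-Snapshot and its parameters are hypothetical names, and the null check is an extra guard for the case where no snapshots were found.

```powershell
# Hypothetical helper: snapshot the live blob, then overwrite it with $Snapshot.
# Returns the backup snapshot so the caller can keep track of it.
function Restore-Snapshot {
    param($Blob, $Snapshot)

    # Refuse to continue if there is nothing to restore from
    if ($null -eq $Snapshot) { throw "No snapshot to restore from." }

    # Take a backup snapshot of the current contents first
    $backup = $Blob.CreateSnapshot()

    # Then overwrite the live blob with the chosen snapshot
    $Blob.CopyFromBlob($Snapshot)

    $backup
}
```

It could then be called as Restore-Snapshot -Blob $blob -Snapshot $allSnapshots[-1] to restore the most recent snapshot.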

One important thing to note about this approach: it simply restores the disk. Understandably, Azure gets touchy about you overwriting the disk of a live machine, so you will have to make sure that the VM is shut down and the disk is detached from it before you can restore.

Obviously this is not a complete backup solution, but it can quickly give you a means to recover your Azure virtual machines to an earlier point. Everything is in PowerShell, so it can easily be scheduled to run automatically at regular intervals, and the snapshots are fully accessible, so a more in-depth backup process can pull them to another location as required.