Cleanup blob storage

This time I want to share a small script that I created and that turned out to be useful in more scenarios than I expected. Sometimes I see blob storages growing at customers, full of files that are no longer in use. That can be the result of a job, or because the storage is used by a couple of developers, or because Development and Test share the same storage, or… On top of that, there is a "known issue" on the Episerver side: it sometimes (or always?) doesn’t remove the actual blob when you remove the media item via code (read "ContentRepository.Delete(..)") (1). For all these cases I created this small script, which simply copies only the blobs that exist in the Episerver DB to a brand new, clean container.

Why not just clean up the current storage?

The answer to that is easy: I found it too risky to remove stuff from an existing storage. I want to create a "backup" situation first and then move on from there.

What steps do I need to take?

  1. First create a new container in the destination storage account, named "mysitemedia" or whatever name you use in your Episerver installation ("mysitemedia" is the default and is also what the script uses).
  2. Get and set the right connection strings for the storage accounts, for both the source (connBlobStorage) and the destination (destConnBlobStorage).
  3. Run the script with LINQPad and make sure the script's database connection points to the Episerver database of your choice.
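// Requires the Azure Storage client library. Assumption: the legacy WindowsAzure.Storage NuGet package,
// with the namespaces Microsoft.WindowsAzure.Storage and Microsoft.WindowsAzure.Storage.Blob added via F4 in LINQPad.
// https://docs.microsoft.com/en-us/learn/modules/copy-blobs-from-command-line-and-code/7-move-blobs-using-net-storage-client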

var connBlobStorage =  "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...";
var destConnBlobStorage = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...";

$"Job started at {DateTime.Now.ToShortDateString()} {DateTime.Now.TimeOfDay}".Dump();

CloudStorageAccount storageAccount;
CloudStorageAccount destStorageAccount;

// Check whether the connection string can be parsed.
if (CloudStorageAccount.TryParse(connBlobStorage, out storageAccount) && CloudStorageAccount.TryParse(destConnBlobStorage, out destStorageAccount))
{
	#region Logging

	var itemsCopied = 0;
	DumpContainer dcFile = new DumpContainer();
	DumpContainer dcTotals = new DumpContainer();
	var notFound = new List<string>();

	#endregion
	
	#region Clients and Containers
	
	var sourceClient = storageAccount.CreateCloudBlobClient();
	var sourceContainer = sourceClient.GetContainerReference("mysitemedia");

	var destClient = destStorageAccount.CreateCloudBlobClient();
	var destContainer = destClient.GetContainerReference("mysitemedia");
	
	#endregion

	#region Helpers
	
	Func<string, CloudBlobContainer, string> getSharedAccessUri = (string blobName, CloudBlobContainer container) =>
	{
		DateTime toDateTime = DateTime.Now.AddMinutes(60);

		SharedAccessBlobPolicy policy = new SharedAccessBlobPolicy
		{
			Permissions = SharedAccessBlobPermissions.Read,
			SharedAccessStartTime = null,
			SharedAccessExpiryTime = new DateTimeOffset(toDateTime)
		};

		CloudBlockBlob blob = container.GetBlockBlobReference(blobName);
		string sas = blob.GetSharedAccessSignature(policy);

		return blob.Uri.AbsoluteUri + sas;
	};

	Func<ICloudBlob, CloudBlobContainer, bool> copyBlob = (ICloudBlob source, CloudBlobContainer destination) =>
	{
		if (source != null)
		{
			var destBlob = destination.GetBlockBlobReference(source.Name);
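			// StartCopy only schedules a server-side copy and returns its copy ID; it does not wait for the copy to finish.
			return !string.IsNullOrWhiteSpace(destBlob.StartCopy(new Uri(getSharedAccessUri(source.Name, source.Container))));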
		}
		
		return false;
	};
	
	#endregion

	var dbBlobs = TblContentLanguages
		.Where(tcl => tcl.BlobUri != null)
		.Select(tcl =>
			tcl.BlobUri.Replace("epi.fx.blob://default/", string.Empty)
		).ToList();
	
	$"{ dbBlobs.Count() } items found in Episerver.".Dump();
	
	dcFile.Dump();
	dcTotals.Dump();

	foreach(var blobname in dbBlobs)
	{
		dcFile.Content = $"Start copying file: {blobname}";

		try
		{
			if (copyBlob(sourceContainer.GetBlobReferenceFromServer(blobname), destContainer))
				itemsCopied++;
		}
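		catch(StorageException ex)
		{
			// Check the HTTP status code first (404 = blob missing on the source); the message text is kept as a fallback.
			if(ex.RequestInformation?.HttpStatusCode == 404 || ex.Message.Contains("not exist"))
			{
				notFound.Add(blobname);
			}
			else
			{
				$"Unexpected exception for '{blobname}': {ex.Message}".Dump();
			}
		}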
			
		dcTotals.Content = $"{itemsCopied} items found on the source storage and queued for copy to the destination.";
	}
	
	"Files not found on the source are in this list:".Dump();
	notFound.Dump();
}
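
If you first want to see what would be left behind, without copying anything, the same idea can be tried as a dry run. Below is a minimal, standalone sketch of that comparison. It assumes you already have the blob names of the source container in a list (for example from a container listing), and it strips the same "epi.fx.blob://default/" prefix as the script above. The helper names ToBlobName and FindOrphans and the sample data are made up for illustration; they are not part of the script or of any Episerver or Azure API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Prefix that the script above strips from TblContentLanguages.BlobUri.
const string BlobUriPrefix = "epi.fx.blob://default/";

// Hypothetical helper: turns a database blob URI into a storage blob name.
string ToBlobName(string blobUri) =>
    blobUri.StartsWith(BlobUriPrefix, StringComparison.OrdinalIgnoreCase)
        ? blobUri.Substring(BlobUriPrefix.Length)
        : blobUri;

// Hypothetical helper: returns the storage blob names that no database row points to.
List<string> FindOrphans(IEnumerable<string> storageNames, IEnumerable<string> dbBlobUris)
{
    // Blob names are case-sensitive in Azure Storage, so compare ordinally.
    var referenced = new HashSet<string>(dbBlobUris.Select(ToBlobName), StringComparer.Ordinal);
    return storageNames.Where(name => !referenced.Contains(name)).ToList();
}

// Made-up sample data, for illustration only.
var inDatabase = new[] { "epi.fx.blob://default/aaa111/photo.jpg" };
var inStorage = new[] { "aaa111/photo.jpg", "bbb222/old-export.zip" };

foreach (var orphan in FindOrphans(inStorage, inDatabase))
    Console.WriteLine($"Not referenced by Episerver: {orphan}");
```

With the sample data, only "bbb222/old-export.zip" is reported, since nothing in the database list points to it. Because this only compares two lists, it is safe to run against production data before deciding to do the actual copy.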

(1) – In this case you can also force the file to be deleted via the blob factory.
