Back in June 2013 I built an Azure architecture prototype that collected Twitter data, stored it, analysed it and presented the results through a web interface. It’s been running for a number of months now, not needing any attention and doing a great job. As every experienced IT Architect knows, the true test of a good architecture comes when the client needs to make some changes. That time has arrived: I’m now looking at how to implement a more sophisticated analysis of the structure of a targeted sub-graph of tweeters.
Let me explain the parts I’ve evaluated: I’ve deliberately only used Platform-as-a-Service (PaaS) components and avoided VMs, i.e. Infrastructure-as-a-Service (IaaS), because I believe PaaS will give organisations the greatest impact when adopting Cloud (but that’s another topic). The specific components from my architecture prototype are listed below.
Worker Roles: think of these as being like Windows services; normally they sit in a loop picking up, and responding to, tasks. I’ve been impressed with how easy it is to create, configure and deploy worker roles, although debugging has been harder. I knew that a graph database would be ideal for more sophisticated analysis of Twitter data, and I thought this would inevitably require using IaaS to install a VM. But no: I found a really fantastic example of what a worker role is capable of. It can be configured to connect to a VHD, dynamically install Java, install a Java application and then start it; voila, Neo4j running on a worker role! So worker roles: definitely make use of them.
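To make the "sit in a loop picking up tasks" idea concrete, here is a minimal sketch in Python. It is not the code from my prototype and it does not use any Azure API: the standard library `queue.Queue` stands in for whatever task source the role polls, and `handle_task`, `run_worker` and `max_idle_polls` are names invented purely for illustration.

```python
import queue
import time


def handle_task(task):
    """Hypothetical task handler: here it just counts the words in a tweet."""
    return len(task.split())


def run_worker(tasks, max_idle_polls=3, idle_sleep_seconds=0.01):
    """Poll `tasks` in a loop, process whatever arrives, back off when idle.

    A real worker role would loop forever; `max_idle_polls` exists only so
    that this sketch terminates.
    """
    results = []
    idle_polls = 0
    while idle_polls < max_idle_polls:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            idle_polls += 1
            time.sleep(idle_sleep_seconds)  # nothing to do, wait before polling again
            continue
        idle_polls = 0
        results.append(handle_task(task))
    return results


# In-memory stand-in for the task source (an assumption, not an Azure queue).
pending = queue.Queue()
for text in ["hello azure", "worker roles sit in a loop"]:
    pending.put(text)

processed = run_worker(pending)
print(processed)
```

The sleep between empty polls is the important design choice: without it an idle worker would spin at full speed while doing nothing useful.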
Table Storage provides a basic mechanism to store structured data. It’s fast (once you understand how it is indexed), it appears to scale well (though I’ve not pushed it that far) and it is very, very low cost. However, there is a downside: Table Storage is nowhere near as flexible as a relational database. It does not support joins, and you cannot add additional indexes or specify sort orders or group-by clauses. None of these limitations is a problem if your requirements are straightforward and not going to change, but when I look at how to get from what I’ve got to what I want, it’s going to be a lot of work compared to a relational database. In conclusion, look to use Table Storage where you want to store simple, discrete entities, you don’t mind doing a bit of data manipulation in code (e.g. summarising, joining) and requirements are unlikely to change much. For my next Azure architecture prototype I will be using Azure’s PaaS relational database: Windows Azure SQL Database.
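As a rough illustration of what "a bit of data manipulation in code" means, the sketch below does a group-by, a join and a sort by hand in Python. The entities are made up and held in plain lists; only the `PartitionKey` and `RowKey` fields mirror how Table Storage addresses an entity, and every other name (`Text`, `Followers`, `report`) is an assumption for the example rather than anything from my prototype.

```python
from collections import Counter

# Hypothetical entities shaped like Table Storage rows: each is addressed by
# a PartitionKey and a RowKey, plus whatever properties it carries.
tweets = [
    {"PartitionKey": "alice", "RowKey": "0001", "Text": "Trying out worker roles"},
    {"PartitionKey": "alice", "RowKey": "0002", "Text": "Table Storage is cheap"},
    {"PartitionKey": "bob", "RowKey": "0001", "Text": "Neo4j on Azure"},
]
users = [
    {"PartitionKey": "users", "RowKey": "alice", "Followers": 120},
    {"PartitionKey": "users", "RowKey": "bob", "Followers": 45},
]

# "GROUP BY" in code: count tweets per author.
tweets_per_user = Counter(t["PartitionKey"] for t in tweets)

# "JOIN" in code: index one table by key, then look it up while walking the other.
users_by_name = {u["RowKey"]: u for u in users}
report = [
    {
        "User": name,
        "Tweets": count,
        "Followers": users_by_name[name]["Followers"],
    }
    for name, count in tweets_per_user.items()
]

# "ORDER BY" in code: sort on a property that is not part of the key.
report.sort(key=lambda row: row["Followers"], reverse=True)
print(report)
```

Each of these steps is a single clause in SQL, which is why changing requirements cost more here than they would with a relational database.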
Websites come in a number of flavours, the most basic of which is an instance on a multi-tenanted IIS server (and free!). My Websites were pretty basic: connect to Table Storage, pull back the data and display it on the web page. Again, the build, deploy and configure steps are all very smooth. I can’t say I really pushed Websites to the limit, but I have seen other, more sophisticated sites which seem to be just as easy to manage and which perform very well. Azure Websites: yes, use them.
Blobs are for storing files and other chunky bits of data. They have worked fine for me, with no problems at all. I have no reservations in recommending their use when appropriate.
Queues: I’m talking about the storage service queues, not Service Bus. I did not use these in my architecture prototype, but I have experimented with them; from what I’ve seen, I think they will do a great job in the right place.
Worker Roles, Websites, Blobs, Queues: great, use them.
Table Storage: maybe, think carefully – is it right for your problem?