John,
Azure SQL would be a decent choice, or if that proves to be too expensive, regular SQL hosted on a VM. You can create an Azure Service Bus to hold the incoming requests, and then create competing consumers on 1 or more worker roles to monitor and process the messages. Each consumer can run the SQL and persist the data in a new table that is created and "pre-aggregated" for the caller, or you could persist the information to Azure BLOB storage in a structured format that matches your reporting tool (i.e. JSON). BLOB storage of the aggregated information will be the most cost effective, and relieve strain on SQL.
An alternative would be HDInsight which can aggregate the information in batch processing mode as well. I guess the choice between SQL/HDInsight depends on the native format of the base (non-aggregated) information.
manpreet
Best Answer
2 years ago
There is a lot of information out there about the various Azure data storage flavors however I'd like to ask for some advice for my particular scenario.
I'm putting together a pet project to become more familiar with Azure technology, in particular, Service Bus/Event Hubs and data storage platforms. The system I want to create is fairly simple: accept a moderate load of events (not IoT scale), persist them, and make aggregated data available such as 'User A had N events of type X in the past day/week/month/etc.' as reports.
Given that the data will be quite structured (e.g. users, user groups, events, etc.), and I will need aggregation capabilities, it suggests that relational storage may be the best fit, although more expensive.
Another alternative I've considered is to maintain aggregated data at near real-time using something like stream analytics but not sure if this is overkill compared to a more data warehouse-esque solution.
Any suggestions/help would be greatly appreciated.
John