You know how sometimes you hear or read something and realize that whatever was said could be interpreted in multiple ways? I thought about that when contemplating the title of this post. Please join me as I travel along a few possible paths of meaning.
A literal interpretation of “Managing Analytic Workloads with Cloud” could pertain to the topic of administering a system deployed in the cloud, the purpose of which is to do analytics. The thrust of this path might discuss the benefits and drawbacks of an as-a-service environment when compared to a do-it-yourself deployment, for example.
As-a-service implies a managed environment in which the service provider takes care of the infrastructure while the customer focuses on using the capabilities that run on it. It’s like having a race car pit crew that comes with your vehicle to handle all the little (but necessary) details like tire tread depth, proper tire pressure, working instrument panel, clean windscreen, and topped-up gasoline to give you the best possible chance of winning the race.
Do-it-yourself, on the other hand, implies that the driver takes care of everything himself or herself (or hires and trains individuals to do some or all of that work). It’s not necessarily better or worse, but depending on one’s skill, experience, and proficiency, the ability to go both fast and far could be limited if all you have is your own toolbox. But it can be done.
A second interpretation of “Managing Analytic Workloads with Cloud” is a bit more meta – as in deciding which workloads will go to the cloud and which ones will remain on-premises. The thrust of this path involves picking the right environment for the job regardless of who is going to be managing those capabilities as discussed above.
For example, the cloud is an ideal environment for workloads that are not persistent, such as test/dev or discovery analytics. With the ability to spin up quickly, use as much as needed, tear down or put to sleep when done, and pay only for what’s been used, the cloud is wonderful for such bursty workloads. These are the poster children of cloud.
The opposite is also true: persistent, semi-steady workloads in which you commit to persistent resources with a term subscription (to get a much lower price) can also be a good fit for the cloud. This is especially the case if there are “network effects” involved – as in myriad other services, data sources, applications, and similar benefits that naturally draw you and your user community to the cloud. Everyone likes a good watering hole near the office.
Centralization (at least from a logical perspective, if not a physical one) can make a lot of sense here – or not, depending on how much on-premises data, how many applications, and how much technical debt are at play. There is no one-size-fits-all, and the right choice is typically a function of business priorities, timeframe, and budget.
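To make the bursty-versus-steady trade-off a little more concrete, here is a rough back-of-the-envelope sketch in Python. The prices and function names are made up purely for illustration (they are not any vendor's actual rates or API); the point is simply that there is a break-even number of hours per month above which a term commitment becomes the cheaper option.

```python
def break_even_hours(on_demand_rate_per_hour: float,
                     committed_cost_per_month: float) -> float:
    """Hours of use per month at which both plans cost the same."""
    if on_demand_rate_per_hour <= 0:
        raise ValueError("on-demand rate must be positive")
    return committed_cost_per_month / on_demand_rate_per_hour


def cheaper_plan(hours_used: float,
                 on_demand_rate_per_hour: float,
                 committed_cost_per_month: float) -> str:
    """Return which plan costs less for a given month of usage."""
    on_demand_cost = hours_used * on_demand_rate_per_hour
    return "on-demand" if on_demand_cost < committed_cost_per_month else "committed"


# Hypothetical prices: 4.00 per hour on demand, or 1460.00 per month committed.
RATE = 4.0
COMMITTED = 1460.0

threshold = break_even_hours(RATE, COMMITTED)
bursty = cheaper_plan(80, RATE, COMMITTED)    # test/dev used a few hours a day
steady = cheaper_plan(730, RATE, COMMITTED)   # running around the clock
print(threshold, bursty, steady)
```

With these invented numbers, a test/dev system used 80 hours a month is better off paying as it goes, while an always-on system is better off committing. Your own break-even point will depend on your actual pricing and usage.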
A third interpretation of “Managing Analytic Workloads with Cloud” pushes us deep in the opposite, non-meta direction – as in literally managing the work being done (i.e., queries) to deliver the best combination of performance (query response time) and cost (resources consumed). The thrust here is about granular control amid contention, or the ability to prioritize in situations when there is not enough supply to handle all demand.
Some would have you believe that there is no need for workload management because the cloud has near-infinite capacity and resources can always be auto-scaled to handle increasing demand. It’s magic, right? Well, that would indeed be great if additional resources were free or infinitesimal in cost, but that’s simply not the case. So, unless your budget also has auto-scale with near-infinite capacity, there is little cause for celebration if “throwing resources at the problem” is your vendor’s approach.
On the other hand, a more elegant way to address resource contention is the ability to prioritize based on query type and user role. For example, one could imagine the need for a CFO to get priority access and faster query response times over ongoing production workloads, or tactical queries not being log-jammed by a particularly complex analytic query that might otherwise suck up all the resources in the system.
In these cases, powerful workload management – which, unlike hardware, does not incur additional cost – is the way to go. Frankly, these days software always trumps hardware, and especially sophisticated software runs circles around First In, First Out or the brute force of “use a bigger hammer” every day of the week.
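For the curious, here is a minimal sketch of what "prioritize by query type and user role" could look like compared with plain First In, First Out. It is a toy illustration in Python, not how any particular database implements workload management; the role names, query types, and weights are all assumptions I made up for the example.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical weights: a lower number means "run sooner".
ROLE_PRIORITY = {"executive": 0, "service": 1, "analyst": 2}
TYPE_PRIORITY = {"tactical": 0, "production": 1, "complex_analytic": 2}


@dataclass(order=True)
class QueuedQuery:
    priority: int
    sequence: int                    # tie-breaker: arrival order
    name: str = field(compare=False)


class WorkloadQueue:
    """Releases queries by priority, falling back to arrival order on ties."""

    def __init__(self) -> None:
        self._heap: list[QueuedQuery] = []
        self._arrivals = itertools.count()

    def submit(self, name: str, role: str, query_type: str) -> None:
        priority = ROLE_PRIORITY[role] + TYPE_PRIORITY[query_type]
        heapq.heappush(self._heap, QueuedQuery(priority, next(self._arrivals), name))

    def next_query(self) -> str:
        return heapq.heappop(self._heap).name


queue = WorkloadQueue()
queue.submit("big_model_scoring", role="analyst", query_type="complex_analytic")
queue.submit("nightly_load", role="service", query_type="production")
queue.submit("cfo_dashboard", role="executive", query_type="tactical")
queue.submit("order_lookup", role="analyst", query_type="tactical")

run_order = [queue.next_query() for _ in range(4)]
print(run_order)
```

Under strict FIFO, the complex analytic query that arrived first would be at the head of the line. Here the CFO's dashboard goes first and the heavy query goes last, without adding any hardware.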
So, there you have it, folks: my three interpretations of “Managing Analytic Workloads with Cloud”. Which interpretation first came to mind for you, and do you agree with me that all three interpretations are valid? I’d love to hear from you – so let me know!