What is topping your list of considerations in your migration to the cloud? If Geographic Architecture isn’t near the top, it should be.
What is a Geographic Architecture, you may ask? It is the physical placement of your analytic ecosystem elements: which data sources, data stores, and data users sit in which location. The bigger question is why you should care.
Pull out a copy of your current Analytic Ecosystem Architecture and take a look. Chances are, like most companies, your architecture is centrally located and interconnected by high-bandwidth, low-latency Local Area Networks (LANs). Look at how many data connection lines interconnect the ecosystem elements. You probably have many data sources producing lots of data and feeding the data stores. These include your company’s transaction systems as well as other systems that capture the status of various business processes. You also have lots of data users connecting to the data stores as they analyze your business.
Now imagine moving the data stores out of the central location and into the cloud. What happens to your analytic ecosystem architecture? On the diagram, the many lines connecting to the data stores simply get stretched as the data stores move to the remote data center. But what really happens is that all those nice LAN connections get replaced by bandwidth-limited, high-latency Wide Area Networks (WANs).
This is where it really gets interesting. WANs are notorious for being unfriendly to moving large amounts of data across a geographic distance. If you have been around for a while like me, then you remember when the client-server applications of the 1990s were first tried over a WAN. It didn’t work out so well. That’s how companies like Citrix rose to fame as they solved the WAN data movement issue by just transporting screen and keyboard information.
Analytic ecosystems move lots of data. Unfortunately, the Citrix type of solution won’t be able to solve all the data movement issues over the WAN. The only real way to move lots of data over a WAN efficiently is to use well-tuned parallel data streams. If you can do that, then the WAN latency issue can be minimized.
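To see why parallel streams help, here is a rough back-of-the-envelope sketch in Python. It assumes a windowed protocol such as TCP, where a single stream can have at most one window of data in flight per round trip, so its throughput is bounded by window size divided by round-trip time. The function names, the 64 KiB window, the 50 ms round trip, and the 1,000 Mbps link are all illustrative assumptions, and the model ignores packet loss, congestion control, and window scaling.

```python
def stream_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound for one windowed stream: one window per round trip."""
    bits_per_round_trip = window_bytes * 8
    round_trip_seconds = rtt_ms / 1000
    return bits_per_round_trip / round_trip_seconds / 1_000_000


def aggregate_throughput_mbps(
    streams: int, window_bytes: int, rtt_ms: float, link_mbps: float
) -> float:
    """Parallel streams add up until the link bandwidth becomes the cap."""
    uncapped = streams * stream_throughput_mbps(window_bytes, rtt_ms)
    return min(uncapped, link_mbps)


if __name__ == "__main__":
    window = 64 * 1024   # assumed 64 KiB window per stream
    wan_rtt_ms = 50.0    # assumed WAN round-trip time
    link_mbps = 1000.0   # assumed WAN link bandwidth

    for n in (1, 4, 16, 64):
        total = aggregate_throughput_mbps(n, window, wan_rtt_ms, link_mbps)
        print(f"{n:>2} stream(s): about {total:.1f} Mbps")
```

Under these assumptions a single stream uses only a small fraction of the link, which is why latency, not bandwidth, is often the real limit for single-stream tools over a WAN.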
Now look at the various ecosystem elements in your architecture. Do you know if they all support parallel data streams? Knowing that is going to be your number one task as you develop your Geographic Architecture. Chances are there are going to be some elements that do not support parallel data streams. You want to know that sooner rather than later. Sometimes the only solution to that dilemma is to collocate the ecosystem elements in the same geographic space.
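One way to approach that inventory is to list each element with its planned location and whether it supports parallel data streams, then flag every connection that would cross the WAN without that support. The sketch below is only an illustration: the `Element` class, the `risky_links` function, and the element and site names are hypothetical, not part of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    site: str               # planned location, e.g. "on_prem" or "cloud"
    parallel_streams: bool  # does it support parallel data streams?


def risky_links(elements, links):
    """Return links that cross sites where either end lacks parallel streams."""
    by_name = {e.name: e for e in elements}
    flagged = []
    for src, dst in links:
        a, b = by_name[src], by_name[dst]
        crosses_wan = a.site != b.site
        if crosses_wan and not (a.parallel_streams and b.parallel_streams):
            flagged.append((src, dst))
    return flagged


if __name__ == "__main__":
    # Hypothetical inventory for illustration only.
    elements = [
        Element("data_store", "cloud", True),
        Element("etl_tool", "on_prem", True),
        Element("dashboard_app", "on_prem", False),
    ]
    links = [("etl_tool", "data_store"), ("data_store", "dashboard_app")]
    for src, dst in risky_links(elements, links):
        print(f"Review placement: {src} -> {dst}")
```

Each flagged link is a candidate for collocating the two elements in the same geographic space.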
I know of one customer that had to move their cloud system back on-premises because they didn’t realize they had an important dashboard app that pulled data from separate data stores that were now geographically dispersed. That dashboard app didn’t support parallel data streams and it no longer performed to expectations. To say the migration back was painful is an understatement.
At Teradata, we used to have a marketing tagline of “Born to be Parallel.” That was true in the first database and it is still true today. And that parallelism extends to data movement both external to and within our Vantage solution. So, if part of your analytic ecosystem includes Vantage (or the earlier versions of the Teradata database) then you are lucky and can skip analyzing those elements knowing they can support parallel data streams. You will just need to concentrate on all the other ecosystem elements.
So, geography does matter in your Analytic Cloud Architecture. While there are many considerations in deciding where to place your analytic ecosystem elements, including cost and preference, ultimately it is how well the elements can transport data over the WAN that determines the placement.