Migrating Physical Data Center Operations to the Cloud
Use Pay-as-You-Go Cloud to Replace the Functionality and Data of a Physical Data Center
Expensive Data Center Ops to Cloud for Pennies
Cloud migration is the process of moving data, applications or other business elements to a cloud computing environment. There are various types of cloud migrations an enterprise can perform. One common model is the transfer of data and applications from a local, on-premises data center to the public cloud.
Cost efficiency
Ultimately, the underlying driver for moving from a legacy on-premise enterprise data warehouse to the cloud is cost efficiency. Cost encompasses not just the direct cost of owning or licensing the data center equipment, but also the cost of third-party services, maintenance, management, training, and policy. These indirect costs can make up a significant proportion of the expense of running on-premise data centers.
Let’s look closer at the cost of maintenance, management, and training.
Maintaining on-prem computing infrastructure requires space, construction, electricity, air conditioning, hardware procurement, operating systems, networking, physical security, and more. You can instead shift these costs to a cloud service provider, which spreads them across many customers and can save you a substantial amount of money.
Managing an on-prem environment also takes deep expertise and experience. With the unstoppable trend towards the cloud, finding and training engineering talent for on-prem systems will become harder and more expensive. Recruiting specialists, managing the project for every new software installation, and handling monitoring and incident response add up to a tremendous hassle that can cost your organization much more than the actual computing resources. Conversely, running and managing data centers is the core business of cloud service providers, who operate in an aggressively growing market and save you this headache.
A Proposed Migration Plan
A common misconception about cloud migration is that it is a one-time trip. In reality, migrating data infrastructure to the cloud should happen gradually. A successful migration should feel as seamless as possible to the organization, so work isn’t disrupted. The journey somewhat resembles the true story of the 59-story Citigroup Center in New York, which underwent critical structural reinforcements performed nightly over several months so that the public and the building’s occupants wouldn’t notice. Likewise, when planning a major data migration, the goal is to do everything step by step while minimizing downtime and disruption to users. This is why you should be prepared for a migration process that can take anywhere from several months to a year.
Step 1 - Migrate Existing Data
The first step is to create an initial copy of your existing data in a cloud data warehouse. This process will require choosing the right cloud data warehouse for your organization, and then making an initial copy of all your data.
There are two main challenges in this step. The first is to experiment and choose the right infrastructure for your organization. To do this, you might select a smaller data set and migrate it to several different data warehouses for comparison. The second challenge is to copy all of your existing data. This could be hundreds of terabytes, an amount that is not easily transferred to the cloud over the internet. Google and Amazon both provide ways around this challenge by physically transporting your data on storage devices, shipped by mail or even hauled by truck.
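To see why network transfer becomes impractical at this scale, a rough estimate helps. The sketch below is only a back-of-the-envelope calculation: the function name, the 1 Gbps link, and the 80% sustained utilization are illustrative assumptions, not measurements from any particular provider.

```python
def transfer_days(data_terabytes: float, link_gbps: float,
                  utilization: float = 0.8) -> float:
    """Estimate the days needed to push a data set over a network link.

    Assumes decimal units (1 TB = 1e12 bytes) and that only a fraction
    of the nominal bandwidth (`utilization`) is sustained in practice.
    """
    bits = data_terabytes * 1e12 * 8                 # terabytes to bits
    effective_bps = link_gbps * 1e9 * utilization    # usable bits per second
    seconds = bits / effective_bps
    return seconds / 86_400                          # seconds per day


if __name__ == "__main__":
    for terabytes in (10, 100, 500):
        days = transfer_days(terabytes, link_gbps=1.0)
        print(f"{terabytes:>4} TB over 1 Gbps: {days:.1f} days")
```

Under these assumptions, 100 TB takes roughly 11.6 days of uninterrupted transfer and 500 TB close to two months, which is the point at which shipping physical storage starts to look attractive.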
Be aware that copying the raw data is just one part of this initial migration. You must verify the format and schema of the data you export from your warehouse, and then import your data’s schema into the cloud data warehouse before actual loading. Lastly, to continue onto step 2, you must mark the point in time of the exported snapshot, and use it when setting up an ongoing replication mechanism.
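The schema check and the snapshot marker can be sketched in a few lines. This is a minimal illustration rather than a real warehouse integration: `schema_diff`, `SnapshotMarker`, and `mark_snapshot` are hypothetical names, and the schemas are assumed to have already been normalized into plain `{column: type}` mappings with comparable type names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


def schema_diff(source: dict, target: dict) -> list:
    """Return readable mismatches between two {column: type} mappings."""
    problems = []
    for column, col_type in source.items():
        if column not in target:
            problems.append(f"missing in target: {column}")
        elif target[column] != col_type:
            problems.append(
                f"type mismatch on {column}: {col_type} vs {target[column]}"
            )
    # Columns that exist only in the target, sorted for stable output.
    for column in sorted(target.keys() - source.keys()):
        problems.append(f"unexpected in target: {column}")
    return problems


@dataclass(frozen=True)
class SnapshotMarker:
    """Records when a table was exported, for use as the replication start."""
    table: str
    exported_at: datetime
    row_count: int


def mark_snapshot(table: str, row_count: int) -> SnapshotMarker:
    # Call this at the moment the export snapshot is taken, not afterwards.
    return SnapshotMarker(table, datetime.now(timezone.utc), row_count)


if __name__ == "__main__":
    exported = {"id": "INTEGER", "email": "STRING", "created_at": "TIMESTAMP"}
    loaded = {"id": "INTEGER", "email": "STRING"}
    print(schema_diff(exported, loaded))
    print(mark_snapshot("customers", row_count=1200))
```

Persisting the marker alongside the export gives the replication process in step 2 an unambiguous starting point.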
Best Practices in Cloud Migration
Performance
The downside of investing everything in your own on-premise environment is inflexibility. It is a long-term commitment that hinders your organization from adopting the latest and greatest technology. Financially, the upfront CAPEX of buying the equipment, combined with depreciation as it ages, adds up very quickly too.
On the flip side, cloud service infrastructure affords you flexibility and the best equipment in the industry: the newest Intel processors, the most optimized RAM, petabit networking, SSD storage, arrays of GPUs, and even cloud programmable hardware. This comes with a wide ecosystem of software platforms: managed databases and warehouses, orchestration platforms, caching, queueing systems, and just about anything you need for any software architecture. The impact on your business is virtually immediate, especially if you operate in a competitive market where leaders move fast.
Security
Many who have on-premise data infrastructure are concerned about security. The reality is that it’s hard to be more secure and in control than when your data center is housed within your own perimeter. Ultimately, though, on-premise security is only as reliable as the security policies put in place and enforced. This policy overhead is not insignificant and can’t be ignored.
Step 2 - Set up Ongoing Replication
After exporting the first snapshot of your on-prem data warehouse and copying it to your cloud data warehouse, the next step is to set up an ongoing synchronization process. Ongoing replication is more complicated than a single copy operation, as it is actually a series of incremental copy operations.
Each operation requires capturing changes to the data and its schema and applying those changes to the cloud data warehouse. Some changes, like deleted data or altered column types, may require tailored solutions in order to be applied to the cloud data warehouse. More technical challenges will be described later in this post.
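One incremental copy operation might look like the sketch below. It is one possible approach, not the only one: it assumes the source table carries an `updated_at` change counter and an `is_deleted` soft-delete flag (both hypothetical columns), and it uses two in-memory SQLite databases as stand-ins for the on-prem and cloud warehouses. Log-based change capture is a common alternative, and schema changes such as altered column types are not handled here.

```python
import sqlite3


def replicate_changes(source, target, since):
    """Apply rows changed after `since` to the target; return the new watermark."""
    rows = source.execute(
        "SELECT id, name, updated_at, is_deleted FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (since,),
    ).fetchall()
    watermark = since
    for row_id, name, updated_at, is_deleted in rows:
        if is_deleted:
            # Deletions need explicit handling; a plain copy would miss them.
            target.execute("DELETE FROM customers WHERE id = ?", (row_id,))
        else:
            # Upsert: insert new rows, overwrite rows that already exist.
            target.execute(
                "INSERT OR REPLACE INTO customers (id, name, updated_at) "
                "VALUES (?, ?, ?)",
                (row_id, name, updated_at),
            )
        watermark = max(watermark, updated_at)
    target.commit()
    return watermark


def demo():
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, "
        "updated_at INTEGER, is_deleted INTEGER DEFAULT 0)"
    )
    target.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, "
        "updated_at INTEGER)"
    )
    # State at the initial snapshot (watermark 100), already copied to target.
    source.executemany(
        "INSERT INTO customers VALUES (?, ?, ?, 0)",
        [(1, "Ada", 90), (2, "Grace", 95)],
    )
    target.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ada", 90), (2, "Grace", 95)],
    )
    # Changes made on-prem after the snapshot: update, soft delete, insert.
    source.execute("UPDATE customers SET name='Ada L.', updated_at=110 WHERE id=1")
    source.execute("UPDATE customers SET is_deleted=1, updated_at=120 WHERE id=2")
    source.execute("INSERT INTO customers VALUES (3, 'Edsger', 130, 0)")
    source.commit()

    watermark = replicate_changes(source, target, since=100)
    rows = target.execute("SELECT * FROM customers ORDER BY id").fetchall()
    return watermark, rows


if __name__ == "__main__":
    print(demo())
```

Each run starts from the watermark returned by the previous one, so the first run would start from the point in time recorded for the initial snapshot.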
Any synchronization solution should be benchmarked for latency and reliability, as these parameters are crucial to the success of the organization’s migration to the cloud. You may build this synchronization out yourself, or use a data pipelining service to handle the continuous replication of data and schemas. Once this foundation level is secured, you may go ab