Thursday, April 7, 2011

Cloud Economics

This was a "quick" answer to another post on my company blog (no, no public access).

My point of view is simple: when considering cloud as a new paradigm, elasticity is a KEY component. Without elasticity, you're just a cloud wannabe.

So, speaking strictly of costs, I agree with the statement that "Cloud is going from CapEx to OpEx". However, I would also like to stress the following:
Cloud should reduce the combined total of CapEx and OpEx.
  • Decreasing CapEx because of dynamism
    You optimize your infrastructure resources and so buy less infrastructure (or you simply lease it).
  • Decreasing OpEx because of dynamism & automation
    • The cost of adding/removing instances, services, etc. is near 0
    • Taking Hamilton's estimates [1], in a very large data center 1 admin should be able to run >1000 servers alone, instead of ~140 in a medium data center.
    • It seems that for Google it is around 1 admin per 10,000 servers, and they have a goal of 1 admin per 100,000 servers! [2]
  • Decreasing OpEx because of an external service
    Indeed, customers using a public cloud get rid of storage management, system upgrades, ...
There are two main factors that reduce cloud users' costs [3 (slides 7-10)]:
  • Elasticity
    • Here is a common strategy for buying hardware in a typical static data center:
      1. Evaluate what the demand peak will be
      2. Buy hardware supporting twice this demand
      3. End up with an underutilized infrastructure (typically 10-20% resource usage on average)
    • Elasticity enables customers to pay only for what they are really consuming!
      • The "pay as you go" business model … because you don't have to overprovision your infrastructure.
      • The "pay as you grow" model: for services that grow suddenly, you are able to serve the demand as it grows. If you had to acquire and install infrastructure instead, you would need good capacity planning and good estimates, which is not an easy task.
    • It is worth noting that private cloud leaders like Google, Amazon et al. started to sell their cloud because it lets them amortize their own cloud infrastructure by monetizing its idle time.
  • Economies of Scale
    Here are the costs presented by Hamilton for current typical data centers:
      Resource         Cost in medium DC       Cost in very large DC
      Network          $95 / Mbps / month      $13 / Mbps / month
      Storage          $2.20 / GB / month      $0.40 / GB / month
      Administration   ~140 servers / admin    >1000 servers / admin
    • This means the cost difference between medium and very large data centers is between 5x and 7x! A huge difference. In the case of Google, some speak of 100x! [1, 2]
    • I would also like to point out that these are not projections. And as engineers, we should also think in terms of what we'll get tomorrow. In other words, the economies of scale will be yet another order of magnitude higher for exascale data centers.
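
To make the 5-7x figure concrete, here is a small Python sketch that recomputes the ratios from Hamilton's numbers as quoted above. The dictionary keys and the function name are my own, chosen for illustration; ">1000 servers / admin" is taken at its lower bound of 1000, so the administration ratio is a minimum.

```python
# Hamilton's estimates as quoted in this post: (medium DC, very large DC).
COSTS = {
    "network ($/Mbps/month)": (95.0, 13.0),
    "storage ($/GB/month)": (2.20, 0.40),
}
# Servers per admin; ">1000" is taken at its lower bound (an assumption).
SERVERS_PER_ADMIN = (140, 1000)


def scale_ratios():
    """Return how many times cheaper a very large DC is, per resource."""
    ratios = {name: medium / large for name, (medium, large) in COSTS.items()}
    # For administration, more servers per admin is better, so invert.
    medium, large = SERVERS_PER_ADMIN
    ratios["administration (servers/admin)"] = large / medium
    return ratios


if __name__ == "__main__":
    for name, ratio in scale_ratios().items():
        print(f"{name}: {ratio:.1f}x")
```

The three ratios land between roughly 5.5x and 7.3x, which is where the "5-7x" range comes from.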

Considering only these 2 points, it is easy to understand why there is a high probability that customers will want to use the cloud to run their services, even steady ones. Namely, because of capacity planning and economies of scale.

Of course, when you want to adopt the cloud you have to consider other characteristics, such as network bandwidth & storage bottlenecks, or business ones like data confidentiality & auditability, security & vendor lock-in.

So when thinking about cloud solutions, we should always keep the following in mind:
• Optimization, Automation, Dynamicity, Resource Synergy, Transparency.

The "Pay as You Go" model requires that you bill your customers only for what they are actually using. In other words, in theory, if the CPU of one server is idle, the customer is not using this server and should not be billed for it.
The "Pay as You Grow" model tends toward instantaneous elasticity. If a service grows quickly, the infrastructure grows to meet the peak. If a service's demand shrinks, the infrastructure adapts and shrinks too. Of course, the customer's costs will also grow and shrink, but the idea is for the infrastructure to follow the demand.

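As a rough illustration of the difference between static provisioning and pay as you go, here is a minimal Python sketch. It follows the "buy twice the peak" strategy described earlier; the 24-hour demand profile, the price per server-hour and all function names are made up for the example, and it assumes (unrealistically) the same unit price in both models.

```python
import math


def static_capacity(demand, headroom=2.0):
    """Servers bought up front: peak demand times a safety factor."""
    return math.ceil(max(demand) * headroom)


def static_cost(demand, price_per_server_hour, headroom=2.0):
    """Pay for the whole fleet every hour, whether it is used or not."""
    fleet = static_capacity(demand, headroom)
    return fleet * len(demand) * price_per_server_hour


def elastic_cost(demand, price_per_server_hour):
    """Pay only for the servers actually needed in each hour."""
    return sum(demand) * price_per_server_hour


def utilization(demand, headroom=2.0):
    """Average fraction of the static fleet that is actually in use."""
    return sum(demand) / (static_capacity(demand, headroom) * len(demand))


if __name__ == "__main__":
    # Hypothetical demand: servers needed in each hour of one day.
    demand = [10] * 8 + [20] * 8 + [100] * 2 + [20] * 6
    price = 0.10  # hypothetical price per server-hour
    print(f"static fleet:  {static_capacity(demand)} servers")
    print(f"utilization:   {utilization(demand):.1%}")
    print(f"static cost:   {static_cost(demand, price):.2f}")
    print(f"elastic cost:  {elastic_cost(demand, price):.2f}")
```

With this made-up profile the static fleet sits at about 12% average utilization, in line with the 10-20% mentioned above, while the elastic bill simply follows the demand curve.
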
Lowering Costs:
• Any solution that reduces power/cooling costs is welcome.
• Strong emphasis on automating administration, in order to shift people costs from a top cost to nearly irrelevant. Remember: 1 admin per 100,000 servers!
  There is no good network automation to date: OpenFlow is one option!
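
To see what these admin ratios mean in headcount, here is a tiny Python sketch. The ratios are the ones quoted in this post; the fleet size of 100,000 servers and the function name are hypothetical, picked only to make the comparison easy to read.

```python
import math

# Servers-per-admin ratios quoted in this post.
RATIOS = {
    "medium DC (Hamilton)": 140,
    "very large DC (Hamilton, lower bound)": 1000,
    "Google (reported)": 10_000,
    "Google (goal)": 100_000,
}


def admins_needed(servers, servers_per_admin):
    """Smallest whole number of admins able to cover the fleet."""
    return math.ceil(servers / servers_per_admin)


if __name__ == "__main__":
    fleet = 100_000  # hypothetical fleet size
    for label, ratio in RATIOS.items():
        print(f"{label}: {admins_needed(fleet, ratio)} admins")
```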
Last but not least, we should see the "data center as a computer". This simple sentence has deep implications.
Customers should not even know that there are several racks, several CPUs, several layers of caches, etc.
In an ideal world, a customer should be able to buy some resources from a cloud provider, install their application/service and run it without thinking about provisioning and capacity planning. That's the first step of Cloud Computing.

I would like to see more! Indeed, customers should not have to worry about which cloud platform they are running on (vendor lock-in). Standard APIs (similar to POSIX) should exist in order to guarantee interoperability between clouds, so that migrating from one cloud to another is just a matter of moving your data. Only then will we get to the Cloud OS.

[1] "Cloud Computing Economies of Scale", James Hamilton
[2] "Cloud Computing: Understanding Economies of Scale", Cloudscaling Blog
[3] "Above the Cloud", Dave Patterson
