Sunday, September 9, 2007

The eBay way to keep infrastructure architecture nimble

From an infrastructure architecture perspective, eBay has come a long way: from a system that didn't have any database to the latest Web 2.0 platform that supports millions of concurrent listings. The eBay way, an interview with eBay's V.P. of systems and architecture, James Barrese, describes this journey well. I liked the summary of the post:

"Innovating for a community of our size and maintaining the reliability that's expected is challenging, to say the least. Our business and IT leaders understand that to build a platform strategy, we must continue to create more infrastructure, and separate the infrastructure from our applications so we can remain nimble as a business. Despite the complexity, it's critical that IT is transparent to our internal business customers and that we don't burden our business units or our 233 million registered users with worries about availability, reliability, scalability, and security. That has to be woven into our day-to-day process. And it's what the millions of customers who make their living on eBay every day are counting on us to do."

eBay's strategy of identifying the pain points early on, solving those problems first, and keeping the infrastructure nimble enough to adapt to growth has paid off. eBay focused on an automated process for rolling out the weekly builds into its production system and for tracking down the code change that could have destabilized a certain set of features. The most difficult aspect of sustaining engineering is isolating the change that is causing an error; fixing the error once the root cause is known is relatively easy most of the time. eBay also embraces the fact that if you want to roll out changes quickly, limited QA efforts, automated or otherwise, are not going to guarantee that there won't be any errors. Anticipating errors and having a quick plan to fix them is a smart strategy.
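To make the "isolate the change" point concrete, here is a minimal sketch of one common way to do it: a binary search over the ordered list of changes in a build. The interview does not describe eBay's actual tooling, so this is not their method. The names `first_bad_change` and `is_healthy` are hypothetical, and the sketch assumes that a build with no changes is healthy and that once a change breaks the build, every later build stays broken.

```python
from typing import Callable, Optional, Sequence


def first_bad_change(
    changes: Sequence[str],
    is_healthy: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Return the earliest change whose inclusion makes the build unhealthy.

    `changes` is the ordered list of changes in a build. `is_healthy` is a
    hypothetical check (for example, a smoke test) run against a build that
    contains only the given prefix of changes.
    """
    if not changes or is_healthy(changes):
        return None  # nothing to isolate: the full build passes

    # Invariant: the prefix of length `lo` is healthy,
    # and the prefix of length `hi` is unhealthy.
    lo, hi = 0, len(changes)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_healthy(changes[:mid]):
            lo = mid
        else:
            hi = mid
    return changes[hi - 1]


# Usage: pretend change "c4" is the one that destabilizes the build.
weekly_build = ["c1", "c2", "c3", "c4", "c5"]
culprit = first_bad_change(weekly_build, lambda applied: "c4" not in applied)
print(culprit)  # c4
```

With this approach the number of test builds grows with the logarithm of the number of changes, which is what makes isolating a bad change in a large weekly build practical.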

If you read the post closely you will observe that all the efforts seem to be related to infrastructure architecture: high availability, change management, security, third-party APIs, concurrency, etc. eBay did not get distracted by the Web 2.0 bandwagon early on and instead focused on a platform strategy to support its core business. This is a lesson that many organizations could probably learn: be nimble, do what your business needs, and don't get distracted by disruptive changes; instead, embrace them slowly. Users will forgive you if your web site doesn't have rounded corners and does not use AJAX, but they won't forgive you if they could not raise their bid and lost the auction because the web site was slow or was not available.

One of the challenges eBay faced was the lack of good industry practices for similar requirements: eBay was unique in the way it grew exponentially, and it had to keep changing its infrastructure based on what it thought was the right way to do it. eBay is still working on a grid infrastructure that could standardize some of its infrastructure and service delivery platform architecture. This would certainly alleviate some of the pain caused by its proprietary infrastructure and could potentially become the de facto best practice for the entire industry to achieve the best on-demand user experience.

eBay kept it simple - a small list of trusted suppliers, infrastructure that can grow with users, and a good set of third-party APIs and services to complete the ecosystem and empower users to get the maximum juice out of the platform. That's the eBay way!
