
eBay Scalability and Architecture

The document discusses eBay's architecture and strategies for maintaining scalability and agility. It describes eBay's large scale, including billions of daily interactions. It also outlines eBay's transition to more automated, cloud-based infrastructure and a next generation service-oriented platform. This is intended to improve development productivity while allowing faster innovation and release times.


eBay Architecture
Scalability with Agility

Tony  Ng  
Director,  Systems  Architecture  
October  2011  
About  Me  

•  eBay – Systems Architecture and Engineering
•  Yahoo! – Social, Developer Platforms, YQL
•  Sun Microsystems – J2EE, GlassFish, JSRs
•  Author of books on J2EE, SOA

eBay  Stats  

•  97 million active users
•  $62B Gross Merchandise Volume in 2010
•  200 million items for sale in 50,000 categories
•  A cell phone is sold every 5 seconds in the US
•  An iPad is sold every 2.2 minutes in the US
•  A pair of shoes is sold every 9 seconds in the US
•  A passenger vehicle is sold every 2 minutes
•  A motorcycle is sold every 6 minutes
http://www.ebayinc.com/factsheets

eBay  Scale  
•  9 petabytes of data storage
•  10,000 application servers
•  44 million lines of code
•  2 billion pictures
•  A typical day
–  75B database calls
–  4B page views
–  250B search queries
–  Billions of service calls
–  Hundreds of millions of internal asynchronous events

History of Technology
[Figure: timeline of eBay's technology stack with milestones at 1995, 1999, 2001, 2005 and 2009+, plotted against architecture maturity, innovation potential, and agility / time to market (TTM)]
Stage labels on the chart, from earliest to latest:
•  Perl/C++, inline HTML, monolithic, vertical scale, walled garden
•  Some APIs
•  Java, XSL, layered, horizontal scale
•  Java, V4 components, services, internal cloud, platform

eBay  Scalable  Architecture  

•  Partition everything
–  Databases, application tier, search engine

•  Stateless preference
–  No session state in app tier

•  Asynchronous processing
–  Event streams, batch

•  Manage failures
–  Central application logging
–  Mark downs

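The partitioning and mark-down principles above can be sketched in a few lines. This is only an illustration, not eBay's implementation: the `PartitionRouter` class, the partition names, and the reading of "mark down" as "flag a failing resource so callers stop using it" are all assumptions made for the example.

```python
import hashlib


class PartitionRouter:
    """Routes keys to partitions and skips partitions that are marked down.

    Hypothetical helper for illustration; names are not from the slides.
    """

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.marked_down = set()

    def mark_down(self, partition):
        # Flag a failing partition so callers stop sending it traffic.
        self.marked_down.add(partition)

    def mark_up(self, partition):
        # Return a recovered partition to service.
        self.marked_down.discard(partition)

    def route(self, key):
        # A stable hash means every stateless app server picks the same
        # partition for a key without sharing any session state.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.partitions)
        partition = self.partitions[index]
        if partition in self.marked_down:
            return None  # caller degrades the feature instead of failing
        return partition
```

Because the router holds no per-user state, any app server can answer any request, which is what makes the "stateless preference" compatible with partitioned data.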
Next  Challenges  

•  To stay competitive, we need to deliver quality features and innovations at an accelerating pace
•  Complexity increases as our codebase grows
•  Improve developer productivity
•  Enable faster time-to-market while maintaining site stability

Scalability  with  Agility  

•  Strategy 1: Automation with Cloud
•  Strategy 2: Next Gen Service Orientation
•  Strategy 3: Modularity

Automation with Cloud

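Hardware Acquisition
[Figure: hardware acquisition workflow, before and after]
•  Before: request {servers, model, app} → order → receive & deliver → rack & wire → label (app). Durations shown: "several" weeks, 2-3 weeks, 1 week. Servers are later repurposed.
•  After: quarterly request {servers, model} → order → receive pre-racked, pre-wired servers → deliver to cache. Durations shown: 2-3 weeks, 1 day. A later request {servers, model, app} is delivered in minutes. Servers are later repurposed.
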
Improving Utilization
[Chart: number of servers required based on utilization for 8 pools; labels include DR]

Infrastructure Virtualization
[Figure: applications (App) run on a global resource pool that includes spare capacity, on top of shared infrastructure (Infra)]
eBay Cloud
[Figure: eBay Cloud stack: self service portal, automation, capacity management, resource allocation, virtualization, spare capacity pool, hardware acquisition]
•  Provisioning in minutes
•  Improved time to market

Architecture  Decision  

Infrastructure & Platform as a service
[Figure: two-layer stack]
•  Platform As A Service: Automated Life Cycle Management for Front End, Search Back End, Generic Platform
–  Full application level automation
–  Higher developer productivity
–  Enables innovation on new platforms
•  Infrastructure As A Service: Automated Operations on Virtualized & Common Infrastructure
–  Infrastructure level automation

Model Driven Deployment Automation
•  Desired configuration is specified in the expected state and persisted in CMS
•  Upon approval, the orchestration configures the site to reflect the desired configuration
•  Updated site configuration is discovered based on detection of configuration events
•  Reconciliation between the expected and current state verifies that the configuration is correct
•  Ongoing validation allows the detection of out-of-band changes
[Figure: LB pools of servers on the site; orchestration applies the expected state, discovery reports the current state, and reconciliation compares the two]

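The expected-state versus current-state loop above can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionaries stand in for the CMS and for discovery results, and the function names `reconcile` and `orchestrate` are hypothetical, not part of any eBay tool.

```python
def reconcile(expected, current):
    """Compare expected and current state; return the differences per server."""
    drift = {}
    for server in sorted(set(expected) | set(current)):
        want = expected.get(server)
        have = current.get(server)
        if want != have:
            drift[server] = {"expected": want, "current": have}
    return drift


def orchestrate(expected, current):
    """Change the (simulated) site so it matches the expected state.

    Returns the drift that was corrected.
    """
    changes = reconcile(expected, current)
    for server, diff in changes.items():
        if diff["expected"] is None:
            del current[server]  # server is not part of the desired state
        else:
            current[server] = dict(diff["expected"])
    return changes
```

Running `reconcile` again after `orchestrate` should return an empty result; any non-empty result later on indicates an out-of-band change made directly on the site.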
Open Source Integration
[Figure: cloud architecture built around an open source solution (OpenStack / CloudStack)]
Components labelled in the figure:
•  IaaS/PaaS API
•  Distributed orchestration, resource allocation, state
•  Application controller, access point controller, AuthN/AuthZ
•  Compute cluster pools with controllers
•  Compute Mgt., DNS Mgt., LB Mgt., monitoring
•  Network provisioning, image/package repository, software distribution
Application Architecture
[Figure: application architecture evolution: before, ongoing ("cloud friendly"), future ("cloud ready")]

Next Gen Service Orientation

Services  @  eBay  
•  It's a journey!
•  History
•  One of the first to expose APIs/services
•  In early 2007, embarked on service orienting our entire ecommerce platform, whether the functionality is internal or external
•  Support REST style as well as SOA style
•  Have close to 300 services now and more on the way
•  Early adopter of SOA governance automation (discovery vs. control)

Architecture Vision
•  Customer Experience: Core Experience, Custom Experiences, Channels
•  Application Platform Services: Login, Identity, Catalog, Search, List, Pricing, Offer, Ads, Messages, Cart, Coupons, Payment, Shipping, CS
•  Technology Platform: App Stack, Data Access Layer, Dev Tools, Presentation, Messaging, SOA, Cloud
•  Operations Infrastructure Layer: Power, Data Center, Hardware, Network, Database, Tools, Operations

Challenges  

•  Multiple data formats
•  Latency
•  Service consumer productivity

Challenge 1: Multiple Data Formats
•  Mix of user preferences
–  SOAP
–  XML / HTTP
–  JSON
–  Name-Value Pair (NV)
•  Service developers don't want to write extra code to do conversions; too much maintenance impact
•  Key observations:
–  Users ask for whatever data format they want.
–  Anything you can express in XML, you can express in other formats
–  Complete mapping from XML structures to NV and JSON

Solution: Pluggable Data Formats Using JAXB
[Figure: SOA framework pipeline with a pluggable Ser/Deser module]
•  Uniform interface, pluggable formats
•  XML, JSON, NV and other formats are directly deserialized into JAXB Java objects by the Ser/Deser module in the pipeline
•  The Java objects are passed to a single instance of the service implementation (Service Impl)
•  No intermediate format, which avoids extra conversion
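The pluggable-format idea can be sketched outside of Java as well. The example below is a Python analogue, not the JAXB-based implementation the slide describes: each wire format gets a small deserializer that produces the same in-memory request, and one service implementation handles all of them. The names `DESERIALIZERS`, `find_items`, and `handle` are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl


def from_json(payload):
    return json.loads(payload)


def from_nv(payload):
    # Name-value pairs, e.g. "keywords=ipad&limit=5"
    return dict(parse_qsl(payload))


def from_xml(payload):
    # Flat XML only: one child element per field.
    root = ET.fromstring(payload)
    return {child.tag: child.text for child in root}


# Pluggable formats: adding a format means adding one entry here.
DESERIALIZERS = {"JSON": from_json, "NV": from_nv, "XML": from_xml}


def find_items(request):
    """Single service implementation, unaware of the wire format."""
    return {"keywords": request["keywords"], "limit": int(request.get("limit", 10))}


def handle(data_format, payload):
    # Deserialize directly into the in-memory request, with no
    # intermediate format, then call the one service implementation.
    request = DESERIALIZERS[data_format](payload)
    return find_items(request)
```

For instance, `handle("NV", "keywords=ipad&limit=5")` and the equivalent JSON or XML payload reach `find_items` as the same dictionary, so the service code contains no format-specific branches.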
Challenge 2: Latency
•  For large datasets, there can be nasty latencies.
–  Not fixed by compressing or using Fast Infoset
[Chart: wire time (msec) for a 2MB structured response payload; y-axis from 0 to 350]

Solution: Binary Formats
•  Evaluated binary formats:
•  Google Protocol Buffers, Avro, Thrift
•  Numbers look promising (serialization, deserialization)
•  New challenges with these:
•  Each has its own schema (type definition language) to model types and messages
•  Each has its own code generation for language bindings
•  NOT directly compatible with JAXB beans
•  eBay SOA platform uses WSDL/XML Schema (XSD) data modeling, and JAXB language bindings

Compare Popular Binary Formats
Protobuf
•  Own IDL/schema
•  Sequence numbers for each element
•  Compact binary representation on the wire
•  Most XML schema elements are mappable to equivalents, except polymorphic constructs
•  Versioning is similar to XML, a bit more complex in implementing due to sequence numbers
Avro
•  JSON based schema
•  Schema prepended to the message on the wire
•  Compact binary representation on the wire
•  Most XML schema elements are mappable to equivalents, except polymorphic constructs
•  Versioning is easier
Thrift
•  Own IDL/schema
•  Sequence numbers for each element
•  Compact binary representation on the wire
•  Most XML schema elements are mappable to equivalents, except polymorphic constructs
•  Versioning is similar to XML, a bit more complex in implementing due to sequence numbers

           Complex  Unions         Self-References        Enums  Inheritance /  Inline
           Types    (Choice Type)  (Trees)                       Polymorphism   Attachment
Protobuf   Yes      No             Yes                    Yes    No             No
Avro       Yes      Yes            Yes (with workaround)  Yes    No             No
Thrift     Yes      No             No                     No     No             No
XML        Yes      Yes            Yes                    Yes    Yes            Yes (MIME-TYPE)

Comparison of Data Formats
[Chart: size (KB) and wire time (msec) for JSON, XML, Fast Infoset and Protobuf; response data: 50 items x 75 fields (about 8000 objects); y-axis from 0 to 200]

Latency Improvements
[Chart: wire time (msec) for XML, XML no poly, XML flat, PB no poly and PB flat; y-axis from 0 to 200]

Challenge 3: Service Consumer Productivity
•  Large, complex requests and responses
•  Consumers want to get exactly the data they need from services
•  Lack of consistency in service interface conventions and data access patterns
•  Real client applications make calls to multiple services at a time
–  Serial calls increase latency. Managing parallel calls is complex
•  Impedance mismatch between service interface and client needs
–  Too much data is returned
–  1 + n calls to get detailed data

Sneak Preview: ql.io
•  New technology from eBay
•  Plan to open source soon
•  SQL + JSON based scripting language for aggregation and orchestration of service calls
•  Filtering and projections of responses
•  Async orchestration engine
–  Automatic parallelization, fork / join

What  ql.io  Enables  
•  Create consumer-controlled interfaces
-  fix/patch APIs on the fly
•  Filter and project responses
-  use a declarative language
•  Bring in consistency
-  offer RESTful shims with simpler syntax
•  Aggregate multiple APIs
-  such as batching
•  Orchestrate requests
-  without worrying about async forks and joins

ql.io  Examples  

•  Simple Select
–  select * from ebay.finding.items where keywords = 'ipad'

•  Field Projections
–  select title, itemId from ebay.finding.items where keywords = 'ipad'

•  Sub-Select
–  select e.Title, e.ItemID from ebay.item.details as e where e.itemId in (select itemId from ebay.finding.items where keywords = 'ipad')

ql.io  Batch  Example  
itemId = select itemId from ebay.finding.items where keywords = 'ferrari' limit 1;
item = select * from ebay.shopping.singleitem where itemId = '{itemId}';
user = select * from ebay.shopping.userprofile where userId = 'sallamar';
tradingItem = select * from ebay.trading.getitem where itemId = '{itemId}';
bestOffers = select * from ebay.trading.bestoffers where itemId = '{itemId}';
bidders = select * from ebay.trading.getallbidders where itemId = '{itemId}';
return {
     "user" : "{user}",
     "item" : "{item}",
     "tradingItem" : "{tradingItem}",
     "bidders" : "{bidders}",
     "bestOffers" : "{bestOffers}"
};
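To show what the orchestration engine saves the consumer from writing, here is a hand-written fork/join version of a reduced form of the batch script, in Python. It is only a sketch: the four service functions are hypothetical local stand-ins for remote calls, and nothing here is ql.io's actual engine.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for remote service calls.
def find_item_id(keywords):
    return "item-" + keywords


def get_item(item_id):
    return {"itemId": item_id, "title": "sample"}


def get_user(user_id):
    return {"userId": user_id}


def get_bidders(item_id):
    return [{"itemId": item_id, "bidder": "b1"}]


def batch(keywords, user_id):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Fork: the user lookup does not depend on itemId, so start it now.
        user_future = pool.submit(get_user, user_id)
        # The remaining calls depend on itemId, so resolve it first.
        item_id = find_item_id(keywords)
        item_future = pool.submit(get_item, item_id)
        bidders_future = pool.submit(get_bidders, item_id)
        # Join: wait for every branch and assemble one response.
        return {
            "user": user_future.result(),
            "item": item_future.result(),
            "bidders": bidders_future.result(),
        }
```

The dependency analysis is done by hand here (which calls need `itemId`, which do not); in the script above those dependencies are visible from the `{itemId}` references alone, which is what lets an engine parallelize automatically.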
ql.io  Demo  

Modularity  

Key modularity concepts for software

•  Building blocks
•  Re-use
•  Granularity
•  Dependencies
•  Encapsulation
•  Composition
•  Versioning
Source: http://techdistrict.kirkk.com/2010/04/22/granularity-architectures-nemesis/
Author: Kirk Knoernschild

Challenges  for  Large  Enterprises  

•  Some stats on the eBay code base
–  ~44 million lines of code and growing
–  Hundreds of thousands of classes
–  Tens of thousands of packages
–  ~4,000+ jars

•  We have too many dependencies and tight coupling in our code
–  Everyone sees everyone else
–  Everyone affects everyone else

Challenges  for  Large  Enterprises  

•  Developer productivity/agility suffers as knowledge of the code base goes down
–  Changes ripple throughout the system
–  Fallouts from changes/features are difficult to resolve
–  Developers slow down and become risk averse
[Chart: knowledge and complexity plotted against code size]

Our  Goals  with  Modularity  Efforts  
•  Tame complexity
•  Organize our code base in a loosely coupled fashion
–  Coarse-grained modules: number matters!
–  Declarative coupling contract
–  Ability to hide internals
•  Establish clear code ownership, boundaries and dependencies
•  Allow different components (and teams) to evolve at different speeds
•  Increase development agility

Modularity Solutions Evaluation

•  Evaluated OSGi, Maven, Jigsaw and JBoss Modules
•  Criteria include:
–  Modularity enforcement
–  End-to-end development
–  Migration concerns
–  Adoption
–  Maturity

•  Selected OSGi

OSGi  @  eBay  
•  Modularize platform into OSGi bundles with well-defined imports and exports
•  Challenges: split packages, Classloader constructs
•  Source to binary dependencies
•  Refresh end-to-end development life cycle
[Figure: development life cycle: the IDE and the command line build (CI) pull from and push to SCM, and publish to and consume from the repository; packaging and deployment consume from the repository and deploy to the server runtime]

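The "well-defined imports and exports" contract, and the split-package challenge, can be illustrated with a toy model. This is a deliberately simplified sketch in Python and does not use or mirror the OSGi API: the `Bundle` class, the helper functions, and the package names are all hypothetical.

```python
class Bundle:
    """Toy module descriptor: a name plus declared exports and imports."""

    def __init__(self, name, exports=(), imports=()):
        self.name = name
        self.exports = set(exports)
        self.imports = set(imports)


def unresolved_imports(bundles):
    """Return {bundle name: imported packages that no other bundle exports}."""
    problems = {}
    for bundle in bundles:
        available = set()
        for other in bundles:
            if other is not bundle:
                available |= other.exports
        missing = bundle.imports - available
        if missing:
            problems[bundle.name] = sorted(missing)
    return problems


def split_packages(bundles):
    """Return packages exported by more than one bundle."""
    owners = {}
    for bundle in bundles:
        for package in bundle.exports:
            owners.setdefault(package, []).append(bundle.name)
    return {pkg: sorted(names) for pkg, names in owners.items() if len(names) > 1}
```

A package that a bundle does not export is invisible to everyone else, so an import of it shows up in `unresolved_imports`. That is the sense in which a declarative contract replaces "everyone sees everyone else".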
Lessons  Learned  

•  OSGi learning curve is still fairly steep
–  Large group of developers with varying skill levels
•  End-to-end development lifecycle
–  Tools may not work well together. Leverage OSGi tools like bnd
•  Conversion/migration of existing code base
–  Not starting from a vacuum
–  Cost to rewrite / refactor code
–  We cannot afford disruption to business meanwhile: "change parts while the car is running"
•  Semantic versioning adoption is important

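As a rough illustration of why semantic versioning matters for modules that evolve at different speeds, the check below treats a provided version as compatible when it has the same major version as the required one and is not older. The function names are hypothetical and the rule is a simplified reading of semantic versioning, not a description of eBay's tooling.

```python
def parse_version(text):
    """Parse 'major.minor.patch' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def is_compatible(required, provided):
    """True if `provided` can satisfy a consumer built against `required`."""
    req = parse_version(required)
    prov = parse_version(provided)
    # Same major version means no breaking change; tuple comparison
    # ensures the provider is at least as new as what was required.
    return prov[0] == req[0] and prov >= req
```

For a consumer built against 1.2.0 this accepts 1.2.5 and 1.3.0 but rejects 1.1.9 and 2.0.0, which is the same set that a version range such as [1.2.0, 2.0.0) would express.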

 
Overall  Summary  

•  Strategies
–  Deployment Agility: Automation with Cloud
–  Development Agility: Next gen Service Orientation
–  Taming complexity: Modularity

•  Systems quality & scalable architecture as key foundation
•  Complexity management and developer productivity become increasingly important
•  Strike balance between agility and stability

eBay Open Source

•  eBay has been a strong supporter of the Open Source model and community
•  Check out http://eBayOpenSource.org
•  Mission is to open source some of the best of breed technologies that were originally developed within eBay Inc.
•  Under a liberal open source license.
•  These projects are generic technology projects, and several years of development effort have gone into maturing them.

(Coming soon)

