Design for Scalability
Common Techniques
Server Farm (Real-Time Access)
- If there is a large number of independent (potentially concurrent) requests, you can use a server farm, which is basically a set of identically configured machines fronted by a load balancer.
- The application itself needs to be stateless so that requests can be dispatched purely based on load conditions and not on other factors (see the sketch after this list).
- This strategy is even more effective when combined with cloud computing, since adding more VM instances to the farm is just an API call.
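A minimal sketch of the dispatch side, assuming a hypothetical list of server addresses; because the application is stateless, any machine can serve any request and rotation alone is enough:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round-robin load balancer sketch over a farm of stateless servers.
public class RoundRobinBalancer {
    private final List<String> servers;            // host:port of each farm machine
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    // Pick the next server purely by rotation; no session affinity is
    // needed because the servers hold no per-request state.
    public String pick() {
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```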
Data Partitioning
- Spread your data across multiple databases so that the data-access workload can be distributed across multiple servers.
- By nature, data is stateful, so there must be a deterministic mechanism to dispatch each data request to the server that hosts the data (see the sketch after this list).
- The data-partitioning mechanism also needs to take the data access pattern into consideration. Data that needs to be accessed together should stay on the same server. A more sophisticated approach can migrate data continuously as the access pattern shifts.
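A minimal sketch of deterministic dispatch, assuming a hypothetical list of shard URLs:

```java
import java.util.List;

// Hash-based partitioning sketch: a deterministic mapping from a record
// key to the database server that hosts it.
public class Partitioner {
    private final List<String> shards;   // e.g. JDBC URLs, one per database server

    public Partitioner(List<String> shards) {
        this.shards = shards;
    }

    // The same key always maps to the same shard, so any node can route
    // a data request without extra lookup state.
    public String shardFor(String key) {
        return shards.get(Math.floorMod(key.hashCode(), shards.size()));
    }
}
```

A simple modulo scheme like this reshuffles most keys when a server is added or removed; consistent hashing is the usual way to reduce that migration cost.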
Map/Reduce (Batch Parallel Processing)
- The algorithm itself needs to be parallelizable. This usually means the steps of execution should be relatively independent of each other.
- Google's Map/Reduce is a good framework for this model; Hadoop is an open-source Java implementation. The sketch after this list shows the model itself in plain Java.
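A word-count sketch of the model using plain Java streams (not the Hadoop API): the map step emits results per line independently, and the reduce step merges them per key, which is exactly why both phases parallelize:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Word count in the map/reduce style: map emits (word, 1) pairs,
// reduce sums the counts per word.
public class WordCount {
    public static Map<String, Long> count(List<String> lines) {
        return lines.parallelStream()                                // map phase runs per line
                .flatMap(line -> Arrays.stream(line.split("\\s+")))  // emit one entry per word
                .collect(Collectors.groupingBy(w -> w,               // reduce phase groups by word
                        Collectors.counting()));                     // and sums the counts
    }

    public static void main(String[] args) {
        // counts: be=2, to=2, or=1, not=1 (map iteration order unspecified)
        System.out.println(count(List.of("to be or not to be")));
    }
}
```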
Content Delivery Network (Static Cache)
- This is common for static media content. The idea is to create many copies of the content, distributed geographically across servers.
- Each user request is routed to the server replica in closest proximity (sketched below).
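A minimal routing sketch, assuming latency measurements per replica are available; real CDNs typically make this decision through DNS or anycast rather than application code:

```java
import java.util.Map;

// Pick the replica "closest" to the user, here approximated by the
// smallest measured network latency in milliseconds.
public class NearestReplica {
    public static String route(Map<String, Double> latencyMsByReplica) {
        return latencyMsByReplica.entrySet().stream()
                .min(Map.Entry.comparingByValue())   // lowest latency wins
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalStateException("no replicas"));
    }
}
```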
Cache Engine (Dynamic Cache)
- This is typically implemented as a lookup cache: check the cache before doing the expensive computation or data access, as in the sketch after this list.
- Memcached and EHCache are two popular caching packages.
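A minimal lookup-cache sketch illustrating the idea (not the Memcached or EHCache APIs):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Lookup cache: compute and store a value only on a cache miss.
public class LookupCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // expensive computation or DB lookup

    public LookupCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // computeIfAbsent runs the loader at most once per missing key,
        // even under concurrent access.
        return cache.computeIfAbsent(key, loader);
    }
}
```

A production cache also needs eviction and expiry policies, which the packages above provide.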
Resource Pool
- DB sessions and TCP connections are expensive to create, so reuse them across multiple requests (see the pool sketch after this list).
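A minimal pool sketch, generic over the resource type; production code would normally use an existing connection-pool library rather than rolling its own:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Supplier;

// Resource pool: expensive objects are created once up front and
// recycled across requests instead of being re-created per request.
public class Pool<T> {
    private final BlockingQueue<T> idle = new LinkedBlockingQueue<>();

    public Pool(Supplier<T> factory, int size) {
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());     // pay the creation cost once
        }
    }

    public T acquire() throws InterruptedException {
        return idle.take();              // blocks until a resource is free
    }

    public void release(T resource) {
        idle.add(resource);              // return for reuse by the next request
    }
}
```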
Asynchronous Processing
- A long-running service call is better handled using an asynchronous processing model. This is typically done in one of two ways: callback and polling (both sketched after this list).
- In callback mode, the caller needs to provide a response handler when making the call. Some kind of coordination may be required between the calling thread and the callback thread.
- In polling mode, the call itself returns a "future" handle immediately. The caller can go off and do other things, and later poll the "future" handle to see if the response is ready. In this model the caller creates no extra callback thread, so no extra thread coordination is needed.
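Both styles can be sketched with Java's CompletableFuture; callService here is a hypothetical stand-in for the remote call. In both cases a pool thread runs the call itself; the difference is how the caller learns about the result:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncStyles {
    // Hypothetical stand-in for an expensive remote service call;
    // supplyAsync runs it on a background pool thread.
    static CompletableFuture<String> callService() {
        return CompletableFuture.supplyAsync(() -> "response");
    }

    public static void main(String[] args) throws Exception {
        // Callback mode: register a response handler up front. It runs on
        // the callback thread, so shared state needs coordination.
        CompletableFuture<Void> done = callService()
                .thenAccept(resp -> System.out.println("callback got " + resp));

        // Polling mode: keep the "future" handle, do other work, and
        // check later whether the response is ready.
        CompletableFuture<String> future = callService();
        while (!future.isDone()) {
            Thread.sleep(10);   // ... do other useful work here ...
        }
        System.out.println("poll got " + future.get());

        done.join();            // for the demo, wait for the callback output
    }
}
```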
Implementation Design Considerations
- Use efficient algorithms and data structures. Analyze the time (CPU) and space (memory) complexity of logic that executes frequently (i.e., hot spots). For example, carefully decide whether a hash table or a binary tree should be used for lookup.
- Analyze your concurrent access scenarios where multiple threads access shared data. Carefully analyze the synchronization points and make sure the locking is fine-grained enough. Also watch for any possibility of deadlock and how you would detect or prevent it. Consider using lock-free data structures (e.g., Java's concurrent package provides several; see the sketch after this list).
- Analyze the memory usage patterns in your logic. Determine where new objects are created and where they become eligible for garbage collection. Be aware of creating a lot of short-lived temporary objects, as they put a high load on the garbage collector.
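A minimal sketch of the lock-free structures mentioned above, from java.util.concurrent:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Lock-free structures: both classes use compare-and-swap internally,
// so threads never block on a lock and these operations cannot deadlock.
public class LockFreeExample {
    private static final AtomicLong counter = new AtomicLong();
    private static final Queue<String> queue = new ConcurrentLinkedQueue<>();

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            counter.incrementAndGet();   // atomic, no synchronized block needed
            queue.offer(Thread.currentThread().getName());
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter.get() + " items: " + queue);
    }
}
```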