Introduction
I feel like I've overextended on LeetCode and severely underinvested in system design. Hence, I'll be documenting the most relevant system design concepts I've come across.
What actually happens when you type a URL into the browser? Seems like such a simple question that we all should know the answer to, but what happens under the hood?
1. The browser takes the URL (google.com) and sends it to a DNS server. The DNS server does a lookup and returns the server's IP address.
2. The browser establishes a TCP connection with the server through a three-way handshake (SYN, SYN-ACK, ACK).
3. The browser sends an HTTP request to the server.
4. The server sends an HTTP response back to the browser (HTML, CSS, JavaScript).
5. The browser renders the page using the response!
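The five steps above can be sketched with Python's standard socket module. This is a bare-bones sketch, not what a real browser does; the host and path are illustrative.

```python
import socket

def build_request(host: str, path: str = "/") -> str:
    """Step 3: a minimal HTTP/1.1 GET request."""
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n\r\n")

def fetch(host: str, path: str = "/", port: int = 80) -> bytes:
    # Step 1: DNS lookup -- resolve the hostname to an IP address.
    ip = socket.gethostbyname(host)
    # Step 2: TCP connection; the SYN / SYN-ACK / ACK handshake
    # happens inside create_connection().
    with socket.create_connection((ip, port), timeout=5) as sock:
        sock.sendall(build_request(host, path).encode())
        # Step 4: read the HTTP response (status line, headers, body).
        response = b""
        while chunk := sock.recv(4096):
            response += chunk
    # Step 5: the browser would now parse and render this response.
    return response

# fetch("google.com")  # performs steps 1-4 against a live host
```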
There are two types of databases: relational and non-relational.
Relational databases (SQL databases) store data in tables and rows, which you can perform JOIN operations on via SQL. Common ones include MySQL, PostgreSQL, and SQLite.
Non-relational databases (NoSQL databases) store data as key-value pairs, column storage, graph storage, or document storage. Common ones include MongoDB, Cassandra, and Redis.
When do I use which?
Non-relational databases are better when your app requires very low latency, your data is unstructured or has no relations, you only need to serialize/deserialize data (JSON, XML, etc.), or you need to store a massive amount of data.
Otherwise use a relational database.
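To make the contrast concrete, here's a minimal sketch using Python's built-in sqlite3 for the relational side and a plain dict standing in for a key-value store like Redis. The table and keys are made up for illustration.

```python
import sqlite3

# Relational: tables, rows, and JOINs (SQLite, from the standard library).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT);
    INSERT INTO users  VALUES (1, 'ada');
    INSERT INTO orders VALUES (10, 1, 'keyboard');
""")
row = db.execute("""
    SELECT users.name, orders.item
    FROM orders JOIN users ON orders.user_id = users.id
""").fetchone()
# row == ('ada', 'keyboard')

# Non-relational (key-value style): one lookup, no JOIN -- related data
# is denormalized into a single value, as you might store it in Redis.
kv_store = {"user:1": {"name": "ada", "orders": ["keyboard"]}}
record = kv_store["user:1"]
```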
There are two main types of scaling, vertical and horizontal.
Vertical scaling ("scale up") means adding more power (CPU, RAM) to your server.
Horizontal scaling ("scale out") means adding more servers to your server pool.
When do I use which?
Vertical scaling is better with low traffic, since it's simple. But there are hard limits to how much power you can add to one machine, and a single server is a single point of failure (SPOF): if it crashes, everything crashes.
Horizontal scaling is better with high traffic, and large scale applications.
Consider your shopping app running on 1 server. On Black Friday, you'll receive so much traffic that it reaches the web server's load limit. Users will start to experience slower responses and eventually the server will crash.
A load balancer is used for this reason, to distribute traffic evenly across multiple servers.
Clients connect to the load balancer via its public IP address. The load balancer then connects to each server via a private IP address.
This allows the load balancer to identify and redirect traffic to the appropriate server.
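A minimal sketch of the idea in Python, assuming a simple round-robin strategy (one of several ways a load balancer can distribute traffic) and made-up private IPs:

```python
from itertools import cycle

class LoadBalancer:
    """Round-robin: hand each request to the next server in the pool."""
    def __init__(self, private_ips):
        self._pool = cycle(private_ips)

    def route(self, request):
        server = next(self._pool)   # pick the next server in rotation
        return server, request      # forward the request to that server

lb = LoadBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
targets = [lb.route(f"req-{i}")[0] for i in range(4)]
# targets == ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.1"]
```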
In the same way that having multiple servers prevents a SPOF, we use database replication (multiple databases) to prevent data loss.
This is modelled via the master-slave relationship: the master database handles only writes, while the slave databases handle only reads and replicate off the master.
Apps typically have more reads than writes, hence # of slaves > # of masters.
Having slave databases also allows for parallel reads, leading to better performance.
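The read/write split can be sketched in Python with dicts standing in for databases. Note the replication here is immediate, whereas a real master replicates asynchronously.

```python
from itertools import cycle

class ReplicatedDB:
    """Writes go to the master; reads are spread across slave copies."""
    def __init__(self, num_slaves=2):
        self.master = {}
        self.slaves = [{} for _ in range(num_slaves)]
        self._next_slave = cycle(self.slaves)

    def write(self, key, value):
        self.master[key] = value
        # A real master replicates asynchronously; here it's immediate.
        for slave in self.slaves:
            slave[key] = value

    def read(self, key):
        # Round-robin across slaves allows parallel reads.
        return next(self._next_slave)[key]

db = ReplicatedDB()
db.write("user:1", "ada")
db.read("user:1")   # served by a slave, not the master
```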
A cache is a temporary storage area that stores the results of expensive responses or frequently accessed data in memory.
When the browser requests a resource, the web server checks if the cache has the resource. If it does, return it instantly. If not, the server will query the database for the response, add it to the cache, and return it.
Something to note is that the database and cache aren't always in sync, e.g. if the database undergoes an UPDATE SQL query. One strategy is setting an expiration policy (TTL) on cache entries.
If the TTL is too short, you'll need to query the database more often. If it's too long, you risk serving stale data.
Sometimes, a cache can become full. If it does, there are a few strategies to evict entries and make space, including LRU (Least Recently Used), LFU (Least Frequently Used), and FIFO (First In First Out).
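Putting the pieces together, here's a toy read-through cache in Python with a TTL and FIFO eviction; slow_db_query is a stand-in for the real database.

```python
import time
from collections import OrderedDict

def slow_db_query(key):
    # Stand-in for an expensive database lookup.
    return f"value-for-{key}"

class Cache:
    """Read-through cache with a TTL and FIFO eviction."""
    def __init__(self, capacity=2, ttl_seconds=60.0):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self._store = OrderedDict()   # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]                  # fresh hit: return instantly
        value = slow_db_query(key)           # miss or expired: hit the DB
        if key not in self._store and len(self._store) >= self.capacity:
            self._store.popitem(last=False)  # FIFO: evict oldest insert
        self._store[key] = (value, time.monotonic() + self.ttl)
        return value

cache = Cache(capacity=2, ttl_seconds=60.0)
cache.get("user:1")   # miss: queried from the DB, then cached
cache.get("user:1")   # hit: served from memory
```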
CDNs (Content Delivery Networks) are geographically dispersed servers used to deliver static content (images, videos, CSS, JavaScript) to users.
They're like a cache, but for static content, served from the location nearest the user.
Stateful servers remember client data from 1 request to the next.
Stateless servers keep no information about the client between requests.
The industry is moving towards stateless servers, as they are more scalable and easier to manage.
Message queues are durable components stored in memory that support asynchronous communication (think of them as a buffer for async requests).
The advantage comes from the decoupling: if the consumer is unavailable, the producer can still publish messages, and vice versa.
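Python's standard queue and threading modules can sketch the buffer idea. This is an in-process toy, not a durable message queue like Kafka or RabbitMQ.

```python
import queue
import threading

# The queue acts as a buffer between producer and consumer: either side
# can run (or stall) independently of the other.
buffer = queue.Queue()
processed = []

def producer():
    for i in range(3):
        buffer.put(f"task-{i}")   # publish and move on; no waiting
    buffer.put(None)              # sentinel: no more work

def consumer():
    while (task := buffer.get()) is not None:
        processed.append(task)    # consume at its own pace

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
# processed == ["task-0", "task-1", "task-2"]
```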
Data volume ladder: 1 KB -> 1 MB -> 1 GB -> 1 TB -> 1 PB (each step is ~1,000x).
L1 cache: 0.5ns
L2 cache: 5ns
Mutex lock: 100ns
Main memory reference: 100ns
Disk seek: 10ms
Takeaways: memory is fast, but disk is slow, so avoid disk seeks where possible.
High availability means the system stays continuously operational for a long period of time.
A service level agreement (SLA) is an agreement between a service provider and a customer, defining the level of uptime the service will deliver.
Consider Twitter: 300 million monthly users, 50% use Twitter daily, users post 2 tweets/day, 10% of tweets contain media, and data is stored for 5 years.
Queries per second (QPS): DAU = 300M x 50% = 150M, so tweet writes ~ 150M x 2 / 86,400 seconds ~ 3,500 QPS (peak is often estimated at ~2x).
Average tweet size: a few hundred bytes of id and text, plus roughly 1 MB when media is attached (an assumed figure).
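The arithmetic can be checked in a few lines of Python. The ~1 MB average media size is an assumption, not given in the prompt.

```python
# Back-of-the-envelope numbers for the Twitter example above.
monthly_users = 300_000_000
dau = int(monthly_users * 0.50)               # 150M daily active users
tweets_per_day = dau * 2                      # 300M tweets / day

qps = tweets_per_day / (24 * 3600)            # ~3,472 writes / second
peak_qps = qps * 2                            # a common rule of thumb

media_tweets_per_day = tweets_per_day * 0.10  # 30M tweets with media
media_mb_per_day = media_tweets_per_day * 1   # ~30 TB/day at ~1 MB each
five_year_tb = media_mb_per_day / 1_000_000 * 365 * 5   # ~55,000 TB ~ 55 PB
```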
Good questions to ask: what are the most important features? Is it a mobile app, a web app, or both? What scale (users, traffic, data) must it support?
E.g. if you are asked to design a news feed system, clarify whether the feed is sorted in reverse chronological order and how many friends a user can have.
Next, draw key components on the whiteboard. Things like the client (web/mobile), APIs, web servers, data stores, cache, CDN, message queue, etc.
Do back-of-the-envelope calculations to evaluate if your blueprint fits the scale constraints. (but this might be a bonus)
Then, walk through the flow of your design.
Maybe also include API endpoints or database schema.
In the news feed example, there are two main components, news feed and user postings.
For the news feed, when a user publishes a post, the corresponding data is written into the cache/database, and the post is populated into their friends' news feeds.
The feed is built by aggregating friends' posts in reverse chronological order.
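A toy fan-out-on-write sketch in Python; the friend graph and post ids are made up, and real systems handle edge cases (like celebrity users with millions of followers) differently.

```python
from collections import defaultdict

# Fan-out on write: when a user posts, push the post id into each
# friend's feed cache so reads are a simple, pre-built list.
friends = {"alice": ["bob", "carol"]}
feeds = defaultdict(list)          # user -> post ids, newest first
posts = {}                         # post id -> content

def publish(user, post_id, content):
    posts[post_id] = content                 # write to the data store
    for friend in friends.get(user, []):     # fan out to followers
        feeds[friend].insert(0, post_id)     # reverse chronological

def news_feed(user):
    return [posts[pid] for pid in feeds[user]]

publish("alice", 1, "first!")
publish("alice", 2, "second!")
# news_feed("bob") == ["second!", "first!"]
```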