這次的 downtime 主要是發生在 Group Direct Message (GDM) 的部份:
A significant element of the datastore load appeared to be from a query that listed Group Direct Message (GDM) conversations by user. This operation is fronted by our cache tier, so the high query load seemed to indicate something was wrong with our caches.
這個 GDM 的查訊效率不高,而是靠 cache layer 撐住的,加上二月 22 日那天他們在更新 Consul 的 agent,導致 hit rate 的下降,以及遇到一個比較大的 peak time,接著就壓垮了資料庫。
Our underlying hardware (AWS) is nothing like this reliable. We see regular (several times a year) failure of racks of machines or whole DCs.
Across the whole fleet (all services), we lose 1-10 servers per day as a baseline. Major events are then on top of that and can impact thousand of hosts at once.
另外一個是反駁自以為的量級估算:
> Even the largest Slack instance probably has under 100,000 users and less than 1000 peak messages per second.
As I hear stories about teams using a microservices architecture, I've noticed a common pattern.
Almost all the successful microservice stories have started with a monolith that got too big and was broken up
Almost all the cases where I've heard of a system that was built as a microservice system from scratch, it has ended up in serious trouble.
This pattern has led many of my colleagues to argue that you shouldn't start a new project with microservices, even if you're sure your application will be big enough to make it worthwhile. .
You may be wondering about the drastic ASP.Net reduction in processing time compared to 2013 (which was 757 hours) despite 61 million more requests a day. That’s due to both a hardware upgrade in early 2015 as well as a lot of performance tuning inside the applications themselves.
We use websockets to push real-time updates to users such as notifications in the top bar, vote counts, new nav counts, new answers and comments, and a few other bits.
另外他們也發現有些瀏覽器連線已經連 18 個月了 (喂喂),也許應該去看一下人是不是還活著:
Fun fact: some of those browsers have been open for over 18 months. We’re not sure why. Someone should go check if those developers are still alive.
Years ago I pursued my interest in InnoDB’s architecture and design, and became impressed with its sophistication. Another way to say it is that InnoDB is complicated, as are all MVCC databases. However, InnoDB manages to hide the bulk of its complexity entirely from most users.
I decided to at least outline a book on InnoDB. After researching it for a while, it became clear that it would need to be a series of books in multiple volumes, with somewhere between 1000 and 2000 pages total.
作者還沒開始寫,不過 outline 已經先列出來了:
I did not begin writing. Although it is incomplete, outdated, and in some cases wrong, I share the outline here in case anyone is interested. It might be of particular interest to someone who thinks it’s an easy task to write a new database.