Uber's Golang GC Tuning

I saw "How We Saved 70K Cores Across 30 Mission-Critical Services (Large-Scale, Semi-Automated Go GC Tuning @Uber)" on Hacker News. It covers how people at Uber tune the Golang GC, and the Hacker News discussion, "Large-scale, semi-automated Go GC tuning (uber.com)", has some worthwhile points too.

The initial approach was to keep adjusting the GOGC value dynamically:

Our initial approach was to have a ticker to run every second to monitor the heap metrics, and then adjust GOGC value accordingly.

But this approach carries too much overhead:

The disadvantage of this approach is that the overhead starts to become considerable, because in order to read heap metrics Go needs to do a STW (ReadMemStats) and it is somewhat inaccurate, because we can have more than one garbage collection per second.
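To make the ticker idea concrete, here is a rough sketch of what such a loop might look like. It is not Uber's code: the names `gogcFor` and `tuneOnce`, the clamping bounds, and the 1 GiB target are all made up for illustration, and `HeapAlloc` is used only as a rough stand-in for the live heap. The calls `runtime.ReadMemStats` and `debug.SetGCPercent` are the real standard library ones.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// gogcFor is a hypothetical policy: choose a GOGC percentage so that the next
// collection starts roughly when the heap reaches limitBytes. The result is
// clamped to [25, 500]; those bounds are arbitrary choices for this sketch.
func gogcFor(liveBytes, limitBytes uint64) int {
	const minGOGC, maxGOGC = 25, 500
	if liveBytes == 0 {
		return maxGOGC
	}
	if limitBytes <= liveBytes {
		return minGOGC
	}
	pct := int((limitBytes - liveBytes) * 100 / liveBytes)
	if pct < minGOGC {
		return minGOGC
	}
	if pct > maxGOGC {
		return maxGOGC
	}
	return pct
}

// tuneOnce reads the heap metrics and applies a new GOGC value.
// ReadMemStats stops the world, which is the overhead the quote complains about.
func tuneOnce(limitBytes uint64) int {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	pct := gogcFor(m.HeapAlloc, limitBytes) // HeapAlloc approximates the live heap
	debug.SetGCPercent(pct)
	return pct
}

func main() {
	ticker := time.NewTicker(time.Second) // once per second, as in the article
	defer ticker.Stop()
	for i := 0; i < 3; i++ {
		<-ticker.C
		fmt.Println("GOGC set to", tuneOnce(1<<30)) // assumed 1 GiB target
	}
}
```

Since the loop only wakes up once per second, any GC cycles that happen in between are invisible to it, which matches the inaccuracy mentioned in the quote.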

The later approach uses SetFinalizer (and for some reason the article shows that piece of code as an image...):

Luckily we were able to find a good alternative. Go has finalizers (SetFinalizer), which are functions that run when the object is going to be garbage collected. They are mainly useful for cleaning memory in C code or some other resources. We were able to employ a self-referencing finalizer that resets itself on every GC invocation. This allows us to reduce any CPU overhead.
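Since the original code is only available as an image, here is a minimal sketch of the self-referencing finalizer trick as I understand it from the quote. This is not Uber's implementation: `gcSentinel`, `rearm`, `OnEveryGC`, and `countHookCalls` are hypothetical names. The idea is that an unreachable object's finalizer fires after a GC cycle, and the finalizer then registers itself again on the same object, so the callback runs once per cycle without any polling.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// gcSentinel is a hypothetical helper type. It is never referenced by the
// program after creation, so every GC cycle finds it unreachable.
type gcSentinel struct {
	onGC func()
}

// rearm is the finalizer. The runtime clears the finalizer before calling it,
// so setting it again on the same object makes it fire after the next cycle too.
func rearm(s *gcSentinel) {
	s.onGC()
	runtime.SetFinalizer(s, rearm)
}

// OnEveryGC arranges for fn to be called after each GC cycle.
// fn runs on the runtime's finalizer goroutine, so it should be quick.
func OnEveryGC(fn func()) {
	runtime.SetFinalizer(&gcSentinel{onGC: fn}, rearm)
}

// countHookCalls installs a hook and forces collections until the hook has
// fired `want` times or the attempts run out, then returns the observed count.
func countHookCalls(want int64) int64 {
	var calls int64
	OnEveryGC(func() { atomic.AddInt64(&calls, 1) })
	for i := 0; i < 100 && atomic.LoadInt64(&calls) < want; i++ {
		runtime.GC()
		time.Sleep(10 * time.Millisecond) // give the finalizer goroutine time to run
	}
	return atomic.LoadInt64(&calls)
}

func main() {
	fmt.Println("hook calls observed:", countHookCalls(3))
}
```

In a real tuner the callback would be the place to recompute GOGC, so the adjustment happens exactly once per collection instead of once per second.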

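Some people on Hacker News were also surprised by the figure of 70K cores across 30 services: for Uber's services that is a lot more than one would expect, and it only counts what runs on Golang, and only the part that was saved this time...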

Someone on Hacker News also mentioned that Golang is considering a soft memory limit design, which is worth a look: "runtime/debug: soft memory limit #48409" and "Proposal: Soft memory limit".
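For reference, a small sketch of what using a soft memory limit looks like, assuming a toolchain where this proposal has landed (Go 1.19 or later, where it is exposed as `debug.SetMemoryLimit` and the `GOMEMLIMIT` environment variable). The helper names `setSoftLimit` and `currentSoftLimit` and the 512 MiB value are made up for illustration.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setSoftLimit installs a soft memory limit in bytes and returns the previous one.
func setSoftLimit(bytes int64) int64 {
	return debug.SetMemoryLimit(bytes)
}

// currentSoftLimit reports the limit without changing it:
// a negative argument to SetMemoryLimit only queries the current value.
func currentSoftLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

func main() {
	prev := setSoftLimit(512 << 20) // assumed 512 MiB budget
	fmt.Println("previous limit:", prev)
	fmt.Println("current limit:", currentSoftLimit())
}
```

The limit is soft: the runtime collects more aggressively as the process approaches it but does not guarantee staying under it.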

MealMe: Comparing Food Delivery Platforms

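A while ago I saw "MealMe raises $900,000 for its food search engine". MealMe lets you compare delivery platforms against each other, though for now it appears to be available only in the US.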

It can pull out the cheapest option and the one with the fastest estimated delivery.

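I don't know whether Taiwan has a similar service. The biggest food delivery players in Taiwan right now are probably Foodpanda and Uber Eats, and the smaller ones include Foodomo, 有無外送, and 快點外送 (Taichung). I may well have missed some...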

California Court Rules That Uber and Lyft Drivers Are Employees

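There have already been many similar rulings in other regions. I'm noting this one in particular because California is where Uber and Lyft are headquartered: "Uber and Lyft ordered by California judge to classify drivers as employees".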

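The article mentions that last September the California government passed a bill (California Assembly Bill 5, or AB 5 for short) that writes the ABC test into law, replacing the earlier Borello test, to determine the employment relationship (whether a worker is an employee or an independent contractor):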

Under the ABC test, a worker is considered an employee and not an independent contractor, unless the hiring entity satisfies all three of the following conditions:

  • The worker is free from the control and direction of the hiring entity in connection with the performance of the work, both under the contract for the performance of the work and in fact;
  • The worker performs work that is outside the usual course of the hiring entity’s business; and
  • The worker is customarily engaged in an independently established trade, occupation, or business of the same nature as that involved in the work performed.

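All three conditions now have to hold for a worker to count as an independent contractor. There is still a chance to appeal, but the odds of overturning the ruling seem low. As I recall, the bill was aimed at Uber and Lyft in the first place...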

Two Pieces of Uber News: Layoffs and California's New Law...

Uber's stock has not looked good since the IPO. It opened at $45 in May and has recently fallen to around $33, so layoffs are not much of a surprise: "Uber lays off 435 people across engineering and product teams".

By headcount that is about 8%, and a large share comes from the engineering team (also not much of a surprise):

Uber has laid off 435 employees across its product and engineering teams, the company announced today. Combined, the layoffs represent about 8% of the organization, with 170 people leaving the product team and 265 people leaving the engineering team.

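The other related piece of news is that California passed a law to close a loophole. It restricts the practice of treating workers as non-employees on the basis of a "contract relationship", deems such arrangements to be employment, and therefore requires the employer to meet all the associated obligations: "California Bill Makes App-Based Companies Treat Workers as Employees".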

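The official legal text is at "AB-5 Worker status: employees and independent contractors.", which I skimmed quickly with a translator... You can see that occupations where workers have the upper hand were put into the exemption clauses, since workers in those fields have stronger bargaining power and the market should be left to set the rules. For industries where employers have the upper hand, the clauses are designed to protect workers.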

It already feels like the myth of the sharing economy is being punctured again and again...

Spain Restricts Uber's Operations Through New Regulations

Both Uber and Cabify are affected by the new rules: "Ride-hailing companies suspend Barcelona services after new regulations".

The new rules require passengers to book a ride at least fifteen minutes before pickup:

The Catalan government ruled that ride-hailing services could only pick up passengers after a 15-minute delay from the time they were booked.

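Rather than declaring the service illegal outright, this squeezes out on-demand service by other means... The approach will probably spread to other regions.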

Uber Will Be Shut Down in London

Uber will be shut down in London: "Uber has license to operate in London revoked", "London regulator announces Uber ban", and "Uber London loses licence to operate".

More precisely, the license will not be renewed, and the current license only runs until September 30:

Transport for London (TfL), which operates public transport in the capital, has made the decision not to renew the app-based taxi’s license in the city.

The license was renewed in May, but for a period of only five months. It will run out on 30th September, though the company will be allowed to continue to operate during the appeal process.

The main reason appears to revolve around Greyball (a tool that uses algorithms to evade law enforcement officers):

According to the TfL regulatory board, the ‘approach and conduct’ of Uber showed a lack of corporate responsibility, which could have resulted in public safety and security issues. It also raised concerns with the company’s ‘approach to explaining the use of Greyball, software that could be used to block regulatory bodies from gaining full access to the app.’

The new CEO came out and apologized: "Uber CEO apologizes for “mistakes” in London".

At heart this is a conflict between interest groups... and the show goes on.

The Uber Firestorm Spreads to Unroll

The story of Uber's CEO being summoned by Tim Cook for a talk was recently reported in "Uber’s C.E.O. Plays With Fire". It mentions that Uber tried to "identify" users' iPhones, which violates Apple's policies:

To halt the activity, Uber engineers assigned a persistent identity to iPhones with a small piece of code, a practice called “fingerprinting.” Uber could then identify an iPhone and prevent itself from being fooled even after the device was erased of its contents.

There was one problem: Fingerprinting iPhones broke Apple’s rules. Mr. Cook believed that wiping an iPhone should ensure that no trace of the owner’s identity remained on the device.

Uber's trick was to hide this behavior around the location of Apple's headquarters:

So Mr. Kalanick told his engineers to “geofence” Apple’s headquarters in Cupertino, Calif., a way to digitally identify people reviewing Uber’s software in a specific location. Uber would then obfuscate its code for people within that geofenced area, essentially drawing a digital lasso around those it wanted to keep in the dark. Apple employees at its headquarters were unable to see Uber’s fingerprinting.

Apple engineers then caught on, so Tim Cook called him in:

The ruse did not last. Apple engineers outside of Cupertino caught on to Uber’s methods, prompting Mr. Cook to call Mr. Kalanick to his office.

It also mentions that Uber bought Lyft receipt data collected through Unroll.me to use for analysis:

Using an email digest service it owns named Unroll.me, Slice collected its customers’ emailed Lyft receipts from their inboxes and sold the anonymized data to Uber. Uber used the data as a proxy for the health of Lyft’s business. (Lyft, too, operates a competitive intelligence team.)

Even juicier, this Hacker News thread spilled quite a lot, including that Unroll copies all of your email and dumps it onto S3:

I worked for a company that nearly acquired unroll.me. At the time, which was over three years ago, they had kept a copy of every single email of yours that you sent or received while a part of their service. Those emails were kept in a series of poorly secured S3 buckets. A large part of Slice buying unroll.me was for access to those email archives. Specifically, they wanted to look for keyword trends and for receipts from online purchases.

The founders of unroll.me were pretty dishonest, which is a large part of why the company I worked for declined to purchase the company. As an example, one of the problems was how the founders had valued and then diluted equity shares that employees held. To make a long story short, there weren't any circumstances in which employees who held options or an equity stake would see any money.

I hope you weren't emailed any legal documents or passwords written in the clear.

Meanwhile, the FAQ entry "If I delete my Unroll.Me account, what will happen to all of my previously rolled up emails?" claims that they do not store your emails.

Time to stock up on more popcorn...

LibreTaxi: An Alternative to Uber

"LibreTaxi" probably climbed the rankings because of #DeleteUber (for background on #DeleteUber, see "What You Need to Know About #DeleteUber").

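The description on the official site struck me as rather unusual. It was only after reading the ro31337/libretaxi README on GitHub that I got some sense of how it works: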

The system runs on Firebase and uses Telegram as the communication channel, and the whole service is built on that architecture... The project started in April 2016.

A Link zmx Posted Makes Me Even More Convinced That Uber's Problem Isn't Technical...

On Twitter I saw zmx share a link that discusses the problems with "How We Built Uber Engineering’s Highest Query per Second Service Using Go", an article Uber posted early this year.

It is quite amusing in light of recent events, especially the passage near the end of the post, which is brutally sarcastic XDDD:

It is clear to me that the team at Uber under-engineered this problem. Thoughtfully designing this service could trim down the number of nodes by an order of magnitude and save hundreds of thousands of dollars each year. That may sound like pittance to a company valued at more than the GDP of Delaware, but in my eyes that’s the salaries of a few engineers and a few good engineers can go a long way. Maybe even further than the few extra Mercedes-Benz S-Classes they could add to their fleet from the money they could be saving...

Politics aside, the Quadtree mentioned above is a simple, easy-to-understand structure. I haven't seen this data structure in a long time.
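As a refresher, here is a minimal point quadtree sketch. It is a generic textbook version, not the structure from Uber's service or from the linked critique. All names (`Quadtree`, `Rect`, `capacity`, `maxDepth`) and the constants are my own choices for illustration. Each node covers a half-open rectangle, holds up to `capacity` points, and splits into four quadrants when full.

```go
package main

import "fmt"

type Point struct{ X, Y float64 }

// Rect is the half-open box [MinX, MaxX) x [MinY, MaxY).
type Rect struct{ MinX, MinY, MaxX, MaxY float64 }

func (r Rect) Contains(p Point) bool {
	return p.X >= r.MinX && p.X < r.MaxX && p.Y >= r.MinY && p.Y < r.MaxY
}

func (r Rect) Intersects(o Rect) bool {
	return r.MinX < o.MaxX && o.MinX < r.MaxX && r.MinY < o.MaxY && o.MinY < r.MaxY
}

const (
	capacity = 4  // points per leaf before it splits
	maxDepth = 16 // leaves at this depth never split (guards against duplicates)
)

type Quadtree struct {
	bounds   Rect
	depth    int
	points   []Point
	children *[4]*Quadtree // nil while the node is a leaf
}

func New(bounds Rect) *Quadtree { return &Quadtree{bounds: bounds} }

// Insert adds p and reports false when p lies outside the tree's bounds.
func (q *Quadtree) Insert(p Point) bool {
	if !q.bounds.Contains(p) {
		return false
	}
	if q.children == nil {
		if len(q.points) < capacity || q.depth >= maxDepth {
			q.points = append(q.points, p)
			return true
		}
		q.split()
	}
	for _, c := range q.children {
		if c.Insert(p) {
			return true
		}
	}
	return false
}

// split turns a leaf into an inner node and pushes its points down one level.
func (q *Quadtree) split() {
	b, d := q.bounds, q.depth+1
	mx, my := (b.MinX+b.MaxX)/2, (b.MinY+b.MaxY)/2
	q.children = &[4]*Quadtree{
		{bounds: Rect{b.MinX, b.MinY, mx, my}, depth: d},
		{bounds: Rect{mx, b.MinY, b.MaxX, my}, depth: d},
		{bounds: Rect{b.MinX, my, mx, b.MaxY}, depth: d},
		{bounds: Rect{mx, my, b.MaxX, b.MaxY}, depth: d},
	}
	old := q.points
	q.points = nil
	for _, p := range old {
		q.Insert(p)
	}
}

// Query appends every stored point inside area to out and returns the result.
func (q *Quadtree) Query(area Rect, out []Point) []Point {
	if !q.bounds.Intersects(area) {
		return out // prune the whole subtree
	}
	for _, p := range q.points {
		if area.Contains(p) {
			out = append(out, p)
		}
	}
	if q.children != nil {
		for _, c := range q.children {
			out = c.Query(area, out)
		}
	}
	return out
}

func main() {
	qt := New(Rect{0, 0, 100, 100})
	for _, p := range []Point{{10, 10}, {20, 20}, {80, 80}, {90, 10}, {15, 12}} {
		qt.Insert(p)
	}
	fmt.Println(len(qt.Query(Rect{0, 0, 25, 25}, nil)), "points in the lower-left corner")
}
```

The pruning in `Query` is the whole point: a lookup only descends into quadrants that overlap the search area, instead of scanning every stored point.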

The Recent Discussion of Uber Moving from MySQL to PostgreSQL and Back to MySQL...

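First, the two links. One is "Migrating Uber from MySQL to PostgreSQL", given by Uber's Evan Klitzke at PyPgDay 2013. The original PDF link is dead (it looks like it was deleted), but in this internet age you can find a backup of anything... It is available at "Migrating Uber from MySQL to PostgreSQL", but that site looks a bit off, so I also put a copy on Google Docs.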

The other was published in 2016 on the company's official site by the same person, Evan Klitzke: "Why Uber Engineering Switched from Postgres to MySQL".

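In 2013 he described moving from MySQL to PostgreSQL, and in 2016 the same person came out to describe the reasons for moving from PostgreSQL to MySQL. It feels like a slap in his own face.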

Starting with the key point from 2013: the goal presented at the time was to use PostGIS.

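The 2016 article does not say a word about PostGIS. Instead it raises various performance problems: it spends a lot of space on the design of non-clustered versus clustered indexes, and on the difference in bandwidth cost during replication.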

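Setting PostGIS aside, if UPDATE really is what causes the performance problem, shouldn't the fix be sharding? Why switch to MySQL? After switching to MySQL you will still run into performance problems, and you will still have to work out a solution at the application layer.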

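The article reads more like internal technical and political issues tangled together: they switched to MySQL for political reasons, then dug up technical reasons to justify the switch XDDD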