So Waterfox was acquired by an ad company a long time ago...

Only after reading 「Waterfox G4.1.0 update reduces requirement to SSE 4.1, sets Startpage as the default search engine for Private Tabs」 did I learn that Waterfox had already been acquired by the advertising company System1 back at the end of 2019: 「Waterfox web browser sold to System1」. Waterfox only published its own announcement more than two months later, after Ghacks had written it up as news: 「Waterfox has joined System1」.

When I was researching alternatives to Google Chrome earlier, Waterfox was on my list of things to try when I got the chance; looks like I can skip it now...

This latest report also mentions that Waterfox now sets Startpage as the default search engine for Private Tabs; Startpage itself came up earlier in 「Startpage 被廣告公司收購」...

Taiwan's Fair Trade Commission fines 創業家兄弟 and 松果購物 over bait-and-switch SEO

I rarely cover domestic (Taiwan) news here, but this one has been all over the place for the past couple of days: 創業家兄弟 and 松果購物 (the latter also a 創業家兄弟 company) were fined by the Fair Trade Commission (公平會): 「操作SEO搜尋關鍵字誤導消費者 創業家兄弟、松果公司挨罰」. Backups of the article: Internet Archive, archive.today.

The FTC's official press release is at 「利用程式設計引誘消費者「逛錯街」,公平會開罰」, with backups of that page as well: Internet Archive, archive.today.

The ruling is based on Article 25 of the Fair Trade Act:

At its 1594th Commissioners' Meeting on April 12, the FTC resolved that 創業家兄弟股份有限公司 and 松果購物股份有限公司 had used search engine optimization (SEO) techniques to improperly display specific brand names in search engine results, misleading consumers into believing those marketplaces sold products of those brands in order to drive traffic to their own sites. This violates Article 25 of the Fair Trade Act, and the FTC imposed fines of NT$2,000,000 on 創業家兄弟 and NT$800,000 on 松果購物.

The text of that article can be found at 「公平交易法§25-全國法規資料庫」:

In addition to what is provided for in this Act, no enterprise shall otherwise have any deceptive or obviously unfair conduct that is able to affect trading order.

The main problem is that after clicking through, the product simply isn't there:

The FTC found that when consumers searched Google for a specific brand name, for example 「悅夢床墊」, the results would include entries such as 「悅夢床墊的熱銷搜尋結果│生活市集」 and 「人氣熱銷悅夢床墊口碑推薦品牌整理─松果購物」. Consumers drawn in by these results and clicking through to the 「生活市集」 or 「松果購物」 sites would then find that the marketplaces carried no 「悅夢床墊」 products at all; this was the result of SEO techniques applied by 創業家兄弟 and 松果購物, the operators of the two sites.

Moreover, the sites would generate matching pages from whatever keywords users searched for on the sites themselves:

The FTC's further investigation found that 創業家兄弟 and 松果購物 had designed their 「生活市集」 and 「松果購物」 sites so that once any visitor searched for 「悅夢床墊」 on either site, the site's code would automatically generate a marketing landing page for that term for search engines to crawl, even though neither marketplace actually sold 「悅夢床墊」. When consumers later searched Google for 「悅夢床墊」, results such as 「悅夢床墊的熱銷搜尋結果│生活市集」 and 「人氣熱銷悅夢床墊口碑推薦品牌整理─松果購物」 would show up, and clicking them would lead to the 「生活市集」 or 「松果購物」 sites.

And the part about the penalty:

The FTC has previously found that using a competitor's company name as a keyword ad, and displaying the competitor's name alongside the ad, violates Article 25 of the Fair Trade Act. In this case, 創業家兄弟 and 松果購物 did not directly buy keyword ads on third-party brands such as 「悅夢床墊」, but the end result is essentially the same bait-and-switch deception: it disrupts consumers' normal search and purchase process and creates unfair competition against merchants who actually sell those brands. Left unchecked, other competitors would likely copy the practice, consumers would find it even harder to tell whether search results are genuine, and competition in the e-commerce market and consumer interests would suffer. The FTC therefore found a violation of Article 25's prohibition on deceptive or obviously unfair conduct capable of affecting trading order, and fined 創業家兄弟 NT$2,000,000 and 松果購物 NT$800,000.

So this counts as a penalty against the dark-pattern side of SEO...

How Americans use social media

Saw the article 「Social Media Usage by Age」, which charts how Americans use social media; the underlying data is Pew Research Center's 「Social Media Fact Sheet」.

It's obvious that Google (Alphabet) basically dominates with a single product, YouTube, while Facebook (Meta) has three products with wide reach: Facebook, Instagram and WhatsApp.

LinkedIn only picks up once people enter the workforce, and it's a surprise how many older folks are on Pinterest XDDD

Zero-downtime migration across clouds

Saw the thread 「Ask HN: Have you ever switched cloud?」 about moving between clouds; vidarh's answer in particular is worth reading...

First, the reasons, which basically all come down to money: they moved from one cloud to another, and then on to dedicated hosting:

Yes. I once did zero downtime migration first from AWS to Google, then from Google to Hetzner for a client. Mostly for cost reasons: they had a lot of free credits, and moved to Hetzner when they ran out.

Their savings from using the credits were at least 20x what the migrations cost.

He also posted his whole checklist. The first step is to stand up a load-balancer-like layer on both sides (a small Consul DNS sketch follows the quote below):

* Set up haproxy, nginx or similar as reverse proxy and carefully decide if you can handle retries on failed queries. If you want true zero-downtime migration there's a challenge here in making sure you have a setup that lets you add and remove backends transparently. There are many ways of doing this of various complexity. I've tended to favour using dynamic dns updates for this; in this specific instance we used Hashicorp's Consul to keep dns updated w/services. I've also used ngx_mruby for instances where I needed more complex backend selection (allows writing Ruby code to execute within nginx)
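
As a rough illustration of the dynamic-DNS approach mentioned above (the service name and port are hypothetical, not from the comment): Consul exposes registered services over its DNS interface on port 8600, and the reverse proxy layer, or a templating tool such as consul-template, can rebuild its backend list from answers like these.

# query Consul's DNS interface for the SRV records of a hypothetical "app" service;
# backends appear and disappear here as they register/deregister with Consul
dig @127.0.0.1 -p 8600 app.service.consul SRV +short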

Next, connect the two internal networks, which is essentially a site-to-site VPN (a hypothetical WireGuard sketch follows):

* Set up a VPN (or more depending on your networking setup) between the locations so that the reverse proxy can reach backends in both/all locations, and so that the backends can reach databases both places.
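
The comment doesn't say which VPN they used; as one concrete (and purely hypothetical) way to build the site-to-site tunnel, a minimal WireGuard setup on one end might look like this, with keys, addresses and the peer endpoint all placeholders:

# create the tunnel interface and give it an address on the cross-site network
ip link add dev wg0 type wireguard
ip addr add 10.99.0.1/24 dev wg0
# point it at the peer in the other location
wg set wg0 private-key /etc/wireguard/privatekey peer <PEER_PUBLIC_KEY> endpoint peer.other-site.example:51820 allowed-ips 10.99.0.0/24
ip link set wg0 up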

Then set up database replication to the new location, along with the machinery around it (a promotion sketch follows the quoted steps):

* Replicate the database to the new location.

* Ensure your app has a mechanism for determining which database to use as the master. Just as for the reverse proxy we used Consul to select. All backends would switch on promoting a replica to master.

* Ensure you have a fast method to promote a database replica to a master. You don't want to be in a situation of having to fiddle with this. We had fully automated scripts to do the failover.
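
Since the comment later mentions promoting a Postgres replica in a couple of seconds, the "fast method to promote" probably boils down to something like the following; the data directory path and version are assumptions:

# on the replica in the new location, once replication has caught up (PostgreSQL 12+)
psql -c "SELECT pg_promote();"
# or equivalently via pg_ctl
pg_ctl promote -D /var/lib/postgresql/14/main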

Then make sure the application side can switch over cleanly:

* Ensure your app gracefully handles database failure of whatever it thinks the current master is. This is the trickiest bit in some cases, as you either need to make sure updates are idempotent, or you need to make sure updates during the switchover either reliably fail or reliably succeed. In the case I mentioned we were able to safely retry requests, but in many cases it'll be safer to just punt on true zero downtime migration assuming your setup can handle promotion of the new master fast enough (in our case the promotion of the new Postgres master took literally a couple of seconds, during which any failing updates would just translate to some page loads being slow as they retried, but if we hadn't been able to retry it'd have meant a few seconds downtime).

Once you've confirmed the new cloud has enough capacity to absorb the traffic, the actual migration starts, beginning with lowering the DNS TTL:

Once you have the new environment running and capable of handling requests (but using the database in the old environment):

* Reduce DNS record TTL.
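
Lowering the TTL itself is just an edit at your DNS provider well ahead of the cutover; the part worth scripting is checking that resolvers actually see the new value (the domain here is a placeholder):

# the number after the record name is the TTL as seen by this resolver,
# e.g. "www.example.com. 60 IN A 203.0.113.10"
dig +noall +answer www.example.com A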

Then point the existing load balancer at the new backends; if problems show up at this stage, it's quick to roll back:

* Ensure the new backends are added to the reverse proxy. You should start seeing requests flow through the new backends and can verify error rates aren't increasing. This should be quick to undo if you see errors.
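
With HAProxy as the reverse proxy, "add the new backends, but keep it quick to undo" maps nicely onto the runtime API over the admin socket; a sketch with made-up backend/server names and addresses:

# repoint a pre-declared server slot at a machine in the new environment and enable it
echo "set server be_app/new-web-1 addr 10.99.0.21 port 8080" | socat stdio /run/haproxy/admin.sock
echo "set server be_app/new-web-1 state ready" | socat stdio /run/haproxy/admin.sock
# rolling back is just flipping the slot into maintenance mode again
echo "set server be_app/new-web-1 state maint" | socat stdio /run/haproxy/admin.sock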

Next, point DNS at the new load balancer; in theory nothing dramatic should happen here:

* Update DNS to add the new environment reverse proxy. You should start seeing requests hit the new reverse proxy, and some of it should flow through the new backends. Wait to see if any issues.

Then switch the database over to the new site; if anything goes wrong you can quickly switch back and figure out what happened:

* Promote the replica in the new location to master and verify everything still works. Ensure whatever replication you need from the new master works. You should now see all database requests hitting the new master.
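
A couple of quick sanity checks on the newly promoted master (again assuming PostgreSQL, as in the earlier promotion step):

# should return 'f' once the promotion has completed
psql -c "SELECT pg_is_in_recovery();"
# confirm whatever replicas you still need are attached to the new master
psql -c "SELECT client_addr, state FROM pg_stat_replication;"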

The final phase is tearing down the old setup:

* Drain connections from the old backends (remove them from the pool, but leave them running until they're not handling any requests). You should now have all traffic past the reverse proxy going via the new environment.

* Update DNS to remove the old environment reverse proxy. Wait for all traffic to stop hitting the old reverse proxy.

* When you're confident everything is fine, you can disable the old environment and bring DNS TTL back up.
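
On the HAProxy side, draining the old backends is another pair of runtime API calls (names are again hypothetical); DRAIN stops new connections while letting in-flight ones finish:

echo "set server be_app/old-web-1 state drain" | socat stdio /run/haproxy/admin.sock
# watch the current-session column (scur) for the old servers drop to zero before powering them off
echo "show stat" | socat stdio /run/haproxy/admin.sock | grep old-web-1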

This approach isn't really cloud-specific; anyone who has planned a data center migration has probably drawn up something similar, and the overall strategy is the same (analyze stateful services and stateless services separately). The difference is that without the cloud's rent-on-demand elasticity, you have to provision a lot more hardware up front...

I remember Instagram had a similar plan when it moved into Facebook's data centers, which I wrote about before: 「Instagram 從 AWS 搬到 Facebook 機房」.

As for Taiwan, the recent example seems to be PChome 24h moving its infrastructure onto GCP? We'll see whether they end up presenting the migration story at one of GCP's events...

Brendan Gregg leaves Netflix

Brendan Gregg announced that he is leaving Netflix: 「Netflix End of Series 1」. He also showed up in the Hacker News thread to answer questions: 「Netflix End of Series 1 (brendangregg.com)」.

Some of the questions are pretty entertaining, like the one about the size of his desk:

Off topic: I’m a bit surprised about Gregg’s desk (pre-pandemic). I imagine he’s getting a top level salary at Netflix but yet he’s got a small desk in what it looks to me a shared small office (or perhaps is that a mini open space office? Can’t tell).

Presumably he got asked because there's a photo of it in the post.

His answer:

A number of times people have asked about my desk over the years, and I'm curious as to why! I've visited other tech companies in the bay area, and the desks I see (including for 7-figure salary engineers) are the same as everyone else, in open office layouts. At Netflix it's been open office desks, and all engineers have the same desk.

Does some companies give bigger desks for certain staff, or offices, or is it a country thing (Europe?).

There's no word yet on what the next job will be:

I'll still be posting here in my next job. More on that soon...

A community-maintained package for YouTube's private API

Also from today's Hacker News Daily: a library that drives YouTube through its private API: 「Youtube.js – full-featured wrapper around YouTube's private API (github.com/luanrt)」.

The private API in question is the one YouTube itself uses on its own site:

A full-featured wrapper around the Innertube API, which is what YouTube itself uses.

And since this isn't a public API, there's no need to apply for a key:

Do I need an API key to use this?

No, YouTube.js does not use any official API so no API keys are required.

Of course you should expect it to break without warning, so weigh that yourself before relying on it...

The more amusing part is that people in the Hacker News thread were instead asking how to detect this kind of library or bot XDDD

If you’re YouTube or any site, and want to stop these sort of wrappers - what’s the easiest way to do so without breaking your own site?

I find this task to be an interesting engineering problem.

A related question is if there’s an unspoofable way to detect a client.

I skimmed the thread, though, and there doesn't seem to be much of an answer...

moreutils

Saw 「Moreutils: A collection of Unix tools that nobody thought to write long ago (joeyh.name)」 on today's Hacker News Daily, about the moreutils collection.

I've covered chronic before in 「當程式沒問題時就會吃掉輸出的 chronic」, and the discussion on the original post goes over the other tools as well. For example, sponge only opens the target file for writing after it has consumed all of stdin from the pipe, which avoids the problem of the shell truncating the file first:

awk '{do_stuff()}' myfile.txt | sort -u | column --table > myfile.txt

In this example myfile.txt has already been truncated by the shell before awk gets to read it, so awk sees nothing; with sponge on the receiving end, the file is only written once the whole pipe has finished:

awk '{do_stuff()}' myfile.txt | sort -u | column --table | sponge myfile.txt

There's also vipe, which drops a program's output into $EDITOR and then passes the edited result further down the pipe, for example:

git branch | vipe | xargs git branch -D

There are more tools in the collection; I've made moreutils part of my standard install...
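
For reference, on Debian/Ubuntu the whole collection installs as a single package, and chronic (mentioned above) is handy in crontabs because it only prints output, and therefore only triggers cron mail, when the command fails; the backup script path below is just a placeholder:

sudo apt install moreutils
# crontab entry: stay silent unless the script exits non-zero
0 3 * * * chronic /usr/local/bin/backup.sh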

A tool for repackaging Snap packages as Flatpak

A few days ago I came across 「unsnap」, a tool that converts Snap packages into Flatpak ones, though the README itself notes the software isn't very mature yet:

Let's say it's "Pre-alpha", as in "It kinda works on my computer".

Still, it looks like something worth playing with, and Flatpak does keep eating up more and more market share...

Ptt officially turns off Telnet

Per 「[公告] Ptt 即日起關閉無加密的 telnet 連線方式」, the sysop announced that unencrypted Telnet access has been shut down, so the remaining options should be WebSocket and SSH, both of which run over encrypted channels...
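
For reference, SSH access works with Ptt's standard bbs account, if I remember correctly:

ssh bbs@ptt.cc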

I still reach for Telnet occasionally, for example to test an SMTP server by hand, and SMTP is one of the few protocols these days that still gives everyone headaches XDDD
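
The kind of by-hand SMTP test I mean is a short session like this; hostnames and addresses are placeholders, and you type the protocol lines yourself after connecting:

telnet mail.example.com 25
EHLO client.example.com
MAIL FROM:<me@example.com>
RCPT TO:<someone@example.net>
DATA
Subject: manual test

just checking that the server accepts mail
.
QUIT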

Another page of galactic history has been turned...

The current meltdown of Jira Cloud and related services (still ongoing...)

Atlassian seems to have blown up Jira Cloud and a bunch of related services recently. I was planning to wait until things settled down before digging into what happened, but after seeing the tweet estimating another two weeks to go, I figured I'd better write it down now or I'll forget...:

In 「Multiple sites showing down/under maintenance」 you can see the blowup started around the Tomb Sweeping Day holiday, and as of yesterday's update they had only restored service for 35% of the affected customers:

A small number of Atlassian customers continue to experience service outages and are unable to access their sites. Our global engineering teams are working 24/7 to make progress on this incident. At this time, we have rebuilt functionality for over 35% of the users who are impacted by the service outage, with no reported data loss. The rebuild stage is particularly complex due to several steps that are required to validate sites and verify data. These steps require extra time, but are critical to ensuring the integrity of rebuilt sites. We apologize for the length and severity of this incident and have taken steps to avoid a recurrence in the future.

Posted 19 hours ago. Apr 11, 2022 - 08:27 UTC

So roughly a third recovered after being down for a week, which makes the official estimate of another two weeks sound about right? There's also a big thread on Hacker News: 「Atlassian products have been down for 4 days (atlassian.com)」.

The Register has a series of articles as well, which reveal more than the official updates do: 「Atlassian Jira, Confluence outage persists two days on」, 「Atlassian outage lingers, sparking data loss fears」, 「Day 7 of the great Atlassian outage: IT giant still struggling to restore access」, 「At last, Atlassian sees an end to its outage ... in two weeks」.

The subtitle of the first article hints at the cause:

'Routine maintenance script' blamed for derailed service for unlucky customers

The second one mentions that roughly 400 customers are affected:

We were also told that the incident affects a relatively small number of Atlassian customers: about 400. That's only 0.18 per cent of the company's 226,000 customers, which isn't much consolation to the several hundred who still can't access their data.

I'll come back later to see what that "routine maintenance script" actually was...