Home » Computer » Archive by category "Search Engine" (Page 3)

WiGLE (Wireless Geographic Logging Engine)

WiGLE 是個蒐集無線網路資訊的服務 (i.e. SSID、mac address 以及定位位置),依照維基百科上的資料,WiGLE 計畫從 2001 年開始,到現在快十五年了:

The first recorded hotspot on WiGLE was uploaded in September 2001.

趁著最近 Pokémon Go 會跑來跑去,就順便幫忙蒐集資料了...

目前我蒐集資料的方式是透過 WiGLE 的 Android 應用程式 (Wigle Wifi Wardriving) 開著讓他背景跑,然後回到家以後上傳,過一陣子讓系統更新後就可以看到了。

看到 zmx 貼了之前的連結,更確信 Uber 的問題不是技術問題了...

Twitter 上看到 zmx 提了一個連結,講 Uber 年初時貼的「How We Built Uber Engineering’s Highest Query per Second Service Using Go」這篇文章的問題:

對照最近的事情還蠻有趣的,尤其是這篇文章後面提到的,酸~爆~了~XDDD:

It is clear to me that the team at Uber under-engineered this problem. Thoughtfully designing this service could trim down the number of nodes by an order of magnitude and save hundreds of thousands of dollars each year. That may sound like pittance to a company valued at more than the GDP of Delaware, but in my eyes that’s the salaries of a few engineers and a few good engineers can go a long way. Maybe even further than the few extra Mercedes-Benz S-Classes they could add to their fleet from the money they could be saving...

先不提政治問題,上面提到的 Quadtree 算是簡單易懂的結構,好久沒看到這個資料結構了:

Google 的 www.google.com 要上 HSTS 了

Google 宣佈 www.google.com 要上 HSTS 了:「Bringing HSTS to www.google.com」。

雖然都已經使用 EFFHTTPS Everywhere 在跑確保 HTTPS,但這個進展還是很重要,可以讓一般使用者受到保護...

Google 的切換計畫是會逐步增加 HSTS 的 max-age,確保中間出問題時造成的衝擊。一開始只會設一天,然後會逐步增加,最後增加到一年:

In the immediate term, we’re focused on increasing the duration that the header is active (‘max-age’). We've initially set the header’s max-age to one day; the short duration helps mitigate the risk of any potential problems with this roll-out. By increasing the max-age, however, we reduce the likelihood that an initial request to www.google.com happens over HTTP. Over the next few months, we will ramp up the max-age of the header to at least one year.

Google 推出 iOS 上的搜尋輸入法 Gboard

Google 推出了 iOS 上的搜尋輸入法 Gboard:「Meet Gboard: Search, GIFs, emojis & more. Right from your keyboard.」,可以在 App Store 上下載「Gboard — Search. GIFs. Emojis & more. Right from your keyboard.」,目前只有英文版可以用,其他語言還要等:

Get it now in the App Store in English in the U.S., with more languages to come.

可以參考 Google 做的 GIF 動畫,把搜尋的功能結合進去了:

John Gruber 寫了一篇「Gboard」關於隱私問題的討論,看起來還頗正面的。除了搜尋結果當然會需要傳到 Google 的系統裡以外,只有 crash log 會匿名傳回去。

WordPress.com 的 Elasticsearch

在「State of WordPress.com Elasticsearch Systems 2016」這邊描述他們 Elasticsearch 的架構。

有五個 cluster 打散,有跑 1.3.x 也有 1.7.x。把一般使用者與 VIP 分開,而全站的資料又是一組。另外在 2.3.x 的測試機上跑 en.support.wordpress.com 的資料 (看起來是短時間炸掉沒關係?XD)。

由於是自己生機器出來,所以機器的選擇上用大量的記憶體與 SSD 硬碟來換各種效能:

Typical data server config:
* 96GB RAM with 31GB for ES heap. Remaining gets used for file system caching
* 1-3 TB of SSD per server. In our testing SSDs are very worthwhile.

另外上面還是有疊 cache:

memcache timeouts vary from 30 seconds to 36 hours depending on use case

Google PageRank 資料將不再公開

Google 將不再對外公開 PageRank 資訊:「Google has confirmed it is removing Toolbar PageRank」與「RIP Google PageRank score: A retrospective on how it ruined the web」。

PageRank 資訊是透過 Google Toolbar 再反向被挖出來的,而 Toolbar 上的資訊將會拿掉,也預期對應的 API 應該也會關閉:

Google has confirmed with Search Engine Land that it is removing Toolbar PageRank. That means that if you are using a tool or a browser that shows you PageRank data from Google, within the next couple weeks it will begin not to show any data at all.

Google 內部還是會用,只是不會公開了...

Archives