Next, I needed help analyzing and deciphering the naming conventions and more technical aspects of the documentation. I’ve worked with APIs a bit, but it’s been 20 years since I wrote code and 6 years since I practiced SEO professionally.
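Several of the author's friends at Google, as well as some ex-Googlers, confirmed that the document matches Google's internal documentation standards, and the way its elements are laid out also looks a lot like Google documentation.
I originally figured that would be about it: from here, lots of people would dig through the document to work out Google's SEO preferences and find the places where it contradicts what Google has said publicly.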
"We would caution against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information," Google spokesperson Davis Thompson told The Verge in an email. "We’ve shared extensive information about how Search works and the types of factors that our systems weigh, while also working to protect the integrity of our results from manipulation."
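Nobody is going to buy the "outdated" part anyway, but it's easy to predict that Google's search results will get worse again, because more SEO spam will now try to game its way up the rankings...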
To this day, my most public contribution to reddit is that I wrote the code to put the title of the post in the URL. That was done specifically for SEO purposes.
Google Webmasters (now called Google Search Console) also handled Reddit as a special case here: the crawl rate was forced to Custom, and the Reddit staff could not change it:
It was pretty much the only SEO optimization we ever did (along with a few DOM changes), because shortly after that, Google basically dedicated engineering effort specifically to crawling reddit. So much so that we lost the "crawl rate" button in our SEO admin page on Google, it was just set to "Custom".
I had to stand up a fleet of app servers and caches and databases, and change the load balancers so that Google basically had their own infrastructure (although we would shunt all crawlers there). Crawler traffic was very different than regular traffic -- it looked at pages more than two days old, something humans rarely did at the time. It would blow out every cache (memory, database, disk, etc.). So we put them on their own infra to keep them from killing the rest of the site.
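The "shunt all crawlers there" idea can be sketched roughly as below. This is only an illustration of routing by User-Agent, not what reddit actually ran: the pool names and the `pick_pool` helper are hypothetical, and the token list is an assumption covering a few well-known crawlers.

```python
# Substrings that commonly appear in crawler User-Agent headers.
# This list is an assumption for illustration; a real deployment would maintain its own.
CRAWLER_TOKENS = ("googlebot", "bingbot", "duckduckbot")


def pick_pool(user_agent: str) -> str:
    """Return the name of the (hypothetical) backend pool for a request."""
    ua = user_agent.lower()
    if any(token in ua for token in CRAWLER_TOKENS):
        # Crawlers hit old pages and would evict hot entries from shared caches,
        # so they get their own app servers, caches and databases.
        return "crawler-pool"
    return "main-pool"


if __name__ == "__main__":
    print(pick_pool("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
    print(pick_pool("Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0"))
```

In practice this decision would live in the load balancer configuration rather than in application code; the function only shows the rule being applied.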
Idempotence ([...]) is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application.
The mathematical definition goes like this:
An element x of a magma (M, •) is said to be idempotent if:
x • x = x.
If all elements are idempotent with respect to •, then • is called idempotent. The formula ∀x, x • x = x is called the idempotency law for •.
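To make the definition concrete, here is a small sketch that checks the law x • x = x over a finite sample of integers. The helper name `is_idempotent_op` is made up for this example, and checking a sample is of course not a proof for the whole set.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def is_idempotent_op(op: Callable[[T, T], T], elements: Iterable[T]) -> bool:
    """Check the idempotency law x • x = x for every x in a finite sample."""
    return all(op(x, x) == x for x in elements)


if __name__ == "__main__":
    sample = range(-5, 6)
    # max(x, x) is always x, so max is idempotent on this sample.
    print(is_idempotent_op(max, sample))                 # True
    # x + x == x holds only for x == 0, so addition is not idempotent.
    print(is_idempotent_op(lambda a, b: a + b, sample))  # False
```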
A request method is considered "idempotent" if the intended effect on the server of multiple identical requests with that method is the same as the effect for a single such request. Of the request methods defined by this specification, PUT, DELETE, and safe request methods are idempotent.
Request methods are considered "safe" if their defined semantics are essentially read-only; i.e., the client does not request, and does not expect, any state change on the origin server as a result of applying a safe method to a target resource. Likewise, reasonable use of a safe method is not expected to cause any harm, loss of property, or unusual burden on the origin server.
The standard HTTP methods are then classified as follows:
+---------+------+------------+---------------+
| Method  | Safe | Idempotent | Reference     |
+---------+------+------------+---------------+
| CONNECT | no   | no         | Section 4.3.6 |
| DELETE  | no   | yes        | Section 4.3.5 |
| GET     | yes  | yes        | Section 4.3.1 |
| HEAD    | yes  | yes        | Section 4.3.2 |
| OPTIONS | yes  | yes        | Section 4.3.7 |
| POST    | no   | no         | Section 4.3.3 |
| PUT     | no   | yes        | Section 4.3.4 |
| TRACE   | yes  | yes        | Section 4.3.8 |
+---------+------+------------+---------------+
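The table translates directly into a lookup. The sketch below encodes it as data and uses it for one typical decision, whether a client may repeat a request automatically after a connection failure. The names `MethodProps`, `HTTP_METHODS` and `can_retry_automatically` are made up for this example.

```python
from typing import NamedTuple


class MethodProps(NamedTuple):
    safe: bool
    idempotent: bool


# Values copied from the table above.
HTTP_METHODS = {
    "CONNECT": MethodProps(safe=False, idempotent=False),
    "DELETE":  MethodProps(safe=False, idempotent=True),
    "GET":     MethodProps(safe=True,  idempotent=True),
    "HEAD":    MethodProps(safe=True,  idempotent=True),
    "OPTIONS": MethodProps(safe=True,  idempotent=True),
    "POST":    MethodProps(safe=False, idempotent=False),
    "PUT":     MethodProps(safe=False, idempotent=True),
    "TRACE":   MethodProps(safe=True,  idempotent=True),
}


def can_retry_automatically(method: str) -> bool:
    """An idempotent request can be repeated without changing the intended effect."""
    return HTTP_METHODS[method.upper()].idempotent


if __name__ == "__main__":
    # Every safe method in the table is also idempotent, but not the other way round.
    print(all(p.idempotent for p in HTTP_METHODS.values() if p.safe))
    print(can_retry_automatically("put"), can_retry_automatically("post"))
```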
“Log out” links that should be forms with a “log out” button—you can always style it to look like a link if you want.
“Unsubscribe” links in emails that immediately trigger the action of unsubscribing instead of going to a form where the POST method does the unsubscribing. I realise that this turns unsubscribing into a two-step process, which is a bit annoying from a usability point of view, but a destructive action should never be baked into a GET request.
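Both of these actions change state on the server side, so they should not use GET, and I often forget the first one myself... For log out you can use a form to issue a POST request and style it with CSS so that it looks just like an ordinary link.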
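A minimal sketch of the log out case, written as a plain WSGI app so it needs nothing outside the standard library. Everything specific here is an assumption for illustration: the `/logout` path, the `sid` cookie name, the in-memory `SESSIONS` set and the inline styles. A real site would also want CSRF protection on the form.

```python
from http.cookies import SimpleCookie

# Hypothetical in-memory session store, for illustration only.
SESSIONS = {"abc123"}

# A form whose submit button is styled to look like an ordinary link.
LOGOUT_FORM = """
<form method="post" action="/logout" style="display:inline">
  <button type="submit" style="background:none;border:0;padding:0;
          color:#06c;text-decoration:underline;cursor:pointer">Log out</button>
</form>
"""


def app(environ, start_response):
    """WSGI app: /logout changes state only on POST; other paths show the form."""
    if environ.get("PATH_INFO") != "/logout":
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [LOGOUT_FORM.encode("utf-8")]

    if environ["REQUEST_METHOD"] != "POST":
        # GET must stay safe: a crawler or prefetcher following a link
        # should never be able to log the user out.
        start_response("405 Method Not Allowed",
                       [("Allow", "POST"), ("Content-Type", "text/plain")])
        return [b"Log out with the form instead.\n"]

    morsel = SimpleCookie(environ.get("HTTP_COOKIE", "")).get("sid")
    if morsel is not None:
        SESSIONS.discard(morsel.value)  # removing an absent session is a no-op
    start_response("303 See Other", [("Location", "/")])
    return [b""]
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` should be enough. The same pattern applies to unsubscribe links: the GET shows a confirmation form, and only the POST changes anything.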
Googlebot uses a web rendering service (WRS) that is based on Chrome 41 (M41). Generally, WRS supports the same web platform features and capabilities that the Chrome version it uses — for a full list refer to chromestatus.com, or use the compare function on caniuse.com.
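It mentions a few things worth noting, such as WebSocket not being supported, so pages that care about how they show up in Google search results need to pay attention to error handling...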