Amazon Echo 會「聽」並且將資料送到第三方廣告平台

前陣子看到的研究報告,證實 Amazon Echo 會聽取資訊並且將這些資料送到第三方的廣告平台上 (會送到 Amazon 自家應該不算新聞):「Your Echos are Heard: Tracking, Profiling, and Ad Targeting in the Amazon Smart Speaker Ecosystem」。

先從 abstract 開始看,主要是目前這些 smart speaker 基本上沒有透明度,所以十位作者群們建立了一套評估用的 framework 用來測試各家 smart speaker 資訊蒐集已經影響的情況:

Smart speakers collect voice input that can be used to infer sensitive information about users. Given a number of egregious privacy breaches, there is a clear unmet need for greater transparency and control over data collection, sharing, and use by smart speaker platforms as well as third party skills supported on them. To bridge the gap, we build an auditing framework that leverages online advertising to measure data collection, its usage, and its sharing by the smart speaker platforms.

這次論文裡面提到的目標就是 Amazon Echo 會將聽到的東西分享給第三方的廣告平台,並且讓廣告平台可以調整競價 (賺更多的錢),而且這些資訊並沒有被揭露在 privacy policy 裡面:

We evaluate our framework on the Amazon smart speaker ecosystem. Our results show that Amazon and third parties (including advertising and tracking services) collect smart speaker interaction data. We find that Amazon processes voice data to infer user interests and uses it to serve targeted ads on-platform (Echo devices) as well as off-platform (web). Smart speaker interaction leads to as much as 30X higher ad bids from advertisers. Finally, we find that Amazon's and skills' operational practices are often not clearly disclosed in their privacy policies.

幾個比較重要的資訊,其中一個是「Network traffic distribution by persona, domain name, purpose, and organization」:

另外一個重點是哪些 3rd-party:

就心裡有個底,然後隔壁棚也有類似產品 (主業是做廣告的那家),大概要跑不掉...

FreeBSD 的 Amazon EC2 Image 打算自動使用本機空間當作 Swap

Twitter 上看到 Colin Percival 說計畫將 FreeBSD EC2 image (AMI) 自動偵測並使用 ephemeral disk 的空間當作 swap:

就算是使用 EBSgp2 或是 gp3,甚至是其他 VPS,我也很習慣開一點點的 swap 空間來用 (通常是用 file swap 的方式開 512MB,無論記憶體有多大),這算是我自己的 best practice 了,這可以把一些完全沒用到的 daemon 塞進 swap。

不過對於已經把 ephemeral disk 規劃拿來用的人可能會不太開心,需要去改設定...

Amazon EFS 的 file lock 限制

看到「Amazon EFS now supports a larger number of concurrent file locks」這篇提到:

This Amazon EFS update increases the number of simultaneous file locks an NFS mount can acquire to 65,536 (from 8,192 previously), enabling Amazon EFS to be used for a broader set of applications that heavily leverage file locking (including message broker and distributed analytics applications).

所以 NFS 的部份先前有 8192 個的上限在,現在則是拉 65536 個了。

不過提到 message broker,應該是以前的應用利用 NFS file lock 來做一些事情?

Amazon RDS for PostgreSQL 可以掛 155 台 Read Replica

看到 AWS 推出的新「功能」,可以讓 Amazon RDS for PostgreSQL 的 read replica 掛到 155 台:「Amazon RDS for PostgreSQL supports cascaded read replicas for up to 30X more read capacity」。

作法是透過三層架構,每台機器可以堆五台 replica:

Amazon Relational Database Service (Amazon RDS) for PostgreSQL announces support for PostgreSQL 14 with three levels of cascaded read replicas, 5 replicas per instance, supporting a maximum of up to 155 read replicas per source instance.

需要 PostgreSQL 14.1 或是之後的版本:

Starting with Amazon RDS for PostgreSQL 14.1 and higher, read intensive workloads such as data analytics can now benefit from up to 155 cascaded read replicas that offer up to 30 times higher read capacity versus previous versions of PostgreSQL, thereby reducing the load on source instance.

我記得 Amazon RDS for PostgreSQL 的 replica 是 EBS block-level replication,這種搞法還蠻有趣的 XDDD

Amazon SESv2 的 Deliverability Dashboard

其實是看到「Amazon SES V2 now supports email size of up to 40MB for inbound and outbound emails by default」這篇才注意到寄信的 Amazon SES 服務有了 SESv2,原文主要是講放寬信件的大小限制:

With this launch, the default message size limit in Amazon SES V2 increases from 10MB for email sending and 30MB for email receiving, to 40MB for both sending and receiving .

不過我跑去「Amazon SES pricing」看的時候意外翻到這個貴貴的東西:

The Deliverability Dashboard (via the SES API V2) is available for a fixed price of USD $1,250 per month. This charge includes reputation monitoring for up to five domains and 25 predictive email placement tests.

然後我試著去找 Deliverability Dashboard 是什麼,卻沒有專文介紹?(還是我找錯關鍵字...)

倒是在 2018 年的時候 Amazon Pinpoint 有個公告提到 Deliverability Dashboard,價錢也是 US$1,250/mo:「Amazon Pinpoint Announces a New Email Deliverability Dashboard to Help Customers Reach their Users' Inboxes」。

本來以為是 Amazon Pinpoint 的服務轉移掛到 SESv2 下,但看「Amazon Pinpoint Pricing」這邊,好像還是在啊...

雖然用不太到,但還是一頭霧水 XDDD

AWS Lambda 可以直接有 HTTPS Endpoint 了

AWS 宣佈 AWS Lambda 可以直接有一個 HTTPS Endpoint 了:「Announcing AWS Lambda Function URLs: Built-in HTTPS Endpoints for Single-Function Microservices」。

如同文章裡面提到的,先前得透過 API Gateway 或是 ALB 才能掛上 Lambda:

Each function is mapped to API endpoints, methods, and resources using services such as Amazon API Gateway and Application Load Balancer.

現在則是提供像 這樣的網域名稱給你用,而且看說明似乎是直接包含在本來的 Lambda 價錢內?就不用另外搞 API Gateway 或是 ALB 了:

Function URLs are included in Lambda’s request and duration pricing. For example, let’s imagine that you deploy a single Lambda function with 128 MB of memory and an average invocation time of 50 ms. The function receives five million requests every month, so the cost will be $1.00 for the requests, and $0.53 for the duration. The grand total is $1.53 per month, in the US East (N. Virginia) Region.

這讓我想到可以用 Lambda 當特製的 HTTP proxy 的專案,好像可以拿來整到 feedgen 裡面用?

PHP (以及 Laravel) 下使用 DynamoDB 的 ORM 工具

Twitter 上看到「Laravel DynamoDB Eloquent Models and Query Builder」這篇文章,裡面講「Laravel DynamoDB」這個套件,可以在 PHP (以及 Laravel) 下存取 DynamoDB

雖然套件提到了 Laravel,但文件裡面也有提到支援非 Laravel 的 PHP 環境下使用,單獨拿出來用也沒問題,比較重要的反倒是 DynamoDB 對各種 key 的概念。

如果是從零開始設計,但又不想要自己管資料庫,我會偏好先用 RDS 設計,無論是 MySQL 或是 PostgreSQL 的版本都行,畢竟 RDBMS 上面能做的事情比較多,對開發者比較友善,除非是第一天上線你就預期量會大到連 db.m5.24xlarge 都擋不住之類的情況...

AWS 將會把超過兩年的 EC2 AMI 自動設為 Deprecated

AWS 的公告,超過兩年的 EC2 AMIs (Amazon Machine Images) 將會被標為 deprecated:「Amazon EC2 now reduces visibility of public Amazon Machine Images (AMIs) older than two years」。

標成 deprecated 後主要的差異會是在 DescribeImages 這隻 API 上,除了 image 的擁有人外,其他人都不會顯示出來:

Once an AMI is deprecated, it will no longer appear in DescribeImages API calls for users that aren’t the owner of the AMI.

不過知道 AMI 的 id 還是可以直接開:

Users of a deprecated AMI can continue to launch instances and describe the deprecated AMI using its ID.

沒有特地說明原因,但應該是考慮到安全性,這年頭超過兩年不更新的系統大概都有一堆洞?不過馬上就想到 OpenBSD 好像未必...

Amazon RDS 的 Free Tier 方案包含了 db.t3.micro 與 db.t4g.micro

AWSdb.t3.microdb.t4g.micro 都放進 free tier 了:「Amazon RDS Free Tier now includes db.t3.micro, AWS Graviton2-based db.t4g.micro instances in all commercial regions」。

Customers new to AWS in the past 12 months and who were in regions where db.t2.micro was not available can now create free tier db.t3.micro or db.t4g.micro instances for the remainder of their first 12 months.

看說明是註冊的 12 個月內有這個方案可以用,可以拿來跑一些小東西...

昨天在 AWS User Group Taiwan 上分享的「High Availability Vault Service on AWS Environment」

昨天在「AWS User Group Taiwan Meetup 2022-03 線上 / 下小聚」這邊分享的主題,在講如何在 AWS 上弄出一個高可靠性的 Vault 服務。

投影片在 這邊可以抓到,我另外傳到 Speaker Deck 上面了:(好久沒用這個網站了?)

其實這類架構的設計有點像是 AWS 的 Solution Architect 在做的事情,如果一般的客戶開出類似的需求,應該也是會設計出類似的東西...

另外畢竟是在 AWS 的會議室裡面講,有些東西還是會避免提到,但裡面有很多概念是可以互換的,像是 Microsoft Azure 或是 GCP 上面都有可以抽換的服務,Vault 也都有支援。