自動校正字幕時間的軟體 subsync

看到用 Python 寫的「smacke/subsync」這個軟體,可以自動校正字幕時間:

Language-agnostic automatic synchronization of subtitles to video, so that subtitles are aligned to the correct starting point within the video.

他把上面這個變成下面這個:

因為不是用 machine learning 所以速度意外的快。演算法是對影片本身產生一個 array,然後對字幕也產生 array,最後對兩個 array 用個簡單的公式校準:

  • Discretize video and subtitles by time into 10ms windows.
  • For each 10ms window, determine whether that window contains speech. This is trivial to do for subtitles (we just determine whether any subtitle is "on" during each time window); for video, use an off-the-shelf voice activity detector (VAD) like the one built into webrtc.
  • Now we have two binary strings: one for the subtitles, and one for the video. Try to align these strings by matching 0's with 0's and 1's with 1's. We score these alignments as (# matching digits) - (# mismatched digits).

就這樣解決問題 XD

Google Hangouts 將會使用固定 IP 位置

Google 宣佈 Hangouts 將會使用固定 IP 位置,讓系統管理員更方便管理:「Dedicated Hangouts Meet IP addresses」。

We’re adding a range of official, fixed IP addresses to be used exclusively for classic Hangouts and Hangouts Meet in G Suite domains.

目前看到的資訊是:

IPv4: 74.125.250.0/24
IPv6: 2001:4860:4864:5::0/64

然後文章裡的說明是二月 14 日會變更:

Hangouts Meet and classic Hangouts will stop using the old IP address on February 14, 2019.

管理者可以調整關於這段 IP 的流量,通常是調高優先權?不過要調低也是可以...

Twitch 用 VP9 直播...

Twitch 整理了一篇「How VP9 delivers value for Twitch’s esports live streaming」,說明他們用 VP9 的經驗談。

裡面有很大的篇幅是在講 VP9 與 H.264 的比較,不過這兩個用的技術就已經不是同一個年代了,沒有進步的話就不用出來玩了...

裡面有講到一些有趣的東西,像是提到是用 FPGA 即時壓縮:

In this article, we will show that the FPGA-based real-time VP9 encoding can deliver at least 25% bitrate savings compared to the highest-quality H.264 encoders deployed in Twitch’s production today.

然後提到 1080p60 至少省了 25% 的頻寬 (這邊應該是相較於 H.264):

VP9’s Compression Efficiency for Live 1080p60 Encoding: We Can Achieve At Least 25% Bitrate Savings

查了一下,在桌機上的瀏覽器都差不多支援了:

VP9 is implemented in these web browsers:

Chromium and Google Chrome (usable by default since version 29 from May and August 2013, respectively),
Opera (since version 15 from July 2013),
Mozilla Firefox (since version 28 from March 2014),
Microsoft Edge (as of summer 2016).

行動裝置的話 Android 4.4+ 有支援,但在 iOS 上沒有支援...

整體看起來普及率算是不低,可以引入當主力 codec 降低頻寬成本,當設備不支援 VP9 時 (應該只有 iOS 透過 Safari 觀看的情況) 就用 H.264 stream 提供服務。

把 YouTube 對 channel 與 user 的自動播放功能關掉...

YouTube 在 channel 與 user 頁面會自動播放會讓人覺得頗困擾 (頁面一打開就有聲音),所以想要找看看有沒設定可以關掉... 找了之後發現很久前就有被問過,但是當時得到沒有這個功能的回答:「How do I DISABLE autoplay from other channels on my YouTube channel?」。

既然如此就只能找套件來解了... 目前是透過 Userscript 擋下自動播放,程式碼不長也蠻好懂的:「Disable YouTube Channel/User Home Page Video AutoPlay」。

這樣總算是不會被聲音搞到...

不吃電池的 HD Camera Streaming...

Hacker News Daily 上看到「Towards Battery-Free HD Video Streaming」這個,不使用電池僅靠反射產生訊號,可以達到 HD 畫質的 Camera Streaming (在原型機上測試可以跑出 720p/10fps):

Finally, we design a proof-of-concept prototype with off-the-shelf hardware components that successfully backscatter 720p HD video at 10 fps up to 16 feet.

而且畫質比想像中好很多,算是比「可用」的等級還高不少:

愈來愈多在研究用 backscatter 拼一些比較複雜的應用...

Twitter 放出來的 Vireo,一套 Open Source 授權的 Video Processing Library

Twitter 放出 Vireo,一套以 MIT License 釋出的 Video Processing Library:「Introducing Vireo: A Lightweight and Versatile Video Processing Library」。專案庫在 GitHubtwitter/vireo 可以取得。

C++ 寫的,另外也已經提供 Scala 的接口,這應該是讓 Twitter 的人可以方便使用:

Vireo is a lightweight and versatile video processing library that powers our video transcoding service, deep learning recognition systems and more. It is written in C++11 and built with functional programming principles. It also optionally comes with Scala wrappers that enable us to build scalable video processing applications within our backend services.

在 Tools 的部份也可以看到很多功能,像是:

thumbnails: extracts keyframes from the input video and saves them as JPG images

viddiff: checks if two video files are functionally identical or not (does not compare data that does not affect the playback behavior)

另外要注意的是,預設不會將 GPL 的套件納入編譯,需要指定 --enable-gpl 才會編進去:

The following libraries are disabled by default. To enable GPL licensed components, they have to be present in your system and --enable-gpl flag have to be explicitly passed to configure

看起來主要就是最常見的那包... (libavformat / libavcodec / libavutil / libswscale / libx264)

AWS re:Invent 2017 的影片

Twitter 上看到 Jeff Barr 引用了這份 Gist:「Links to YouTube recordings of AWS re:Invent 2017 sessions」。

由於今年開的規模又比去年大不少,影片相當多... 可以用關鍵字找來看。

Amazon Kinesis Streams 的 Video 版本:Amazon Kinesis Video Streams

這次 AWS 推出的 Amazon Kinesis Video Streams 在技術上看起來跟 Amazon Media Services 有不少重疊 (參考先前提到的文章「AWS Media Services 推出一卡車與影音相關的服務...」),但產品面上區隔開的服務:「Amazon Kinesis Video Streams – Serverless Video Ingestion and Storage for Vision-Enabled Apps」。

開頭介紹就有提到適合用在各種 IoT 裝置,用在一直有影像資料產生的設備上:

Cell phones, security cameras, baby monitors, drones, webcams, dashboard cameras, and even satellites can all generate high-intensity, high-quality video streams. Homes, offices, factories, cities, streets, and highways are now host to massive numbers of cameras.

像這張圖的所介紹的流程,以及可以保留天數的設計:

底層用了不少與 Amazon Media Services 相同的技術,但是包裝成不同的產品...

去電視廣告的服務又來了...

看到「Plex’s DVR now lets you skip the commercials… by removing them for you」這篇文章,介紹 Plex 要推出去電視廣告的服務...

維基百科上的介紹比較清楚:「Plex (software)」,主要有兩個元件組成,media server 與 player:

  • The Plex Media Server desktop application runs on Windows, macOS and Linux-compatibles including some types of NAS devices. The 'server' desktop application organizes video, audio and photos from your collections and from online services, enabling the players to access and stream the contents.
  • The media players. There are official clients available for mobile devices, smart TVs, and streaming boxes, a web app and Plex Home Theater (no longer maintained), as well as many third-party alternatives.

然後這次要推出的功能是直接在錄影的時候把廣告拿掉:

Plex confirmed it’s rolling out a new feature that will allow cord cutters to skip the commercials in the TV programs recorded using its software, making the company’s lower-cost solution to streaming live TV more compelling. Unlike other commercial-skip options, Plex’s option will remove commercials from recordings automatically.

這讓我有些印像... 當年 TiVo 也有類似的功能,不過文章裡有提到 TiVo 是提供 skip 而非直接拿掉:

The new feature works by locating the commercials in your recorded media. It then actually removes them before the media is stored in your library. That sounds like it could be even better than TiVo’s commercial skipping option, for example, because you don’t have to press a button to skip the ads — they’re being pulled out for you, proactively.

不過主要是認識了 Plex 這個軟體... 如果是電視兒童的話應該用的到 XD 台灣目前的電視節目好像看的比較少...

AWS Media Services 推出一卡車與影音相關的服務...

AWS 推出了一連串 AWS Elemental MediaOOXX 一連串影音相關的服務:「AWS Media Services – Process, Store, and Monetize Cloud-Based Video」。

但不是所有的服務都是相同的區域... 公告分別在:

不過這邊還是引用 Jeff Barr 文章裡的說明,可以看到從很源頭的 transencoding 到 DRM,以及 Live 格式,到後續的檔案儲存及後製 (像是上廣告) 都有:

AWS Elemental MediaConvert – File-based transcoding for OTT, broadcast, or archiving, with support for a long list of formats and codecs. Features include multi-channel audio, graphic overlays, closed captioning, and several DRM options.

AWS Elemental MediaLive – Live encoding to deliver video streams in real time to both televisions and multiscreen devices. Allows you to deploy highly reliable live channels in minutes, with full control over encoding parameters. It supports ad insertion, multi-channel audio, graphic overlays, and closed captioning.

AWS Elemental MediaPackage – Video origination and just-in-time packaging. Starting from a single input, produces output for multiple devices representing a long list of current and legacy formats. Supports multiple monetization models, time-shifted live streaming, ad insertion, DRM, and blackout management.

AWS Elemental MediaStore – Media-optimized storage that enables high performance and low latency applications such as live streaming, while taking advantage of the scale and durability of Amazon Simple Storage Service (S3).

AWS Elemental MediaTailor – Monetization service that supports ad serving and server-side ad insertion, a broad range of devices, transcoding, and accurate reporting of server-side and client-side ad insertion.

引個前同事的 tweet,先不說 Amazon SWF 的情況 (畢竟 Amazon SWF 還可以找到其他用途),倒是 Amazon Elastic Transcoder 很明顯要被淘汰掉了:

這種整個大包的東西是 AWS re:Invent 才有的能量,平常比較少看到...