
I Analyzed TED Talks on AI Going Back to 2003. Here are the Revelations

Darren Gillis
15 min read · Aug 23, 2024


This exercise had two main goals. The first was to find a central source of curated, easily accessible content offering opinions, facts, and predictions about AI, from early hypotheses about what might be possible to where we are today and what today’s predictions might reveal going forward. The second was to go beyond human commentary on the findings by parsing the contents into a data file and using present-day AI to help analyze and interpret the data.

The Dataset

I wanted to pull a collection of talks that went as far back as possible. I went directly to the source (ted.com) to locate the relevant talks rather than to YouTube or other sources. The videos are listed with topic tags, so I filtered for the “AI” tag. Here is the link with a sort parameter included to list the oldest talks first: https://www.ted.com/talks?sort=oldest&topics%5B0%5D=AI
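For readers who want to reproduce the listing link programmatically, here is a minimal sketch of how that URL can be assembled with Python's standard library. The function name `build_listing_url` is my own for illustration; only the base URL and the two query parameters come from the link above.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ted.com/talks"


def build_listing_url(topic="AI", sort="oldest"):
    """Build the ted.com listing URL for one topic tag, oldest talks first."""
    # urlencode percent-encodes the square brackets in "topics[0]" as %5B and %5D.
    query = urlencode({"sort": sort, "topics[0]": topic})
    return f"{BASE_URL}?{query}"


print(build_listing_url())
# https://www.ted.com/talks?sort=oldest&topics%5B0%5D=AI
```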

I extracted the data with a Python script that scraped the content. The basic metadata for the videos was relatively easy to scrape. The next step was to scrape the transcripts, which were already available for the majority of the videos. Luckily, each video had a link to a page that displayed the video along with its transcript (for example, Cynthia Breazeal: The rise of personal robots), so scraping the transcripts was also relatively problem-free.
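The scraping script itself is not reproduced here, but the link-collection step might look roughly like the sketch below. It is not the original script. It assumes that individual talk pages are linked with hrefs beginning with `/talks/`, which should be checked against the live page markup, and the sample HTML and its slug are made up purely to keep the example self-contained.

```python
from html.parser import HTMLParser


class TalkLinkParser(HTMLParser):
    """Collect unique hrefs that look like links to individual talk pages."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Assumption: talk pages live under /talks/<slug>.
        if href.startswith("/talks/") and href not in self.links:
            self.links.append(href)


def extract_talk_links(html):
    """Return talk-page hrefs in the order they first appear."""
    parser = TalkLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical listing fragment, not real ted.com markup.
sample_html = """
<a href="/talks/example_speaker_example_talk">Example talk</a>
<a href="/talks/example_speaker_example_talk">Duplicate link</a>
<a href="/about">About</a>
"""
print(extract_talk_links(sample_html))
# ['/talks/example_speaker_example_talk']
```

In a real run, the HTML would come from fetching the listing pages, and each collected link would then be visited to pull the metadata and transcript.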

Unfortunately, not all videos included transcripts. Still, I was able to locate another dozen or so of the referenced videos on YouTube, and I used an online service, https://tactiq.io/, to access and download YouTube’s automated transcripts.

As I went through the content, I decided to discard any podcast recordings, interviews, and prerecorded material, including animations (TED-Ed) and “off-stage” commentary. The talks collected were strictly of the canonical TED talk type: an individual on a stage presenting spoken material, typically for between 10 and 20 minutes.
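The exclusion rule can be expressed as a simple filter. This is only a sketch: the `Talk` record, its `format` field, and the label names are hypothetical stand-ins for whatever was noted while reviewing each item, and the sample records are placeholders rather than real talks.

```python
from dataclasses import dataclass

# Hypothetical labels for the kinds of material that were discarded.
EXCLUDED_FORMATS = {"podcast", "interview", "prerecorded", "animation", "off-stage"}


@dataclass
class Talk:
    title: str
    format: str  # hypothetical label assigned while reviewing the content


def keep_talk(talk):
    """Keep only items that are not in one of the excluded formats."""
    return talk.format not in EXCLUDED_FORMATS


# Placeholder records for illustration only.
candidates = [
    Talk("Placeholder A", "stage"),
    Talk("Placeholder B", "podcast"),
    Talk("Placeholder C", "animation"),
]
kept = [talk.title for talk in candidates if keep_talk(talk)]
print(kept)
# ['Placeholder A']
```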

The result was 209 talks given between 2003 and 2024, inclusive.


Written by Darren Gillis

I am a Toronto-based software developer and technologist with 20+ years of experience across a broad range of technologies and industries.
