YouTube holds a wealth of data, but you often don’t want to download it all to disk. How can we stream video into Python and then access the frame data?
Start with a conda environment containing Python and Jupyter Lab. At the time of writing, Python 3.7 was a good base environment that most packages supported:
conda create --name youtube_frames python=3.7
conda activate youtube_frames
conda install jupyterlab
Install OpenCV (try both C++ and Python):
conda install -c conda-forge opencv
pip install opencv-python
Install YouTube tools:
pip install youtube-dl
pip install pafy
Edit the backend_youtube_dl.py file in the conda environment directly (for me, it lives in ~/anaconda3/envs/youtube_frames/lib/python3.7/site-packages/pafy) to disable the “dislike info”, which YouTube no longer appears to support (comment out the marked line as below).
...
self._title = self._ydl_info['title']
self._author = self._ydl_info['uploader']
self._rating = self._ydl_info['average_rating']
self._length = self._ydl_info['duration']
self._viewcount = self._ydl_info['view_count']
self._likes = self._ydl_info['like_count']
---> # self._dislikes = self._ydl_info['dislike_count']
self._username = self._ydl_info['uploader_id']
self._category = self._ydl_info['categories'][0] if self._ydl_info['categories'] else ''
self._bestthumb = self._ydl_info['thumbnails'][0]['url']
self._bigthumb = g.urls['bigthumb'] % self.videoid
self._bigthumbhd = g.urls['bigthumbhd'] % self.videoid
self.expiry = time.time() + g.lifespan
...
Then open up a Jupyter Notebook and try out:
import cv2
import pafy
# Choose a YouTube URL (paste from viewing a video on the web)
url = "https://www.youtube.com/watch?v=BmrUJhY9teE"
# Use pafy to get the video stream url
video = pafy.new(url)
# Have a look at available streams
print("Streams : " + str(video.allstreams))
# But for now get best stream
best = video.getbest(preftype="mp4")
# Initialise OpenCV Video Capture Object with URL
capture = cv2.VideoCapture(best.url)
# Test out viewing very slowly frame by frame
while capture.isOpened():
    # Capture frame-by-frame
    ret, frame = capture.read()
    if ret:
        # Display the resulting frame
        cv2.imshow('Frame', frame)
        # Press Q on keyboard to exit
        if cv2.waitKey(25) & 0xFF == ord('q'):
            break
    else:
        # No more frames: break the loop
        break
# Release the stream and close the display window
capture.release()
cv2.destroyAllWindows()
You can then edit the above loop to perform processing instead of showing the frame.
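One way to do that is to wrap the loop in a small generator, so the processing code simply iterates over frames. This is only a sketch: `iter_frames` and `max_frames` are names invented here, not part of OpenCV or pafy, and it assumes `capture` behaves like the `cv2.VideoCapture` object above.

```python
def iter_frames(capture, max_frames=None):
    """Yield frames from an OpenCV-style capture until it runs dry.

    `capture` is assumed to offer isOpened() and read() like cv2.VideoCapture.
    `max_frames` is a hypothetical safety limit for long videos.
    """
    count = 0
    while capture.isOpened():
        if max_frames is not None and count >= max_frames:
            break
        ret, frame = capture.read()
        if not ret:
            # Stream ended or a frame could not be decoded
            break
        yield frame
        count += 1
```

With the capture from above you could then write `for frame in iter_frames(capture, max_frames=100): print(frame.shape)`, where each frame is a numpy array in BGR order.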

Playing Around with Streams
Not a lot of people realise that a YouTube video isn’t “one” thing: it’s legion. There are multiple video feeds of different types and resolutions, plus separate audio feeds. The pafy package can help us access these.
As above, to view the available streams, run:
print("Streams : " + str(video.allstreams))
You can access stream details using the properties of the stream object:
for i, stream in enumerate(video.videostreams):
    print(f"Stream {i} - type: {stream.extension} - resolution: {stream.dimensions} - bitrate: {stream.bitrate}")
The mp4 videos couldn’t be decoded in my setup, but the webm video worked fine. You can select a different stream by indexing into the list of streams and using that stream’s URL:
smallest = video.videostreams[1]
capture = cv2.VideoCapture(smallest.url)
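Rather than picking an index by eye, you could choose a stream from its properties. The helper below is a sketch under assumptions: `pick_smallest` is a made-up name, and it relies on each stream exposing `extension` and a `(width, height)` tuple in `dimensions`, as printed above.

```python
def pick_smallest(streams, extension="webm"):
    """Return the lowest-resolution stream with the given extension, or None."""
    candidates = [s for s in streams if s.extension == extension]
    if not candidates:
        return None
    # dimensions is assumed to be a (width, height) tuple; compare by pixel count
    return min(candidates, key=lambda s: s.dimensions[0] * s.dimensions[1])
```

For example, `smallest = pick_smallest(video.videostreams)` followed by `cv2.VideoCapture(smallest.url)`, remembering to handle the `None` case when no stream of that type exists.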

Playing with Colour Spaces
Ever wondered what a video stream looked like in YUV space?
Yes? Then try running the code below:
# Read until video is completed
while capture.isOpened():
    # Capture frame-by-frame
    ret, frame = capture.read()
    if ret:
        # Convert to YUV
        img_yuv = cv2.cvtColor(frame, cv2.COLOR_BGR2YUV)
        # Split into YUV planes
        y, u, v = cv2.split(img_yuv)
        # Show each plane in a separate window
        cv2.imshow('y', y)
        cv2.imshow('u', u)
        cv2.imshow('v', v)
        # Press Q on keyboard to exit
        if cv2.waitKey(25) & 0xFF == ord('q'):
            break
    else:
        # No more frames: break the loop
        break
# Release the stream and close the windows
capture.release()
cv2.destroyAllWindows()
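For a feel of what each plane holds, here is a rough per-pixel sketch of the conversion using the common BT.601-style coefficients. It is an approximation for illustration only: the function name `bgr_to_yuv_pixel` is made up here, and OpenCV’s own rounding and clipping may give slightly different values.

```python
def bgr_to_yuv_pixel(b, g, r):
    """Approximate YUV values for one 8-bit BGR pixel (illustrative only)."""

    def clip(x):
        # Keep values inside the 8-bit range
        return max(0.0, min(255.0, x))

    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma: weighted brightness
    u = 0.492 * (b - y) + 128              # blue-difference chroma, offset to mid-range
    v = 0.877 * (r - y) + 128              # red-difference chroma, offset to mid-range
    return clip(y), clip(u), clip(v)
```

Grey pixels (equal b, g and r) come out with u and v at about 128, which is why the U and V windows look flat mid-grey wherever the scene has little colour.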

Playing Audio Streams
This is a challenge for another day.
We have the audio stream URL from pafy. We know PyAudio can be used to access audio data and dump this to numpy. What we don’t know is how to open a web audio stream and access the data using PyAudio.
An alternative is maybe to use requests or urllib to directly retrieve frames of audio data and then convert the bytes to numpy arrays (as per here).
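The last step of that idea, turning raw bytes into a numpy array, might look like the sketch below. It assumes the bytes are already uncompressed 16-bit little-endian PCM samples; the audio streams YouTube serves are compressed, so they would need decoding first, and that part is not shown. The function name `pcm_bytes_to_array` is invented for illustration.

```python
import numpy as np


def pcm_bytes_to_array(data, channels=1):
    """Interpret raw 16-bit little-endian PCM bytes as a numpy array.

    Returns an array of shape (n_frames, channels).
    Assumes `data` is already decoded PCM, not a compressed stream.
    """
    # Drop a trailing odd byte so the buffer is a whole number of samples
    data = data[: len(data) - (len(data) % 2)]
    samples = np.frombuffer(data, dtype="<i2")
    # Drop any trailing partial frame so the reshape is valid
    usable = len(samples) - (len(samples) % channels)
    return samples[:usable].reshape(-1, channels)
```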