This article is aimed at beginners. I want to teach you a bit about coding tools such as downloaders for video hosting providers. Of course, you could use a ready-made library for this task, and that is a perfectly good solution. However, if you prefer to learn how to do it yourself, read on and learn how to download a Vimeo video using Python!
A bit of back-story
I wrote code for downloading videos from Vimeo in my last project using JavaScript, because the whole project was based on NodeJS. But I think Python is a more readable language for learning, so I will write this small tool together with you in Python.
By the way, I made a post about How to protect content against scrapers. Check it out if you are interested in scraping, JavaScript, etc.
So I could just as well have named this article "how to download Vimeo video using NodeJS" or something else. In general, it doesn't matter which language you use.
I'll split the article into logical parts to make it more comfortable to read. Let's go forward.
Attention!
I made this post and this code for learning purposes. I don't recommend it for commercial use. I'm not responsible for any damage caused by this code. Please don't use this code unlawfully, and avoid illegal activities.
Set up an empty project
This part is not really related to the article's theme. You just need to create a main.py file and use a locally installed Python. I plan to use version 3.10, but I guess my code will work fine in versions like 3.9, 3.7, etc.
Theory
When you watch a video in the browser, everything happens automatically. The browser executes the page code and starts downloading the video. Once there is enough data to show the video, you can press play and the browser will start playing it.
We need to do the same: find out the URL of the video file and then download it.
Let's take some video as an example. I took this one randomly. It looks strange, but we don't need to care about the content.
Getting video ID
Our first goal is to get the ID of the video. The numbers at the end of the video URL are the video ID, so let's extract them into a separate variable.
# set target video url
target_video_url = 'https://vimeo.com/712159936'
# remove slash from the end of the url
if target_video_url[-1] == '/':
    target_video_url = target_video_url[:-1]
# get video id from url
video_id = target_video_url.split('/')[-1]
# check the result
print(video_id)
Let me explain a bit. To get the video ID we used the split function, which turns a string into a list by splitting it at a given character. I added one more check to make sure there is no slash at the end of the URL. Without that check, a URL ending in a slash would give an empty string as the last element instead of the ID.
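The same step can also be written as a small function. This is only a sketch, not the code used in the rest of the article: the name extract_video_id is made up for illustration, and it assumes the ID is always the last non-empty part of the URL path. It uses urlparse from the standard library, so a query string after the ID does not get in the way.

```python
from urllib.parse import urlparse


def extract_video_id(video_url):
    """Return the last non-empty path segment of a video URL."""
    # urlparse separates the path from any query string or fragment
    path = urlparse(video_url).path
    # drop empty segments so leading or trailing slashes do not matter
    segments = [segment for segment in path.split('/') if segment]
    if not segments:
        raise ValueError('no video id found in: ' + video_url)
    return segments[-1]


# both calls print the same id
print(extract_video_id('https://vimeo.com/712159936/'))
print(extract_video_id('https://vimeo.com/712159936?share=copy'))
```

Filtering out empty segments replaces the manual check for a trailing slash.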
Getting project JSON config
What is the next step? Let me show you. Open the Vimeo website and then open the web inspector in your browser. I am using Safari, so for me it's the Option + CMD + I shortcut. Open the Network tab and choose XHR/Fetch or similar. This tab shows all requests that the browser sends for this page. Now reload the page and look at the results.
You should see a request named "config". This is what we are looking for.
The server responds with JSON data, and that JSON data contains the real URLs of the video files. If we have the real video URL, we can easily download the file. That's why we need to get the config to download a Vimeo video using Python.
If you open the Headers tab for this request (it might be named differently in your browser), you can see which request type (GET) and which URL we need.
Let's write code for downloading the Vimeo video config. I would like to use the requests library. You can install it with pip via the command: pip install requests
# import the requests library (pip install requests)
import requests
# set video config url
video_config_url = 'https://player.vimeo.com/video/' + video_id + '/config'
# send get request to get video json config
video_config_response = requests.get(video_config_url)
# generate obj from json
video_config_json = video_config_response.json()
# check the result
print(video_config_json)
I ran this code and got the following result. It looks like everything works OK.
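The request above assumes that the server always answers with valid JSON. Here is a slightly more defensive sketch of the same step. The function names build_config_url and fetch_video_config are made up for this example, and the check for a numeric ID is only an assumption based on the example URL in this article.

```python
def build_config_url(video_id):
    """Build the player config URL used in this article."""
    video_id = str(video_id)
    # a purely numeric id is an assumption based on the example URL
    if not video_id.isdigit():
        raise ValueError('unexpected video id: ' + video_id)
    return 'https://player.vimeo.com/video/' + video_id + '/config'


def fetch_video_config(video_id, timeout=10):
    """Request the config and return it as a dict, failing loudly on errors."""
    # imported here so build_config_url stays usable without the dependency
    import requests

    response = requests.get(build_config_url(video_id), timeout=timeout)
    # raise requests.HTTPError on 4xx/5xx instead of parsing an error page
    response.raise_for_status()
    return response.json()


print(build_config_url('712159936'))
```

The timeout keeps the script from waiting forever, and raise_for_status stops it early if the server refuses the request.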
Download Vimeo video using Python
We are interested in the field video_config_json['request']['files']['progressive']. It contains several URLs for different qualities.
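To see which qualities are on offer, a small helper can list them. This is a sketch under assumptions: the function name list_progressive_files and the sample config are made up for illustration, and the keys quality, height and url are the ones used in this article. The real config contains many more fields.

```python
def list_progressive_files(config):
    """Return (quality, height, url) tuples from a config dict."""
    # .get() with defaults avoids a KeyError when a level is missing
    files = config.get('request', {}).get('files', {}).get('progressive', [])
    return [(item.get('quality'), item.get('height'), item.get('url')) for item in files]


# hypothetical, heavily shortened config used only for illustration
sample_config = {
    'request': {
        'files': {
            'progressive': [
                {'quality': '360p', 'height': 360, 'url': 'https://example.com/360.mp4'},
                {'quality': '720p', 'height': 720, 'url': 'https://example.com/720.mp4'},
            ]
        }
    }
}

for quality, height, url in list_progressive_files(sample_config):
    print(quality, height, url)
```

If the list comes back empty, the config has no direct file URLs in that field and the code in the next step would fail on index 0.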
Let's ignore quality choices for now and just download the first available one.
# make variable for video config
video_config = video_config_json['request']['files']['progressive'][0]
# get video url
video_url = video_config['url']
# prepare file name for that video
video_name = video_id + '_' + video_config['quality'] + '.mp4'
# download video
video_response = requests.get(video_url)
# open file and write content there (the file is closed automatically)
with open(video_name, 'wb') as video_file:
    video_file.write(video_response.content)
# print result
print('downloaded: ' + video_name)
It works! You can check it yourself.
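One thing to keep in mind: the code above loads the whole video into memory before writing it to disk. For big files it may be better to download in chunks. Here is a sketch of that idea. The names write_chunks and download_video and the chunk size are my assumptions for this example, while stream=True, iter_content and raise_for_status are regular parts of the requests library.

```python
def write_chunks(chunks, file_name):
    """Write an iterable of byte chunks to file_name and return the byte count."""
    total = 0
    with open(file_name, 'wb') as video_file:
        for chunk in chunks:
            # skip empty chunks
            if chunk:
                video_file.write(chunk)
                total += len(chunk)
    return total


def download_video(video_url, file_name, chunk_size=1024 * 1024, timeout=30):
    """Stream video_url to file_name without holding the whole file in memory."""
    # imported here so write_chunks stays usable without the dependency
    import requests

    # stream=True delays downloading the body until we iterate over it
    with requests.get(video_url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        return write_chunks(response.iter_content(chunk_size=chunk_size), file_name)
```

With the variables from the code above you would call it as download_video(video_url, video_name).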
Choosing quality
Okay, but what if I need a specific quality of the video? Let's add a quality selection. For example, let's say we need a Vimeo video with a height near 480px, which is usually called "480p quality". Maybe you noticed that each entry has "width" and "height" fields, so you can compare the entries and select the required one. However, what if the required quality isn't available?
Let's write code that selects the video with the height closest to our goal.
# target video height
target_video_height = 480
# video config
target_video_config = None
# check all videos and find the closest one
for video_config in video_config_json['request']['files']['progressive']:
    # take the first video as the starting candidate
    if target_video_config is None:
        target_video_config = video_config
        continue
    # get video height
    video_height = video_config['height']
    # calculate how far each height is from the target
    video_height_diff = abs(target_video_height - video_height)
    target_video_height_diff = abs(target_video_height - target_video_config['height'])
    # keep the video that is closer to the target height
    if video_height_diff < target_video_height_diff:
        target_video_config = video_config
Maybe the code looks long, but it is pretty simple. We loop over each video config. If the target video config is None, we just set target_video_config to the current video config, so the first iteration always does the same thing.
In each following iteration, we calculate the difference in pixels between the target height and the height of the current video, and do the same for the best candidate found so far. The abs function is used to make the numbers always positive.
If the difference for the current video is smaller than for the candidate, the current video becomes the new candidate.
After the loop ends, target_video_config holds the video whose height is closest to our target.
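The same selection can be written in a shorter way with the built-in min function and a key. This is an alternative sketch, not the code from my project: the name pick_closest_height and the sample list are made up for illustration.

```python
def pick_closest_height(video_configs, target_height):
    """Return the config whose height is closest to target_height."""
    if not video_configs:
        raise ValueError('no video files to choose from')
    # min() compares the absolute height differences and keeps the smallest
    return min(video_configs, key=lambda item: abs(target_height - item['height']))


# hypothetical entries with only the fields needed here
sample_files = [
    {'quality': '240p', 'height': 240},
    {'quality': '360p', 'height': 360},
    {'quality': '720p', 'height': 720},
    {'quality': '1080p', 'height': 1080},
]

print(pick_closest_height(sample_files, 480)['quality'])  # prints 360p
```

When two videos are equally close, min returns the first one in the list, which matches the behavior of the loop above.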
Conclusions
After the whole code was ready, I refactored it. You can download the code from my GitHub.
As you can see, the task is pretty easy, but if you still have questions, let me know.
Also, please let me know if you see any errors in my code or text. English is not my native language, so I may have made typos, mistakes, etc.
Don't forget to leave a comment if my post helped you learn something.