Music streaming platforms like Spotify house mountains of data—track titles, artists, album info—that can power analytics, apps, or even personal projects. Extracting that data isn’t always simple, but Python makes it surprisingly accessible.
If you want to dive deep into Spotify playlists, grab detailed info, and automate the process, this guide is for you. We’ll cover everything: setting up the right Python tools, scraping with Selenium and BeautifulSoup, using Spotify’s API, and saving your data cleanly.
What You Need
Run this command first:
pip install beautifulsoup4 selenium requests
Here’s the deal:
BeautifulSoup is your go-to for parsing static HTML content. Think: extracting a fixed list of tracks from a loaded page.
Selenium handles the tricky parts — dynamic content that loads on scroll or after clicking. It simulates a real browser’s behavior.
Requests is for straightforward API calls or simple HTTP requests.
Each has a unique strength. Use them wisely.
Prepare Selenium Using ChromeDriver
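Selenium doesn’t work alone. It needs a web driver, such as ChromeDriver, to operate the browser behind the scenes. Selenium 4.6 and later can usually fetch a matching driver automatically; the manual setup below is for older versions or machines where that download is blocked.
Download the ChromeDriver build that matches your installed Chrome version from the official source.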
Unzip and save it somewhere easy to find.
Test it quickly:
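from selenium import webdriver
from selenium.webdriver.chrome.service import Service
driver_path = "C:/webdriver/chromedriver.exe"  # Adjust this path
# Selenium 4 takes the driver location through a Service object
driver = webdriver.Chrome(service=Service(driver_path))
driver.get("https://google.com")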
If Chrome opens and loads Google, you’re good to go.
Scrape Spotify Playlist Data
Spotify playlists load tracks dynamically. So just fetching the page HTML won’t cut it. You need to:
Launch the page with Selenium.
Scroll down to load every song.
Parse the fully loaded HTML with BeautifulSoup.
Extract track title, artist, and duration from HTML elements.
Here’s a simplified picture of the markup you’re after (the live page uses auto-generated class names, so expect to inspect it yourself):
<div class="tracklist-row">
    <span class="track-name">Song Title</span>
    <span class="artist-name">Artist</span>
    <span class="track-duration">3:45</span>
</div>
Core Function to Scrape Spotify Playlist
Here’s a practical Python function that does the job:
from selenium import webdriver
from bs4 import BeautifulSoup
import time

def get_spotify_playlist_data(playlist_url):
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # Run without opening a window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(playlist_url)
        time.sleep(5)  # Allow page to load
        # Scroll to bottom to trigger loading of more tracks
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(3)
        html = driver.page_source
    finally:
        driver.quit()  # Close the browser even if loading fails
    # html.parser ships with Python, so no extra install is needed
    soup = BeautifulSoup(html, "html.parser")
    tracks = []
    # These class names are auto-generated and change over time;
    # inspect the live page and update them if nothing comes back.
    for track in soup.find_all(class_="IjYxRc5luMiDPhKhZVUH UpiE7J6vPrJIa59qxts4"):
        name = track.find(class_="e-9541-text encore-text-body-medium encore-internal-color-text-base btE2c3IKaOXZ4VNAb8WQ standalone-ellipsis-one-line")
        artist = track.find(class_="e-9541-text encore-text-body-small")
        duration = track.find(class_="e-9541-text encore-text-body-small encore-internal-color-text-subdued l5CmSxiQaap8rWOOpEpk")
        artist_link = artist.find("a") if artist is not None else None
        if name is None or artist_link is None or duration is None:
            continue  # Skip rows that don't match the expected layout
        tracks.append({"track title": name.text, "artist": artist_link.text, "duration": duration.text})
    return tracks
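The function above scrolls once, which may not be enough for a long playlist. Here is a minimal sketch of a repeat-until-stable scroll, assuming the page keeps growing `document.body.scrollHeight` as new rows load (the same assumption the single scroll makes). The names `scroll_until_stable`, `pause`, and `max_rounds` are illustrative, not part of Selenium.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is any object exposing Selenium's execute_script method.
    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded rows time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return round_number  # nothing new appeared, so stop
        last_height = new_height
    return max_rounds  # hit the safety cap (an arbitrary, hypothetical limit)
```

To try it, call `scroll_until_stable(driver)` in place of the single scroll and sleep, just before reading `driver.page_source`. If the track list lives in its own scrollable container, the script strings would need to target that element instead.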
Put It to Work
Swap in your own URL (the example below happens to point at an album page):
playlist_url = "https://open.spotify.com/album/7aJuG4TFXa2hmE4z1yxc3n?si=W7c1b1nNR3C7akuySGq_7g"
data = get_spotify_playlist_data(playlist_url)
for track in data:
    print(track)
Watch your console fill up with neatly scraped track info. Simple and effective.
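The scraped duration arrives as text such as "3:45", which is awkward to sum or sort. If you plan to feed the results into analytics, a small converter helps. This is a sketch that assumes durations always look like `m:ss` or `h:mm:ss`; `duration_to_seconds` and the `duration_seconds` key are illustrative names, and the sample row reuses the placeholder values from the markup shown earlier.

```python
def duration_to_seconds(text):
    """Convert "m:ss" or "h:mm:ss" into a number of seconds."""
    seconds = 0
    for part in text.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# Sample row in the same shape the scraper returns
tracks = [{"track title": "Song Title", "artist": "Artist", "duration": "3:45"}]
for track in tracks:
    track["duration_seconds"] = duration_to_seconds(track["duration"])
print(tracks[0]["duration_seconds"])  # 225
```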
Get Access to Spotify’s Official API
The API is the cleaner, officially supported, and more robust way to get data, provided you have the right credentials.
Step 1: Register your app
Head to the Spotify Developer Dashboard. Create your app. Copy your Client ID and Client Secret.
Step 2: Get your access token
import requests
import base64

CLIENT_ID = "your_client_id"
CLIENT_SECRET = "your_client_secret"
# Spotify expects "client_id:client_secret", Base64-encoded, in a Basic auth header
credentials = f"{CLIENT_ID}:{CLIENT_SECRET}"
encoded_credentials = base64.b64encode(credentials.encode()).decode()
url = "https://accounts.spotify.com/api/token"
headers = {
    "Authorization": f"Basic {encoded_credentials}",
    "Content-Type": "application/x-www-form-urlencoded"
}
# Named payload so it doesn't overwrite the scraped `data` list from earlier
payload = {"grant_type": "client_credentials"}
response = requests.post(url, headers=headers, data=payload, timeout=10)
response.raise_for_status()  # Stop here if the credentials were rejected
token = response.json().get("access_token")
print("Access Token:", token)
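The token response also carries an `expires_in` field (in seconds), so there is no need to request a fresh token before every call. Below is a minimal caching sketch. `TokenCache`, `fetch_token`, `clock`, and `margin` are hypothetical names, and `fetch_token` is assumed to be any function that performs the Step 2 request and returns the parsed JSON.

```python
import time


class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, margin=60):
        self._fetch_token = fetch_token  # callable returning the token JSON as a dict
        self._clock = clock              # injectable so the cache is easy to test
        self._margin = margin            # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            # Fall back to one hour if expires_in is missing (an assumption)
            self._expires_at = now + payload.get("expires_in", 3600) - self._margin
        return self._token
```

Wrap the Step 2 request in a function that returns `response.json()`, pass that function to `TokenCache`, and call `.get()` wherever the token is needed.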
Step 3: Use the token
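artist_id = "6qqNVTkY8uBg9cP3Jd7DAH"
url = f"https://api.spotify.com/v1/artists/{artist_id}"
# Send the token as a Bearer credential on every API request
headers = {"Authorization": f"Bearer {token}"}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # Surface expired tokens or bad IDs early
artist_data = response.json()
print(artist_data)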
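The same token can fetch a playlist's tracks without any scraping. The sketch below assumes the playlist items endpoint (`/v1/playlists/{playlist_id}/tracks`) and its paged response with `items` and `next` fields; check the current API reference before relying on it, since access rules differ by playlist type. `playlist_id_from_url` and `fetch_playlist_tracks` are illustrative helpers, and `session` is assumed to behave like `requests.Session`.

```python
from urllib.parse import urlparse

API_BASE = "https://api.spotify.com/v1"


def playlist_id_from_url(playlist_url):
    """Pull the ID out of a URL like https://open.spotify.com/playlist/<id>?si=..."""
    parts = [p for p in urlparse(playlist_url).path.split("/") if p]
    if len(parts) < 2 or parts[-2] != "playlist":
        raise ValueError(f"Not a playlist URL: {playlist_url}")
    return parts[-1]


def fetch_playlist_tracks(session, token, playlist_id):
    """Collect every track in a playlist by following the API's `next` links."""
    headers = {"Authorization": f"Bearer {token}"}
    url = f"{API_BASE}/playlists/{playlist_id}/tracks"
    tracks = []
    while url:
        response = session.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        page = response.json()
        for item in page.get("items", []):
            track = item.get("track")
            if not track:  # removed or unavailable entries can be null
                continue
            tracks.append({
                "track title": track["name"],
                "artist": ", ".join(a["name"] for a in track.get("artists", [])),
                "duration_ms": track["duration_ms"],
            })
        url = page.get("next")  # None on the last page ends the loop
    return tracks
```

With `requests` installed, a call might look like `fetch_playlist_tracks(requests.Session(), token, playlist_id_from_url(your_url))`. Note that durations come back in milliseconds here rather than as `m:ss` text.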
Save Data for Later
Don’t lose your scraped gold. Save it as JSON:
import json
# data is the track list returned by get_spotify_playlist_data()
with open('tracks.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False, indent=4)
print("Data saved to tracks.json")
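If the data is headed for a spreadsheet, CSV may be handier than JSON. Here is a small sketch using the standard library's `csv` module. It assumes each row is a dictionary with the `track title`, `artist`, and `duration` keys the scraper produces; `save_tracks_csv` is an illustrative name.

```python
import csv


def save_tracks_csv(tracks, path="tracks.csv"):
    """Write a list of track dictionaries to a CSV file with a header row."""
    fieldnames = ["track title", "artist", "duration"]
    # newline="" lets the csv module control line endings on every platform
    with open(path, "w", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops any keys not listed in fieldnames
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(tracks)
```

Calling `save_tracks_csv(data)` after the scrape writes `tracks.csv` next to the script.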
Respect the Rules While Scraping
Prefer the official Spotify API for compliance and stability.
If scraping, check robots.txt to respect site rules.
Slow down your requests. Avoid overwhelming servers.
Proxies can help avoid blocks, but use them ethically.
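Slowing down is easiest when the server says how long to wait. The sketch below retries a request when the response is 429 Too Many Requests, honoring the `Retry-After` header when present (assumed here to be a number of seconds) and backing off exponentially otherwise. `get_with_backoff`, `max_attempts`, and the injectable `sleep` are hypothetical names; `session` is assumed to behave like `requests.Session`.

```python
import time


def get_with_backoff(session, url, headers=None, max_attempts=5, sleep=time.sleep):
    """GET a URL, pausing and retrying when the server answers 429."""
    for attempt in range(max_attempts):
        response = session.get(url, headers=headers, timeout=10)
        if response.status_code != 429:
            response.raise_for_status()  # other errors are not retried
            return response
        # Honor Retry-After when present; otherwise wait 1, 2, 4, ... seconds
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else 2 ** attempt
        sleep(delay)
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts: {url}")
```

The same wrapper works for API calls and for plain page fetches; for scraping with Selenium, a fixed pause between page loads serves the same purpose.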
Final Thoughts
Combining Python’s scraping power with Spotify’s API opens the door to rich music data. Whether you’re building analytics dashboards, apps, or just geeking out on playlists, these tools make your workflow smooth and scalable.