Streaming music is a pretty polarizing topic for many people. I’m still in club Spotify for the moment, but I’ve become more interested lately in re-examining that choice.
There are obviously a lot of factors at play if I were to decide to switch - how artists are paid, what kinds of creators are promoted on the streaming platform, the quality of the product itself… But the one that interests me the most right now is the enormous amount of data Spotify has about my music tastes. That data is what drives their recommendation algorithm, and it’s also a record of what I’ve been listening to that I’d probably miss if I switched platforms.
And most importantly, if I lost my listening history, it would make it hard for me to put together my music library again.
JamWise is all about controlling my own music tastes, including knowing what I like and dislike, why I feel that way, and applying that knowledge to new music to more efficiently find my favorites. In short, I’m making a manual set of tools that will do the same thing the Spotify algorithm has done for me in the past. So naturally, I was interested in what kind of data Spotify has about me. Of course, there’s probably no chance they are sharing all the data they have with me, but let’s see.
How to Download Your Spotify Data
I recently learned of the feature in Spotify that allows you to request a downloadable file of your user data. You can follow the steps below (for the web browser version of Spotify):
Go to Profile → Account → Privacy Settings
After clicking Privacy Settings, scroll down to the bottom of the page to find a section titled “Download Your Data.”
You have to request your data, and after a few days Spotify will send you an email with a download link. In the screenshot above, I’ve already requested and received my data.
Download the files to your computer. You can see some account information, along with some streaming and playlist data, plus some other files that aren’t so clear yet.
The data files are in .json format, which is just a data storage format commonly used in web applications. It looks a little messy, but you can see some details if you poke around - I used Firefox to open the file here, but later we’ll use some other methods to get the data out. Below is an example from my “Streaming History” data file.
Understanding Your Spotify Data
To understand my Spotify data, I had to bring it into a format I can understand - I’m used to Excel, so I used that. I followed the steps in this link to import the JSON files to Excel. I’ll use Excel for the files where the data needs to be organized better.
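If Excel's import wizard isn't your thing, a short Python script can do a similar conversion. This is only a sketch, not the method from the linked steps: it assumes the file holds a flat list of records (which is how the streaming history file appears to be laid out), and the function name is made up for illustration. Nested files like the playlist file would need flattening first.

```python
import csv
import json
from pathlib import Path


def json_records_to_csv(json_path, csv_path):
    """Write a JSON array of flat objects to a CSV file; return the row count."""
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))

    # Collect column names in first-seen order, in case records differ slightly.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    # utf-8-sig writes a byte-order mark, which helps Excel detect the encoding.
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

Calling json_records_to_csv("StreamingHistory0.json", "StreamingHistory0.csv") from the folder with the downloaded files should leave a CSV next to the original that opens in Excel with a double-click.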
Spotify is delivering my extended listening history separately, so that will be a HUGE amount of data that I might need to process in a different way. Excel is clunky with huge datasets, but it works for now.
Here’s the result for the “Streaming History” data from above:
So for that file, Spotify tracks the end time of my listening session, artist and track name, and the number of ms (milliseconds) listened. It actually tracks the ms for each listening session, but I imported the data as a total time listened instead. You can do it either way.
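For anyone who'd rather do that totalling outside Excel, here's a rough Python sketch of the same idea: add up the milliseconds from every session of the same track. The field names (endTime, artistName, trackName, msPlayed) are assumptions based on the columns described above, so check them against your own file. The sample rows are invented.

```python
from collections import defaultdict


def total_ms_by_track(records):
    """Sum msPlayed for each (artist, track) pair across all sessions."""
    totals = defaultdict(int)
    for record in records:
        key = (record["artistName"], record["trackName"])
        totals[key] += record["msPlayed"]
    # Most-played tracks first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented sample rows, shaped like the fields described above (assumed names).
sample_history = [
    {"endTime": "2023-01-01 10:00", "artistName": "Artist A",
     "trackName": "Song 1", "msPlayed": 180000},
    {"endTime": "2023-01-02 11:00", "artistName": "Artist A",
     "trackName": "Song 1", "msPlayed": 60000},
    {"endTime": "2023-01-02 11:05", "artistName": "Artist B",
     "trackName": "Song 2", "msPlayed": 200000},
]

for (artist, track), ms in total_ms_by_track(sample_history):
    # 60,000 ms in a minute.
    print(f"{artist} - {track}: {ms / 60000:.1f} min")
```

Keeping the per-session rows and totalling them on demand means nothing is lost: the same data can still be grouped by day or by artist later.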
I was interested in what data is tracked in each file, so here’s a short summary of the headings after importing to Excel - I’ll include examples where it’s not private data.
The first 2 files, which for me are titled “DuoNewFamily” and “FamilyPlan”, are account data files, named that way because those are the plans I have subscribed to now and in the past. They are simple text files that only contain my billing address.
Next is “Follow”, which is a simple count of my followers and accounts I follow on Spotify:
“Identifiers” contains my email address.
“Identity” contains the following info from my profile:
“Inferences” contains what I assume are the categories that Spotify has lumped me into from my podcast listening. I don’t remember selecting these, so I think this is an automated thing. I didn’t screenshot the first couple of items because they’re kind of account-y.
I can see how what I’ve listened to in the past would lead Spotify to draw these conclusions - these look like the categories of the podcasts I’ve listened to on Spotify.
“Marquee” is an interesting one. It contains hundreds of artist names, and it apparently assigns me a label based on how much I listen to that artist. If I had to guess, Spotify probably uses this for the Spotify Wrapped playlist and things like that, but it’s not totally clear to me. It mostly contains “Previously Active Listeners” classifications, though; I couldn’t find any “Currently Active Listeners” labels in there. Not sure why.
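With hundreds of entries, counting the labels is easier than scrolling for them. Here's a small sketch that tallies how many artists sit under each label. The field names ("artistName" and "segment") are guesses at how the file is laid out, so adjust them to match what you see in your own copy, and the sample entries are invented.

```python
from collections import Counter


def count_labels(entries, label_field="segment"):
    """Count how many artists fall under each listener label."""
    return Counter(entry.get(label_field, "unknown") for entry in entries)


# Invented sample entries; the real field names in your file may differ.
sample_marquee = [
    {"artistName": "Artist A", "segment": "Previously Active Listeners"},
    {"artistName": "Artist B", "segment": "Previously Active Listeners"},
    {"artistName": "Artist C", "segment": "Currently Active Listeners"},
]

for label, count in count_labels(sample_marquee).most_common():
    print(f"{label}: {count}")
```

A label that never shows up in the output really is absent from the file, rather than just buried somewhere in the list.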
“Payments” contains my credit card info.
“Playlist1” is the first really interesting file. I imported it into Excel (sorry for the small text):
Now we’re getting somewhere. This list contains every song on every playlist I have, including the date added, and other information as shown above. This information would be super useful as I can keep backups of my Spotify playlists that I can always recreate on other services - it might take a little while, but it’s definitely doable. This is the kind of data I want to have my own copy of.
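To turn that file into a backup you can actually work with, the nested playlist data can be flattened into one row per track. The sketch below assumes a layout of a top-level "playlists" list, each playlist holding "items", and each item holding a "track" plus an "addedDate". Those key names are assumptions, not something confirmed here, so compare them with your own file first. It also assumes that non-song items, such as podcast episodes, have no "track" entry.

```python
def flatten_playlists(export):
    """Turn the nested playlist export into one flat row per track."""
    rows = []
    for playlist in export.get("playlists", []):
        for item in playlist.get("items", []):
            # Assumption: non-song items (e.g. episodes) have no track entry.
            track = item.get("track") or {}
            if not track:
                continue
            rows.append({
                "playlist": playlist.get("name"),
                "artist": track.get("artistName"),
                "track": track.get("trackName"),
                "album": track.get("albumName"),
                "added": item.get("addedDate"),
            })
    return rows


# Invented sample in the assumed layout.
sample_export = {
    "playlists": [
        {
            "name": "Road Trip",
            "items": [
                {"track": {"trackName": "Song 1", "artistName": "Artist A",
                           "albumName": "Album X"},
                 "addedDate": "2023-01-01"},
                {"track": None, "addedDate": "2023-01-02"},
            ],
        }
    ]
}

for row in flatten_playlists(sample_export):
    print(row)
```

Each row carries enough to search for the same song on another service, and the list of rows can be written out as a spreadsheet for safekeeping.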
“Search Queries” is a history of my searches in Spotify. Not that useful, since most of the searches are things like “led z” where I would click the suggested result “Led Zeppelin” on the screen before I finish typing the search.
“StreamingHistory0” is my recent streaming history, which again is super useful. I’m still waiting for my extended streaming history, which Spotify says they will send separately, but I assume it will be the same kind of data (details on this file’s contents are above).
“UserAddress” is exactly what it sounds like. “UserData” contains more profile information like my gender, birthday, etc.
Finally, “YourLibrary” contains a list of all the tracks in my “liked” library, which again is great information to have.
Conclusions
The data you get from Spotify is actually pretty useful. It’s the info that feeds their algorithm, which we’ll never get our hands on as it’s so proprietary, but knowing what goes into the algo is useful. This data is also good to have if you want to control your music library - for example, if you change platforms, stop streaming completely, or just, like me, want to analyze the data yourself rather than letting Spotify do it for you.
Do they have other data about us? Undoubtedly. But at least you can download enough to make you less dependent on their platform, if that’s what you choose to do.
Thanks for writing this, Dave! I just created my first-ever Spotify account a few weeks ago, and I'm saving this article in case I end up deleting it.