The annual Noise Pop music festival starts this week, and I purchased a badge this year, which means I get to go to any show that’s a part of the festival without buying a dedicated ticket.
That means I have a lot of choices to make this week! I decided to use data to assess (and validate) some of the harder choices I needed to make, so I built a dashboard, “Who Should I See?” to help me out.
First off, the Wednesday night show. Albert Hammond, Jr. of the Strokes is playing, but more people are talking about the Baths show the same night. Maybe I should go see Baths instead?
If I’m making my decisions purely based on listen count, it’s clear that I’m making the right choice to see Albert Hammond, Jr. It is telling, though, that I’ve listened to Baths more recently than him, which might have contributed to my indecision.
The other night I’m having a tough time deciding about is Saturday night. Beirut is playing, but across the Bay in Oakland. Two other interesting artists are playing closer to home, Bob Mould and River Whyless. I wouldn’t normally care about this so much, but I know my Friday night shows will keep me busy and leave me pretty tired. So which artist should I go see?
It’s pretty clear that I’m making the right choice to go see Beirut, especially given my recent renewed interest thanks to their new album.
I also wanted to be able to consider if I should see a band at all! This isn’t as relevant this week thanks to the Noise Pop badge, but the dashboard currently checks whether the number of listens I have for an artist exceeds a threshold that I calculate from the listens for all artists that I’ve seen live in concert. If it does, I return advice to “Go to the concert!” but if it doesn’t, I recommend “Only if it’s cheap, yo.”
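For anyone who wants to try the same check outside of a dashboard, here is a minimal Python sketch. It assumes the threshold is the average listen count across artists I’ve seen live; the function names and the listen counts are made up for illustration and are not what the dashboard actually runs.

```python
from statistics import mean


def listen_threshold(listens_by_artist, seen_live):
    """Average listen count across the artists already seen live."""
    counts = [listens_by_artist.get(artist, 0) for artist in seen_live]
    return mean(counts) if counts else 0


def concert_advice(artist, listens_by_artist, seen_live):
    """Compare one artist's listen count against the seen-live threshold."""
    threshold = listen_threshold(listens_by_artist, seen_live)
    if listens_by_artist.get(artist, 0) > threshold:
        return "Go to the concert!"
    return "Only if it's cheap, yo."


# Hypothetical listen counts, not real data from my Last.fm history.
listens = {"Artist A": 300, "Artist B": 100, "Artist C": 250, "Artist D": 40}
seen = ["Artist A", "Artist B"]

print(concert_advice("Artist C", listens, seen))
print(concert_advice("Artist D", listens, seen))
```

With these made-up numbers the threshold works out to 200 listens, so Artist C clears it and Artist D does not.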
Because I don’t need to make this decision for Noise Pop artists, I picked a few that I’ve been wanting to see lately: Lane 8, Luttrell, and The Rapture.
While my interest in Lane 8 has spiked recently, there still aren’t enough cumulative listens to put them over the threshold. Same for Luttrell. However, The Rapture has enough to put me over the threshold (likely due to the fact that I’ve been listening to them for over 10 years), so I should go to the concert! I’m going to see The Rapture in May, so I am gleefully obeying my eval statement!
On a more digressive note, it’s clear to me that this evaluation needs some refinement to actually reflect my true concert-going sentiments. Currently, the threshold averages all the listens for all artists that I’ve seen live. It doesn’t restrict that average to consider only the listens that occur before seeing an artist live, which might make it more accurate. That calculation would also be fairly complex, given that it would need to account for artists that I’ve seen multiple times.
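One way that refinement might look, as a rough Python sketch: count only the listens dated before the first time I saw each artist, then average those counts. Using the earliest show is my assumption for handling artists I’ve seen more than once, and the data below is hypothetical.

```python
from datetime import date
from statistics import mean


def pre_concert_threshold(listens, concerts):
    """Average, per artist seen live, of listens dated before the first show."""
    counts = []
    for artist, show_dates in concerts.items():
        first_show = min(show_dates)  # assumption: only the first show matters
        counts.append(
            sum(1 for name, day in listens if name == artist and day < first_show)
        )
    return mean(counts) if counts else 0


# Hypothetical listens as (artist, listen date) pairs.
listens = [
    ("Artist A", date(2017, 1, 5)),
    ("Artist A", date(2017, 3, 1)),
    ("Artist A", date(2018, 6, 1)),
    ("Artist B", date(2016, 2, 2)),
    ("Artist B", date(2019, 1, 1)),
]
# Hypothetical concert dates; Artist A was seen twice.
concerts = {
    "Artist A": [date(2018, 1, 1), date(2017, 6, 1)],
    "Artist B": [date(2018, 5, 5)],
}

print(pre_concert_threshold(listens, concerts))
```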
However, number of listens over time doesn’t alone reflect interest in going to a concert. It might be useful to also consider time spent listening, beyond count of listens for an artist. This is especially relevant when considering electronic music, or DJ sets, because I might only have 4 listen counts for an artist, but if that comprises 8 hours of DJ sets by that artist that I’ve listened to, that is a pretty strong signal that I would likely enjoy seeing that artist perform live.
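To make the DJ set example concrete, here is a small Python sketch that totals listening time instead of counting listens. It assumes each listen comes with a duration in seconds, which is exactly the data I don’t fully have yet, so the artists and durations are invented.

```python
from collections import defaultdict


def hours_by_artist(listens):
    """Sum listening time per artist from (artist, duration_seconds) pairs."""
    totals = defaultdict(float)
    for artist, seconds in listens:
        totals[artist] += seconds / 3600
    return dict(totals)


# Hypothetical: four two-hour DJ sets versus twenty 3.5-minute songs.
listens = [("DJ Example", 2 * 3600)] * 4 + [("Band Example", 210)] * 20
hours = hours_by_artist(listens)

for artist, total in hours.items():
    print(f"{artist}: {total:.1f} hours")
```

By listen count the band wins 20 to 4, but by time spent the DJ is far ahead, which is the signal a count-only threshold misses.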
I thought that I’d need to get direct access to the MusicBrainz database in order to get metadata like that, but it turns out that the Last.fm API makes some available through their track.getInfo endpoint, so I just found a new project! In the meantime I am able to at least calculate duration for tracks that exist in my iTunes library.
I now have a new avenue to explore with this project, collecting that data and refining this calculation. Reach out on Twitter to let me know what you might consider adding to this calculation to craft a data-driven concert-going decision-making dashboard.
This past year has been pretty eventful in music for me. I’ve attended a couple new festivals, seen shows while traveling, and discovered plenty of new bands. I want to examine the data available to me and contrast it with my memories of the past year.
Spotify released its #2018wrapped campaign recently, sharing highlights from the year of my listening data with me (and in an ad campaign, aggregate data from all the users). As someone that uses Spotify but not as my exclusive source of music listening, I was curious to compare the results with my holistic dataset that I’ve compiled in Splunk.
Spotify’s top artists for me were somewhat different from the results that I found from the data I gather from Last.fm and analyze with Splunk software. Spotify and my holistic listening data agree that I listened to Poolside more than anyone else, and was also a big fan of Born Ruffians, but beyond that they differ. This is probably because I buy music, and when I’m mobile I switch my primary listening from Spotify to song files stored on my phone.
In addition, my top 5 songs of the year were completely different from those listed in Spotify. My holistic top 5 songs of the year were all songs that I purchased. I don’t listen to music exclusively in Spotify, and my favorites go beyond what the service can recognize.
Spotify identified that I’ve listened to 30,473 minutes of music, but I can’t make a similarly reliable calculation with my existing data because I don’t have track length data for all the music that I’ve listened to. I can calculate the number of track listens so far this year, and based on that, make an approximation based on the track length data that I do have from my iTunes library. The minute calculation I can make indicates that I’ve so far spent 21,577 minutes listening to 3,878 of the 10,301 total listens I’ve accumulated so far this year (Numbers to change literally as this post is being written).
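If I wanted to stretch that partial number into a full-year guess, the arithmetic could look like the Python sketch below. It assumes the listens without track length data average the same length as the ones in my iTunes library, which is a big assumption, so treat the result as a rough estimate rather than something comparable to Spotify’s count.

```python
def estimate_total_minutes(known_minutes, known_listens, total_listens):
    """Scale the average length of timed listens up to every listen."""
    average_minutes = known_minutes / known_listens
    return average_minutes * total_listens


# Figures from this post: 21,577 minutes across 3,878 timed listens,
# out of 10,301 total listens so far this year.
estimate = estimate_total_minutes(21_577, 3_878, 10_301)
print(f"Roughly {estimate:,.0f} minutes")
```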
I’m similarly lacking data allowing me to determine my top genre of the year, but Indie is a pretty reliable genre for my taste.
Other Insights from 2018
I was able to calculate my Top 10 artists, songs, and albums of the year, and drill down on the top 10 artists to see additional data about them (if it existed) in my iTunes library, like other tracks, the date it was added, as well as the kind of file (helping me identify if it was purchased or not), and the length of the track.
There are quite a few common threads across the top 10 artists, songs, and albums, with Poolside, Young Fathers, Gilligan Moss, The Vaccines, and Justice making consistent appearances. The top 10 songs display obsessions with particular songs that outweigh the aggregate popularity of their albums, so other albums end up as the top albums of the year.
Interestingly, the Polo & Pan album makes my top 10 albums while they don’t make it to my top 10 artist or song lists. This is also true for the album Dancehall by The Blaze. I’m not much of an album listener usually, but I know I listened to those albums several times.
The top 10 song list is more dominated by specific songs that caught my attention, and the top 10 artists neatly reflect both lists. The artists that have a bit more of a back catalog also reveal themselves, given that Born Ruffians managed to crack the top 10 despite not having any songs or albums make the top 10 lists, and Hey Rosetta! makes the top artist and album lists, despite having no top songs.
I purchased 285 songs this year, an increase of 157 compared to the year before. I think I just bought songs more quickly after first hearing them this year, and there are even some songs missing from this list that I bought on Beatport or Bandcamp because they weren’t available in the iTunes Store. While I caved in to Spotify premium this year, I still kept up an old promise to myself to buy music (rather than acquire it without paying for it, from a library or questionable download mechanisms) now that I can afford it.
A Year of Concerts
I’ve been to a lot of concerts so far this year. 48, to be exact. I spent a lot of money on concert tickets, both for the shows I attended this year and for shows that went on sale during 2018 (but at this point, might be happening in 2019). I often will buy tickets for multiple people, so this number isn’t very precise for my own personal ticket usage.
I managed to go to at least 2 concerts every month. By the time the year is over, I’m on track to go to 51 different shows. Based on the statistics, there are some months where I went to many more than 1 show per week, and others where I didn’t. Especially apparent are the months with festivals—February, August, and October all included festivals that I attended.
Many of those festivals brought me to new-to-me locations, with the Noise Pop Block Party and Golden Gate Park giving me new perspectives on familiar places, and Lollapalooza after shows bringing me out to Schubas Tavern for the first time in Chicago.
If you’re reading this wondering what San Francisco Belle is, it’s a boat. That’s one of several new venues that electronic music brought me to—DJ sets on that boat as part of Goldroom and Gigamesh’s tour, plus a day party in Bergerac and a nighttime set at Audio other times throughout the year.
Some of those new venue locations brought newly-discovered music to me as well.
The 20th-most-popular artist I discovered this year was Jenn Champion, who opened for We Were Promised Jetpacks at their show at the Great American Music Hall. I started writing this assuming that I hadn’t heard Jenn Champion before that night, but apparently I first discovered them on July 9, and the show wasn’t until October 9.
As it turns out, I listened to what is now my favorite song by Jenn Champion that day in July, likely as part of a Spotify algorithm-driven playlist (judging by the listening neighbors around the same time) but it didn’t stick until I saw them play live months later. The vagaries of playlists that refresh once a week can mean fleeting discoveries that you don’t really absorb.
Because of how I can search for things in Splunk, I was also curious to see what other songs I heard when I first discovered Hubert Kirchner, a great house artist.
I have really no idea what playlist I was listening to that might have led to me making jumps from Sofi Tukker, to Tanlines, to Dion, to Deradoorian, then to Hubert Kirchner, Miguel, How to Dress Well, Rihanna, Selena Gomez, and Descendents. Given that August 24th was a Friday, my best guess is perhaps that it was a Release Radar playlist, or perhaps an epic shuffle session.
For the top 20 bands I discovered in 2018, many of them I started listening to on Spotify, but not necessarily because of Spotify. Gilligan Moss was a discovery from a collaborative playlist shared with those that are also in a Facebook group about concert-going. I later saw them at one of the festivals I went to this year, and it even turned out that a friend knew one of the band members! Their status as my most-listened-to discovery of this year is very accurate.
Polo & Pan was a discovery from a friend, fully brought to life with a playlist built by Polo & Pan themselves and shared on Spotify. Spent some quality time sitting in a park listening to that playlist and just enjoying life. They were at the same festival as Gilligan Moss, playing the same day, making that day a standout of my concerts this year.
Karizma was a discovery from Jamie xx’s set at Outside Lands. I tracked down the song from the set with the help of several other people on the internet (not necessarily anyone I knew), and then the song from the set wasn’t even on Spotify. (Spotify, however, did help me discover more of the artist’s back catalog, like my other favorite song ‘Nuffin Else.) Apparently I was far behind the curve hearing the song from the set, since it came out in 2017 and was featured in a Chromebook ad, but Work It Out still made me lose my mind at that set. (For the record, so did Take Me Higher, a song I did not manage to track down at all, and I have so much thanks for the person that messaged me on Facebook ages later to send me the link!)
Similarly, Luxxury was a DJ I first spotted on a cruise that I went on because it featured other DJs I had heard of from college, Goldroom and Gigamesh, whom I’d discovered through remixes of songs I downloaded from mp3 blogs like The Burning Ear.
~ Finding Meaning in the Platforms ~
Many of these discoveries were deepened by Spotify, or had Spotify as a vector—through a collaborative playlist, algorithmically-generated one, or the quick back-catalog access for a new artist—but don’t rely on Spotify as a platform. I prefer to keep my music listening habits platform-adjacent.
Spotify, SoundCloud, iTunes, Beatport and other music platforms I use help make my music experiences possible. But the artists making the music, performing live in venues that I have the privilege to live near and afford to visit, they are creating what keeps my mind alive and energized.
The social platforms too, mediate the music-related experiences I’ve had, whether it’s with the people I share music and concert experiences with in a Facebook group, the people I exchange tracks and banter with in Slack channels, or those of you reading this on yet another platform.
I like to listen to music that moves me, physically, or that arrests my mind and takes me somewhere. More now than ever I realize that musical enjoyment for me is an intense instantiation of the continuous tension-and-release pattern that exists in so many human art forms. The waves of neatness that clash and collide in a house music track, or the soaring crescendos of harmonies.
It’s become clear to me over the years that I can’t separate my enjoyment of music from the platforms that bring me closer to it. Perhaps supporting the platforms in addition to the musical artists, performers, and venues, is just another element of contributing to a thriving music scene.
The American Top 40 chart includes more dance songs, more songs performed by DJs, and significantly more white artists than its counterpart, the Billboard charts.
Shit’s racist. I used to listen to the Ryan Seacrest Top 40 driving between Chicago and Michigan because it was one of the few things that I could listen to consistently along that entire drive on just a few radio stations. It wasn’t exactly quality radio, but it kept me awake.
The business meets somewhere at the crossroads of public relations and payola, a tradition as old as the music industry itself, historically used to define the illegal practice of record companies paying for commercial radio airtime. (Under U.S. law and FCC regulations, Payola is illegal on radio, but those laws do not apply to digital streaming platforms.) According to a 2015 Billboard article, a major-label marketing executive confirmed that pay-for-play is (or was) definitely happening. “According to a source, the price can range from $2,000 for a playlist with tens of thousands of fans to $10,000 for the more well-followed playlists.” And many are already calling the platform’s new “Sponsored Songs” endeavor a 2017 incarnation of payola.
I keep thinking I’ll get sick of Spotify thinkpieces but I’m not there yet. This one covers (in part) how Spotify structures their service to prioritize playlists over albums or other artist-created works, instead effectively reinstating payola and creating pay-to-playlists that then earn top billing all throughout the service. Me, I make my own playlists most of the time.
Everyone wants streaming music to be cheap or free for listeners, offer every song ever recorded, be made available on every device, be consistently lucrative for the industry, and give new and established artists robust support for new music. We all want snow that isn’t cold or wet. In principle, everyone is willing to pay, and everyone is willing to compromise, but no one is willing to compromise enough.
Womp womp. This is why for all of my use and support of services like Spotify and SoundCloud, now that I can afford it, I’m trying to buy the music that matters to me when possible. Less likely to disappear that way.
Confronting my own aversion to anger asked me to shift from seeing it simply as an emotion to be felt, and toward understanding it as a tool to be used: part of a well-stocked arsenal.
Leslie Jamison is one of my favorite essayists, and this is no exception.
I wrote two posts about analyzing my personal music data corpus. Reflecting on a decade of (quantified) music listening fits in with the rest of my blog posts about music, taking the personal tack to the quantified side of things. I also wrote up how I did all the analysis for my company blog, 10 Years of Listens: Analyzing My Music Data with Splunk. I’ve done some more analyses since these posts, like building something that lets me review the listening patterns for a specific artist compared with the dates that I’ve seen them in concert, and I’m working on analyzing if there is an average listen threshold before I see a band in concert (or not).
I also wrote about the importance that climbing has had in my life over the last year and a half in Finding Myself on the Wall. Grateful to get back on the wall tomorrow.
I took the time last year to start converting a dormant side project into a blogging series to share the links I’d collected. Calling it Borders on the Web, I post reminders of the borders that do exist on the web, as much as the techno-utopians in the world might like to pretend that they’re going away.
The trend in the last year or so toward more disco vibes has been… unexpectedly awesome. Going to see at least three of these artists live in the next few months… hoping to see more music from Thunder Jackson and Disco Despair soon too.
Some great DJ sets / mixtapes on here too. Seeing the xx live last year was a highlight, almost entirely because of Jamie xx. Realized that’s a show I’d pay more than I’d like to admit to go see if it were just him DJing. Haven’t managed to see Alex Cruz yet, though he’s been in the city a couple times since I’ve been here.
I used my music data to look up my favorite artists that I discovered in 2017. These are the ones that are the memorable favorites, beyond the statistical favorites.
This one is a surprise but a good reminder that small obsessions can make a big difference in overall statistics. I have The Burning Ear to thank for this discovery, and Spotify for entertaining it.
I discovered this artist because they’re touring as the headliner with Gibbz, who I was already familiar with. The groovy vibe of this artist took those tickets from a probable insta-purchase to an actual insta-purchase.
A discovery thanks to The Burning Ear, I discovered Jason Gaffner’s nu-disco grooves around the same time that I got obsessed with some songs by Gibbz (who I must’ve discovered in 2016). I bought this song soon after and am keeping an eye out for new releases.
I heard Alex Cruz for the first time when I was in Greece, listening to a set that my friend started playing. It took me three tries to figure out who she was talking about, and then I discovered a few of his sets that he puts out as the Deep and Sexy Podcast.
I can’t remember if I started listening to Perfume Genius because of Discover Weekly or the Song Exploder podcast, but damn they’re good. My only regret is that I discovered them too late to get tickets to their sold out show.
I don’t remember how I discovered this artist. I think it was an autoplay on SoundCloud after listening to some tracks The Burning Ear had posted? Either way, I fell in love with this remix.
I came across this band on The Burning Ear too. I think they’ll be around for Noise Pop next year so I’ll have to decide if I want to go see them. I’m mostly in love with this song.
He opened for the xx, so I checked out his Spotify page after I found out he was opening for them. Sweet, sweet grooves.
This guy showed up in my Discover Weekly playlist. I really like this song, but didn’t get as into the rest of his songs. Still a damn good song tho.
I enjoyed her song Little Brother so much that I got tickets to see her next year. I’ll be keeping an eye out for new releases from her as well.
Less notable discoveries:
I came across this band on SoundCloud through The Burning Ear again. This song was an easy purchase because it’s so catchy.
This artist showed up on my Discover Weekly playlist. Great for fans of Bon Iver.
This was another The Burning Ear discovery, and an easy purchase!
The Full List
The full list of 35 artists that had more than 10 listens each, first listened to in 2017:
A New Dawn
As Time Was Passing By
In Another Room
Invisible / Amenaza
It’s All Over
It’s All Over – John Talabot’s Stripped Refix
Of My Mind
The Way That You Like
Giving Up The Ghost
Oh Love, How You Break Me Up
Wear Your Demons Out
Feel Something (Garruda Remix)
Losing My Mind
Losing My Mind (3 Monkeyzz Remix)
Murder In The First Degree
Murder In The First Degree (Aristo G Remix)
Phantom (Keljet Remix)
When The Sun Goes Down
(No One Knows Me) Like The Piano
Beneath The Tree
Blood On Me
Take Me Inside
What Shouldn’t I Be?
Pull Me Up
Be Honest (Attom Remix)
Bleed Into The Water
Frustrated – Russ Macklin Remix
Hail the Underdog
In Slow Motion
On the Mountain by the Sea
People of the Future
When People Come Together
Alright (Karl Kling Remix)
Old Chunk of Coal
Sons Of Lightning (Super Duper Remix)
Waves That Rolled You Under (backstroke. Remix)
Cold to the Touch
Cold to the Touch – Nicolaas Remix
This Is Funky
Haunting – Original Mix
Haunting – Radio Edit
Haunting – Sebastien Radio Edit
Haunting – Sebastien Remix
Haunting [ANR063] – Sebastien Remix
Rubberband – Radio Edit
Shoreline – Extended Mix
Sweet Child – Club Mix
Sweet Child – Extended
Sweet Child – Original Mix
Five Hour Winnipeg
The Plural of Moose Is Moose
Crowd Goes Wild
Last Man Standing
Must Be Dreaming
Spinning on Blue
Stars Across the Sky
The Best Part
Body’s In Trouble – Recorded at Spotify Studios NYC
My favorite shows of 2017. Here’s to more great ones in 2018!
October 27, 2017: DJ Aaron Axelson, Lewis Ofman, Yelle
Rickshaw Stop, San Francisco CA
Popscene became my favorite concert sponsor this year, in no small part because of the skills of their DJs. This show surpassed my low expectations to be a great time of dancing and grooving and new music discoveries.
February 23, 2017: Rad Dad, Gibbz
The Hotel Utah Saloon, San Francisco CA
A local band opened for an undersung nu-disco artist, Gibbz. A great way to open up Noise Pop week 2017, and unexpectedly great sound quality for such a small space. Excited to see Gibbz play again next year.
September 19, 2017: NVDES, RAC
The Independent, San Francisco CA
RAC has put on a spectacularly dance-able show every time I’ve seen them. This most recent adventure did not disappoint.
April 16, 2017: Sampha, The XX
Bill Graham Civic Auditorium, San Francisco CA
I would pay Jamie XX to DJ my life, but I can’t afford it. I could afford this show, though. It was incredible. Sampha was great too. Highlight: a mirror that appeared partway through the set that gave the audience a view of Jamie XX’s DJing and his dorky dance moves.
September 13, 2017: The Dirty Nil, Bleached, Against Me!
Regency Ballroom, San Francisco CA
Just as good as they were 10 years ago when I saw them in Chicago, if not better. A restorative and energetic show.
February 4, 2017: Wheatus, Mike Doughty
The Independent, San Francisco CA
Wheatus played old hits and new jams, and Mike Doughty pulled them out to back him as he played a bunch of Soul Coughing songs. I was there more for his solo songs, but the artistry and adventure of his live conducting of the band behind him made for an incredible show that was supremely groove-able.
I recently crossed the 10 year mark of using Last.fm to track what I listen to.
From the first tape I owned (Train’s Drops of Jupiter) to the first CD (Cat Stevens Classics) to the first album I discovered by roaming the stacks at the public library (The Most Serene Republic Underwater Cinematographer) to the college radio station that shaped my adolescent music taste (WONC) to the college radio station that shaped my college experience (WESN), to the shift from tapes, to CDs, (and a radio walkman all the while), to the radio in my car, to SoundCloud and MP3 music blogs, to Grooveshark and later Spotify, with Windows Media Player and later an iTunes music library keeping me company throughout…. It’s been quite a journey.
Some, but not all, of that journey has been captured while using the service Last.fm for the last 10 years. Last.fm “scrobbles” what you listen to as you listen to it, keeping a record of your listening habits and behaviors. I decided to add all this data to Splunk, along with my iTunes library and a list of concerts I’ve attended over the years, to quantify my music listening, acquisition, and attendance habits. Let’s go.
What am I doing?
Before I get any data in, I have to know what questions I’m trying to answer, otherwise I won’t get the right data into Splunk (my data analysis system of choice, because I work there). Even if I get the right data into Splunk, I have to make sure that the right fields are there to do the analysis that I wanted. This helped me prioritize certain scripts over others to retrieve and clean my data (because I can’t code well enough to write my own).
I also made a list of the questions that I wanted to answer with my data, and coded the questions according to the types of data that I would need to answer the questions. Things like:
What percentage of the songs in iTunes have I listened to?
What is my artist distribution over time? Do I listen to more artists now? Different ones overall?
What is my listen count over time?
What genres are my favorite?
How have my top 10 artists shifted year over year?
How do my listening habits shift around a concert? Do I listen to that artist more, or not at all?
What songs did I listen to a lot a few years ago, but not since?
What personal one hit wonders do I have, where I listen to one song by an artist way more than any other of their songs?
What songs do I listen to that are in Spotify but not in iTunes (that I should buy, perhaps)?
How many listens does each service have? Do I have a service bias?
How many songs are in multiple services, implying that I’ve probably bought them?
What’s the lag between the date a song or album was released and my first listen?
What geographic locations are my favorite artists from?
As the list goes on, the questions get more complex and require an increasing number of data sources. So I prioritized what was simplest to start, and started getting data in.
Getting data in…
I knew I wanted as much music data as I could get into the system. However, SoundCloud isn’t providing developer API keys at the moment, and Spotify requires authentication, which is a little bit beyond my skills at the moment. MusicBrainz also has a lot of great data, but has intense rate-limiting so I knew I’d want a strategy to approach that metadata-gathering data source. I was left with three initial data sources: my iTunes library, my own list of concerts I’ve gone to, and my Last.fm account data.
Last.fm provides an endpoint that allows you to get the recent tracks played by a user, which was exactly what I wanted to analyze. I started by building an add-on for Last.fm with the Splunk Add-on Builder to call this REST endpoint. It was hard. When I first tried to do this a year and a half ago, the add-on builder didn’t yet support checkpointing, so I could only pull in data if I was actively listening and Splunk was on. Because I had installed Splunk on a laptop rather than a server in ~ the cloud ~, I was pretty limited in the data I could pull in. I pretty much abandoned the process until checkpointing was supported.
After the add-on builder started supporting checkpointing, I set it up again, but ran into issues. Everything from forgetting to specify the from date in my REST call to JSON path decision-making that meant I was limited in the number of results I could pull back at a time. I deleted the data from the add-on sourcetype many times, triple-checking the results each time before continuing.
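For anyone attempting the same thing by hand, here is a hedged Python sketch of the two pieces that tripped me up: always sending a from timestamp, and moving the checkpoint forward after each pull. The user.getRecentTracks method and its from, limit, page, and format parameters come from the Last.fm API, but the helper names, the sample payload, and the choice to restart one second after the newest scrobble are my own assumptions, and nothing here makes a network call.

```python
API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def recent_tracks_params(user, api_key, checkpoint_uts, page=1):
    """Query parameters for one page of scrobbles newer than the checkpoint."""
    return {
        "method": "user.getrecenttracks",
        "user": user,
        "api_key": api_key,
        "from": checkpoint_uts,  # Unix timestamp; easy to forget
        "limit": 200,
        "page": page,
        "format": "json",
    }


def next_checkpoint(payload, checkpoint_uts):
    """Return one second past the newest scrobble seen, or the old checkpoint."""
    tracks = payload["recenttracks"]["track"]
    if isinstance(tracks, dict):  # defensive: a lone result may not be a list
        tracks = [tracks]
    newest = checkpoint_uts
    for track in tracks:
        if "date" not in track:  # the now-playing entry has no timestamp yet
            continue
        newest = max(newest, int(track["date"]["uts"]) + 1)
    return newest


# Hypothetical, heavily trimmed response used only to exercise the helpers.
sample_payload = {"recenttracks": {"track": [
    {"name": "Now Playing Song", "@attr": {"nowplaying": "true"}},
    {"name": "Newer Song", "date": {"uts": "1500"}},
    {"name": "Older Song", "date": {"uts": "1200"}},
]}}

params = recent_tracks_params("example_user", "EXAMPLE_KEY", 1000)
checkpoint = next_checkpoint(sample_payload, 1000)
print(params["from"], checkpoint)
```

Storing the returned checkpoint somewhere durable between runs is what lets the collection pick up where it left off even if the laptop was off for a while.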
I used a Python script (thanks Reddit) to pull my historical data from Last.fm to add to Splunk, and to fill the gap between this initial backfill and the time it took me to get the add-on working, I used an npm module. When you don’t know how to code, you’re at the mercy of the tools other people have developed. Adding the backfill data to Splunk also meant I had to adjust the MAX_DAYS_AGO default in props.conf, because Splunk doesn’t necessarily expect data from 10+ years ago by default. 2 scripts in 2 languages and 1 add-on builder later, I had a working solution and my Last.fm data in Splunk.
To get the iTunes data in, I used an iTunes to CSV script on GitHub (thanks StackExchange) to convert the library.xml file into CSV. This worked great, but again, it was in a language I don’t know (Ruby) and so I was at the mercy of a kind developer posting scripts on GitHub again. I was limited to whatever fields their script supported. This again only did backfill.
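The same kind of conversion can be sketched in Python with only the standard library, since the iTunes library file is a property list. This is not the Ruby script I used; the field list is my own pick of common track keys, and the sample library is a tiny made-up stand-in for a real file.

```python
import csv
import io
import plistlib

# Assumed subset of the per-track keys found in an iTunes library export.
FIELDS = ["Name", "Artist", "Album", "Kind", "Total Time", "Date Added"]


def library_to_csv(plist_bytes):
    """Convert the Tracks section of an iTunes library plist into CSV text."""
    library = plistlib.loads(plist_bytes)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for track in library.get("Tracks", {}).values():
        # Missing keys become empty cells so every row has the same columns.
        writer.writerow({field: track.get(field, "") for field in FIELDS})
    return out.getvalue()


# Hypothetical one-track library, serialized the way the XML file would be.
sample = plistlib.dumps({
    "Tracks": {
        "101": {
            "Name": "Example Song",
            "Artist": "Example Artist",
            "Total Time": 215000,  # milliseconds
            "Kind": "Purchased AAC audio file",
        }
    }
})

csv_text = library_to_csv(sample)
print(csv_text)
```

For a real library, reading the file in binary mode and passing its bytes to library_to_csv should work the same way, and adding a field is a one-line change to FIELDS.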
I’m still trying to sort out the regex and determine if it’s possible to parse the iTunes Library.xml file in its entirety and add it to Splunk without too much of a headache, and/or get it set up so that I can ad-hoc add new songs added to the library to Splunk without converting the entries some other way. Work in progress, but I’m pretty close to getting that working thanks to help from some regex gurus in the Splunk community.
For the concert data, I added the data I had into the Lookup File Editor app and was up and running. Because of some column header choices I made for how to organize my data, and the fact that I chose to maintain a lookup rather than add the information as events, I was up for some more adventures in search, but this data format made it easy to add new concerts as I attend them.
Answer these questions…with data!
I built a lot of dashboard panels. I wanted to answer the questions I mentioned earlier, along with some others. I was spurred on by my brother recommending a song to me to listen to. I was pretty sure I’d heard the song before, and decided to use data to verify it.
I’d first heard the song he recommended to me, Waiting on the Summer, in March. Hipster credibility: intact. Having this dashboard panel now lets me answer the question “when was the first time I listened to an artist, and which songs did I hear first?” I added a second panel later, to compare the earliest listens with the play counts of songs by the artist. Maybe the first song I’d heard by an artist was the most listened song, but often not.
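Outside of Splunk, the first-listen lookup boils down to keeping the earliest scrobble per artist. Here is a minimal Python sketch, assuming each scrobble is an (artist, track, Unix timestamp) tuple; the sample rows are invented.

```python
def first_listens(scrobbles):
    """Map each artist to the (timestamp, track) of their earliest scrobble."""
    earliest = {}
    for artist, track, uts in scrobbles:
        if artist not in earliest or uts < earliest[artist][0]:
            earliest[artist] = (uts, track)
    return earliest


# Hypothetical scrobbles, deliberately out of order.
scrobbles = [
    ("Artist A", "Second Song", 1_520_000_000),
    ("Artist A", "First Song", 1_510_000_000),
    ("Artist B", "Only Song", 1_530_000_000),
]

firsts = first_listens(scrobbles)
print(firsts["Artist A"])
```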
Another question I wanted to answer was “how many concerts have I been to, and what’s the distribution in my concert attendance?”
It’s pretty fun to look at this chart. I went to a few concerts while I was in high school, but never more than one a month and rarely more than a few per year. The pace picked up while I was in college, especially while I was dating someone that liked going to concerts. A slowdown as I studied abroad and finished college, then it picks up for a year as I get settled in a new town. But after I get settled in a long-term relationship, my concert attendance drops off, to where I’m going to fewer shows than I did in high school. As soon as I’m single again, that shifts dramatically and now I’m going to one or more shows a month. The personal stories and patterns revealed by the data are the fun part for me.
I answered some more questions, especially those that could be answered by fun graphs, such as what states have my concentrated music listens?
It’s easy to tell where I’ve spent most of my life living so far, but again the personal details tell a bigger story. I spent more time in Michigan than I have lived in California so far, but I’ve spent more time single in California so far, thus attending more concerts.
Speaking of California, I also wanted to see what my most-listened-to songs were since moving to California. I used a trellis visualization to split the songs by artist, allowing me to identify artists that were more popular with me than others.
I really liked the CHVRCHES album Every Open Eye, so I have three songs from that album. I also spent some time with a four-song playlist featuring Adele’s song Send My Love (To Your New Lover), Ariana Grande’s Into You, Carly Rae Jepsen’s Run Away With Me, and Ingrid Michaelson’s song Hell No. Somehow two breakup songs and two love songs were the perfect juxtaposition for a great playlist. I liked it enough that all four songs are in this list (though only half of it is visible in this screenshot). That’s another secret behind the data.
I also wanted to do some more analytics on my concert data, and decided to figure out what my favorite venues were. I had some guesses, but wanted to see what the data said.
The Metro is my favorite venue in Chicago, so it’s no surprise that it came in first in the rankings (I also later corrected the data to use its proper name, “Metro,” so that I could drill down from the panel to a Google Maps search for the venue). First Midwest Bank Amphitheatre hosted Warped Tour, which I attended (apparently) 5 times over the years. Since moving to California it seems like I don’t have a favorite venue based on visits alone, but it’s really The Independent, followed by Bill Graham Civic Auditorium, which doesn’t even make this list. Number of visits doesn’t automatically equate to favorite.
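The venue ranking boils down to counting visits after cleaning up names. A minimal Python sketch, assuming each concert is a record with a `venue` field; the alias table, the function name `venue_visits`, and the sample rows are invented for illustration and aren’t my real concert history:

```python
from collections import Counter

# Hypothetical cleanup table mapping variant spellings to the proper venue name.
ALIASES = {"The Metro": "Metro"}


def venue_visits(concerts, aliases=ALIASES):
    """Count visits per venue after normalizing names, most visited first."""
    counts = Counter(aliases.get(c["venue"], c["venue"]) for c in concerts)
    return counts.most_common()


# Invented sample rows.
concerts = [
    {"venue": "The Metro"},
    {"venue": "Metro"},
    {"venue": "The Independent"},
    {"venue": "Metro"},
]

ranking = venue_visits(concerts)
for venue, visits in ranking:
    print(f"{venue}: {visits}")
```

Normalizing before counting matters: without the alias step, “The Metro” and “Metro” would be ranked as two different venues.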
But what does it MEAN?
I could do data analysis like that all day. But what else do I learn by just looking at the data itself?
I can tell that Last.fm didn’t handle the shift to mobile and portable devices very well. It thrives when all of your listening happens on your laptop, and it can grab the scrobbles from your iPod or other device when you plug it into your computer. But as soon as internet-connected devices got popular (and I started using them), my overall scrobbled listens dropped. In addition to devices, the shift from MediaFire-hosted and MegaUpload-hosted free music shared on music blogs to streaming music on sites like Grooveshark and SoundCloud also meant trouble for my data integrity. Last.fm didn’t handle listens on the web then, and only handles them through a fragile extension now.
Distinct songs and artists listened to in Last.fm data.
But that’s not the whole story. I also got a job where I couldn’t listen to music at work, and I wasn’t listening to music at home much either due to other circumstances. Given that the count plummets to near-zero, it’s possible there were also data issues at play. It’s imperfect, but still fascinating.
What else did I learn?
I have a lot of songs in my iTunes library. I haven’t listened to nearly 30% of them. I’ve listened to nearly 25% of them only once. That’s the majority of my music library. If I split that by rating, however, it would get a lot more interesting. Soon.
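For the curious, here’s a rough Python sketch of how those shares could be computed. It assumes the library export is a property list with a top-level `Tracks` dictionary, and that a track without a `Play Count` key has never been played; check both assumptions against your own file. The function names and sample tracks are made up.

```python
import plistlib


def load_tracks(path):
    """Read track records from an iTunes library export.
    Assumes a plist with a top-level "Tracks" dictionary."""
    with open(path, "rb") as fp:
        library = plistlib.load(fp)
    return list(library.get("Tracks", {}).values())


def play_count_shares(tracks):
    """Return the fraction of tracks never played, played once, and played more."""
    total = len(tracks)
    if total == 0:
        return {"unplayed": 0.0, "played_once": 0.0, "played_more": 0.0}
    # Assumption: a missing "Play Count" key means zero plays.
    counts = [t.get("Play Count", 0) for t in tracks]
    unplayed = sum(1 for c in counts if c == 0)
    once = sum(1 for c in counts if c == 1)
    return {
        "unplayed": unplayed / total,
        "played_once": once / total,
        "played_more": (total - unplayed - once) / total,
    }


# Invented sample tracks, not my real library.
sample_tracks = [
    {"Name": "Song One"},
    {"Name": "Song Two", "Play Count": 1},
    {"Name": "Song Three", "Play Count": 5},
    {"Name": "Song Four", "Play Count": 0},
]

shares = play_count_shares(sample_tracks)
print(shares)
```

With a real export, `play_count_shares(load_tracks("Library.xml"))` would give the same breakdown for the whole library, where the file path is whatever you named your export.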
You can’t see the fallout from my own personal Music-ocalypse in this data, because the Library.xml file doesn’t know which songs don’t point to actual files, or at least my version of it doesn’t. I’ll need more high-fidelity data to determine the “actual” size of my library, and perform more analyses.
I need more data in general, and more patience, to perform the analyses to answer the more complex questions I want to answer, like my listening habits of particular artists around a concert. As it is, this is a really exciting start.
Benedict Evans pointed out in a recent newsletter, “there’s a story to be written about Apple feeling its way from a piecemeal legacy technology stack for services, evolved bit by bit from the old iPod music store of a decade ago, to an actual new unified platform, something that it is apparently building.”
I’d argue for a focused set of decoupled applications, rather than a new unified platform. iTunes has bloated beyond practicality. The App Store doesn’t work well for users or developers. Here’s where I think the future of these applications lies.
I used to subscribe to lots of MP3 blogs. I had lots of free time in high school, and listened fervently to the local college radio station (as I’ve mentioned before, in an autobiography through musical devices). Music discovery is now fragmented across services (SoundCloud, Spotify, iTunes, Pandora, the now-defunct Rdio, and even 8tracks), so it’s both harder and easier to find new music. The wizardry of Shazam, too, means getting to find out what song is playing in the bar, store, or on the radio so you can buy it or find it later online.
Librarians are an underused, underpaid, and underestimated legion. And one librarian in particular is frustrated by e-book lending. Not just the fact that libraries have to maintain waitlists for access to a digital file, but also that the barriers to checking out an ebook are unnecessarily high. As she puts it,
“Teaching people about having technology serve them includes helping them learn to assess and evaluate risk for themselves.”
In her view,
“Information workers need to be willing to step up and be more honest about how technology really works and not silently carry water for bad systems. People trust us to tell them the truth.”
That seems like the least that can be expected by library patrons.