Wrapping up 2020: Spotify, SoundCloud, and Last.fm data

Another year, another Spotify Wrapped campaign, another effort to analyze the music data that I collect and compare it to what Spotify produces. This year I have Last.fm listening habit data, concert attendance and ticket purchase data, livestream view activity data, my SoundCloud 2020 Playback playlist, and the tracks on my Spotify top 100 songs of 2020 playlist.

Screenshot of Spotify Wrapped header image, top artists of disclosure, lane 8, kidnap, tourist, and amtrac, top songs of apricots, atlas, idontknow, cappadocia, know your worth, minutes listened of 59,038 and top genre of house.

It’s always important to point out that the Spotify Wrapped campaign only covers the time period from January 1st, 2020 to October 31st, 2020. I discuss the effects of this misleading time period in Communicate the data: How missing data biases data-driven decisions. Of course, because I’m writing this post on December 2nd, nearly the entire month of December is missing from my own analyses. I’ll follow up (on Twitter) about any data insights that change over the next few weeks.

Top Artists of the Year

screenshot of spotify wrapped top artists, content duplicated in surrounding text.

Spotify says my Top 5 artists of the year are: 

  1. Disclosure
  2. Lane 8 
  3. Kidnap
  4. Tourist 
  5. Amtrac

My own data shows some slight permutations.

Screenshot of Splunk table showing top 10 artists in order: tourist with 156 listens, amtrac with 155 listens, booka shade with 147 listens, jacques greene with 134 listens, lane 8 with 129 listens, bicep with 128 listens, kidnap with 114 listens, ben böhmer with 111 listens, cold war kids with 110 listens, and sjowgren with 99 listens

My top 5 artists are nearly the same, but much more influenced by music that I’ve purchased. The overall list instead looks like:

  1. Tourist
  2. Amtrac
  3. Booka Shade
  4. Jacques Greene
  5. Lane 8

For the second year in a row, Tourist is my top artist! Kidnap still makes it into the top 10, as my 7th most-listened-to artist so far of 2020. 

Disclosure, somewhat hilariously, doesn’t even break the top 10 artists if I am relying on Last.fm data instead of only Spotify. What’s going on there? Turns out Disclosure is my 11th-most-listened-to artist, with 97 total listens so far this year. If I dig a little bit deeper and look at Know Your Worth, the Disclosure song that Spotify says I’ve listened to the most in 2020, I can see exactly why this is happening.

Screenshot showing the track_name Know Your Worth listed 5 times, with different artist permutations each time, Khalid, Disclosure & Khalid, Disclosure & Blick Bassy, Khalid & Disclosure, and Khalid, with total listens of 20 for all permutations.

Disclosure’s latest album, ENERGY, includes a number of collaborations. Disclosure is the main artist for most of these tracks, but in some cases (like with Know Your Worth, which came out as a single February 4, 2020) the artist can be inconsistently stored by different services.

As a result, the Last.fm data has a number of different entries for the same track, with differently-listed artists for each one. Last.fm stores only one artist per track, whereas Spotify stores an array of artists for each track. This data structure decision means that Disclosure should have had about 127 total listens, and been my 7th-most-listened-to artist of 2020, instead of 11th. 

This truncated screenshot shows some examples of the permutations of data that exist in my Last.fm data collection, with a total listen count of 127 for Disclosure during 2020. 

Screenshot showing additional permutations of Disclosure artist data, such as Disclosure & slowthai, Disclosure & Common, and Disclosure & Channel Tres.
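
One way to reconcile those permutations is to split each combined artist string and credit every named artist with the listens. This is a minimal sketch in Python, not the Splunk search behind the screenshots. The `credit_artists` name is hypothetical, and every per-row count except the 97 listens quoted above is a placeholder.

```python
import re
from collections import Counter

# Separators that commonly join several artists in one artist field.
# Assumption: "&" and "," always separate artists. That is not true for
# names such as "&lez", so real data needs an allowlist or manual review.
SEPARATORS = re.compile(r"\s*(?:&|,)\s*")

def credit_artists(rows):
    """Sum listens per individual artist from (artist_field, listens) rows."""
    totals = Counter()
    for artist_field, listens in rows:
        # A set avoids double-crediting an artist named twice in one field.
        names = {name for name in SEPARATORS.split(artist_field) if name}
        for name in names:
            totals[name] += listens
    return totals

# Illustrative rows: only the 97 comes from the post, the rest are placeholders.
rows = [
    ("Disclosure", 97),
    ("Disclosure & Khalid", 5),
    ("Khalid & Disclosure", 3),
    ("Disclosure & slowthai", 4),
]
totals = credit_artists(rows)
print(totals["Disclosure"])  # 109
```

The trade-off is that naive splitting fixes collaborations but can mangle legitimate names that contain a separator, so the result still needs a sanity check.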

I had a sneaking suspicion that my Booka Shade listening habits are primarily concentrated on a few songs from an EP that he put out this year, so I dug into how many tracks my total listens for the year were spread across.

Table showing top 10 artists and total listens, with total tracks for each artist as well. Tourist has 62 tracks for 159 listens, Amtrac has 59 tracks for 155 listens, Booka Shade has 64 tracks for 147 listens, Jacques Greene has 33 tracks for 134 listens, Lane 8 has 60 tracks for 129 listens, Bicep has 46 tracks for 128 listens, Kidnap has 35 tracks for 114 listens, Ben Böhmer has 51 tracks for 111 listens, Cold War Kids has 53 tracks for 110 listens, and sjowgren has 15 tracks for 99 listens.

Instead, it turns out that my listens to Booka Shade are actually the most distributed across tracks of all of my top 10 artists. Sjowgren is also an outlier here, because they’ve never released an album, so they only have 15 songs in their overall discography yet still made the top 10 artist listens. 

Returning to my comparison between Spotify and Last.fm data, Amtrac and Lane 8 are in both top 5 lists. This is somewhat expected, because if I look at the top 10 list for artists that I’ve most consistently listened to (artists that I’ve listened to at least once in each month of 2020), both Amtrac and Lane 8 place high in that list. 

Screenshot of a table showing top 10 consistently listened to artists, with Lane 8 being listened to at least once in all 12 months of 2020, Amtrac 11 months, Caribou 11 months, Disclosure 11 months, Elderbrook 11 months, Kidnap 11 months, Kölsch 11 months, Tourist 11 months, Ben Böhmer 10 months, and CamelPhat for 10 months.

Given that only 2 days of December have happened as I write this, it’s unsurprising that I’ve only listened to one artist in every month of 2020. 
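
The consistency measure boils down to counting the distinct calendar months in which an artist has at least one listen. Here is a rough sketch of that idea, assuming scrobbles are available as (artist, date) pairs; the `months_listened` helper and the sample rows are hypothetical and not taken from my actual data.

```python
from collections import defaultdict
from datetime import date

def months_listened(scrobbles, year):
    """Return {artist: number of distinct months in `year` with a listen}."""
    months = defaultdict(set)
    for artist, played_on in scrobbles:
        if played_on.year == year:
            months[artist].add(played_on.month)
    return {artist: len(seen) for artist, seen in months.items()}

# Placeholder scrobbles, not real listening history.
scrobbles = [
    ("Lane 8", date(2020, 1, 3)),
    ("Lane 8", date(2020, 1, 20)),   # same month, counted once
    ("Lane 8", date(2020, 2, 11)),
    ("Amtrac", date(2020, 2, 14)),
    ("Amtrac", date(2019, 12, 31)),  # outside 2020, ignored
]
consistency = months_listened(scrobbles, 2020)
print(consistency)  # {'Lane 8': 2, 'Amtrac': 1}
```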

Top Songs of 2020

Enough about the artists—what about the songs? 

Screenshot of top 5 songs from spotify wrapped, duplicated in surrounding text.

According to Spotify, my Top 5 songs of the year are:

  1. Apricots by Bicep
  2. Atlas by Bicep
  3. Idontknow by Jamie xx
  4. Cappadocia by Ben Böhmer feat. Romain Garcia
  5. Know Your Worth by Disclosure feat. Khalid

That pretty closely matches my top 5 list according to Last.fm, with some notable exceptions.

Screenshot of Splunk table with top 10 songs of last.fm data, Apricots by Bicep with 38 listens, Atlas by Bicep with 32 listens, Idontknow by Jamie xx with 22 listens, White Ferrari (Greene Edit) by Jacques Greene with 21 listens, That Home Extended by The Cinematic Orchestra with 20 listens, Lalala by Y2K and bbno$ with 19 listens, Trish's Song by Hey Rosetta! with 18 listens, Wonderful by Burna Boy with 18 listens, Somewhere feat. Octavian by the Blaze with 17 listens, and Yes, I Know by Daphni with 17 listens.

My top 5 tracks according to Last.fm are:

  1. Apricots by Bicep (38 listens)
  2. Atlas by Bicep (32 listens)
  3. Idontknow by Jamie xx (22 listens)
  4. White Ferrari (Greene Edit) by Jacques Greene (21 listens)
  5. That Home Extended by The Cinematic Orchestra (20 listens)

The first 3 tracks match, though of course Spotify has an incomplete representation of those listens—I have 29 streams of Apricots according to Spotify.

However, since I bought the track almost as soon as it came out, I also have another 9 listens that have happened off of Spotify. There were also some mysterious things happening with Spotify and Last.fm connections around that time, so it’s possible some listens are missing beyond these numbers. 

What’s up with the 4th track on the list, though? Where is that in Spotify’s data? It’s actually a bootleg remix of the Frank Ocean song White Ferrari that Jacques Greene shared on SoundCloud and as a free download earlier this year, so it isn’t anywhere on Spotify. It did, however, make it onto my top tracks of 2020 on SoundCloud:

Screenshot of top 13 tracks in SoundCloud, with Jacques Greene - White Ferrari (JG Edit) listed as the 11th track.

Once again, this is a spot where metadata intrudes and leads to some inconsistent counts. If I look at all the permutations of White Ferrari and Jacques Greene in my data for 2020, the total number of listens should actually be a bit higher, at 23 total listens:

Screenshot of Splunk table showing the two permutations of the Jacques Greene remix, with 21 listens for the Greene Edit version and 2 listens for the JG Edit version, for a total of 23.

This would actually make it my 3rd-most popular song of 2020 so far, and I’m listening to it as I write this paragraph, so let’s go ahead and call that total number 24 listens. 

The 5th-most popular song and 7th-most popular song of 2020 make the case that I haven’t been sleeping very well this year (though I recall these tracks also showed up in 2019…), because those 2 tracks make up my “Insomnia” playlist that I use to help me fall asleep on nights when I’ve been, perhaps, staying up too late doing data analysis like this. 

You can see the influence of consistent listening habits with top artist behaviors when you look at the top 10 songs that I’ve consistently listened to throughout 2020, with 2 songs by Kidnap, one by Bicep, and another by Amtrac.

Table of tracks listened to consistently in 2020, Never Come Back by Caribou listened to at least once in 8 months of 2020, Start Again by Kidnap with 8 months, Accountable by Amtrac with 7 months, Atlas by Bicep with 7 months, Calling out by Sophie Lloyd with 7 months, Made to Stray by Mount Kimbie for 7 months, Moments (Ben Böhmer Remix) by Kidnap with 7 months, Somewhere feat. Octavian by the Blaze with 7 months, The Promise by David Spinelli with 7 months, and Without You My Life Would Be Boring by The Knife with 7 months.

To me, though, this table mostly underscores how much music discovery this year involved. I didn’t return to the same songs month after month during 2020. Likely as a result of all the DJ sets I’ve been streaming (as I mentioned in my post about Listening to Music while Sheltering in Place) this has been quite a year for music discovery, and breadth of listening habits. 

My top 10 songs of 2020 had a total of 222 listens across them. However, I have a total of 14,336 listens for the entire year, spread across 8,118 unique songs in total.
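
To put that spread in numbers, here is a quick calculation using only figures already quoted in this post: the ten listen counts from the Last.fm top 10 table and the two yearly totals. The variable names are just for illustration.

```python
# Listen counts for the top 10 tracks, as listed in the Last.fm table above.
top_10_listens = [38, 32, 22, 21, 20, 19, 18, 18, 17, 17]
total_listens = 14_336
unique_tracks = 8_118

top_10_total = sum(top_10_listens)
top_10_share = top_10_total / total_listens
listens_per_track = total_listens / unique_tracks

print(top_10_total)                 # 222
print(f"{top_10_share:.2%}")        # 1.55%
print(f"{listens_per_track:.2f}")   # 1.77
```

In other words, my ten most-played songs account for less than 2% of the year’s listens, and the average track got fewer than two plays.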

Even with possible metadata issues, that’s still quite the distribution of behavior. Let’s dig a bit deeper into artist discovery this year. 

Artist Discovery in 2020

In my post earlier this year about my listening behavior while sheltering in place, I discovered that my artist discovery numbers in 2020 seemed to be way up compared with 2018 and 2019, but weren’t actually that far off from 2017 numbers. 

What I see when comparing my 2020 artist discovery statistics from my Last.fm data and my Spotify data is even more interesting. In contrast to what seemed to be true in last year’s post, Wrapping up the year and the decade in music: Spotify vs my data, Spotify’s number is much higher than the one I calculated this year. (For what it’s worth, last year’s number should have been 1074 artists discovered instead of 2857. Data analysis is difficult.) 

According to Spotify, I discovered 2,051 new artists, whereas my Last.fm data claims that I only discovered 1,497 artists this year. 

Similarly, Spotify claims that I listened to 4,179 artists this year, whereas my Last.fm data indicates that I listened to 3,715 artists. 

Again, this comes down to data structures and how the artist metadata is stored for each service. I wrote about the importance of quality metadata for digital streaming providers earlier this year in Why the quality of audio analysis metadatasets matters for music, but it’s also apparent that the data structures for those metadatasets are just as important for crafting data insights of varying value. 

Because Spotify stores all artists that contributed to a track as an array, I can listen to a track with 4 contributing artists, 1 of whom I’ve listened to before, and according to Spotify I’ve now discovered 3 artists and listened to 4. According to Last.fm, I’ll have listened to either 1 artist that I’ve already heard before or 1 new artist, possibly called “Luciano & David Morales”. 

Screenshot of two artist names, Luciano, and Luciano & David Morales.
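
The difference between the two data models can be sketched in a few lines. This is a simplified illustration, not how either service actually computes its statistics; the function names and the “Artist C” and “Artist D” placeholders are hypothetical.

```python
def discovered_with_artist_array(track_artists, known_artists):
    """Array model: every credited artist is checked individually."""
    return [a for a in track_artists if a not in known_artists]

def discovered_with_single_field(artist_field, known_artists):
    """Single-field model: the whole string is treated as one artist."""
    return [] if artist_field in known_artists else [artist_field]

known = {"Luciano"}

# Array model: a four-artist track where only Luciano was heard before.
array_new = discovered_with_artist_array(
    ["Luciano", "David Morales", "Artist C", "Artist D"], known
)

# Single-field model: the combined string has never been seen, so it counts
# as one brand-new "artist" and Luciano gets no credit for the listen.
single_new = discovered_with_single_field("Luciano & David Morales", known)

print(len(array_new), single_new)  # 3 ['Luciano & David Morales']
```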

Spotify would store the second artist as Luciano, David Morales, thus allowing a more accurate count of listens for the Luciano artist. Similarly, my artist discovery data includes some flawed data, such as YouTube videos that got incorrectly recorded.

Screenshot of 3 artist names in my data, Billie Joe Armstrong of Green Day, Billy Joel and Jimmy Fallon Form 2, and Biosphere.

The Billy Joel and Jimmy Fallon duet of The Lion Sleeps Tonight never gets old, but it appears the original video is no longer on YouTube so I’m not going to link it.

Screenshot of two artist names in my data, &lez and 'Coming of age ceremony' Dance cover by Jimin and Jung Kook.

This becomes clear in my top 20 artist discoveries of 2020 chart, where BTS and Big Hit Labels are listed separately, although they are both indicative of one of my best friends joining BTS ARMY this year and sharing her enthusiasm with me. 

Giant table of top 20 artists discovered in 2020, in order with first_discovered date last:
Re.You with 85 listens starting July 12, 2020
Elliot Adamson, 75 listens, April 15 2020
Fennec, 53 listens, March 24 2020
Southern Shores, 52 listens, November 19 2020
Eelke Kleijn, 45 listens, August 10 2020
Christian Löffler, 43 listens, April 2 2020
Icarus, 35 listens, April 2 2020
Monkey Safari, 35 listens, April 15 2020
Black Motion, 34 listens, April 30 2020
BTS, 31 listens, September 29 2020
Bronson, 31 listens, May 9 2020
Love Regenerator, 30 listens, March 30 2020
Eltonnick, 29 listens, April 27 2020
Jerro, 27 listens, April 29 2020
Theo Kottis, 27 listens, June 16 2020
Dennis Cruz, 26 listens, June 22 2020
Da Capo, 25 listens, May 10 2020
Big Hit Labels, 21 listens, June 30 2020
HYENAH, 20 listens, June 4 2020
KC Lights, 20 listens, September 22 2020

Ultimately I’m grateful that the top 20 artists of 2020 are all artists that I discovered during the pandemic and have excellent songs that I love and continue to listen to. Many of the sparklines that represent my listening activity for these artists throughout the year have spikes, but mostly my listening patterns indicate that I’ve been returning to these artists and their songs multiple times after first discovery. Some notable favorites on this list are KC Lights’ track Girl and Dennis Cruz’s track El Sueño, plus the entire Fennec album Free Us Of This Feeling.

Genre Discovery in 2020

The most-commented-on data insight from #wrapped2020 is probably the genre discovery slide.

According to Spotify, I listened to 801 genres this year, including 294 new ones. I’m not even sure I could name 30 genres, let alone 300 or 800. Where are these numbers coming from? 

It turns out that, much like storing artist data as an array for each song, Spotify stores genre data as an array for each artist. This means that each artist can be assigned multiple genres, thus successfully inflating the number of genres that you’ve listened to in 2020. 

For example, if I use Spotify’s API developer console to retrieve the artist information for Tourist, with a Spotify ID of 2ABBMkcUeM9hdpimo86mo6, it turns out that he has 6 total genres associated with him in Spotify’s database: chillwave, electronica, indie soul, shimmer pop, tropical house, and vapor soul. 

Screenshot of JSON response from Spotify API call, content duplicated in surrounding text.
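
To see how a per-artist genre list inflates the count, consider this small sketch. The dictionaries loosely mirror an artist object with a `genres` list; Tourist’s six genres are the ones listed above, while the second artist and its genres are placeholders added for illustration.

```python
# Tourist's genres come from the API response described above.
# "Example Artist" and its genres are made up for this sketch.
artists = [
    {"name": "Tourist", "genres": ["chillwave", "electronica", "indie soul",
                                   "shimmer pop", "tropical house", "vapor soul"]},
    {"name": "Example Artist", "genres": ["electronica", "house"]},
]

genres_heard = set()
for artist in artists:
    # Every genre attached to an artist counts as "listened to".
    genres_heard.update(artist["genres"])

print(len(artists), len(genres_heard))  # 2 7
```

Two artists already yield seven genres, so a few thousand artists reaching 801 genres is not surprising.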

I could start discussing the possible meaninglessness of genres as a descriptive tool, the lack of validation possible for such a signifier, the lack of clarity about how these genres were defined and also assigned to specific artists, but that’s best for another blog post.

Instead, let’s look at what little genre data I do have available to me more generally. 

According to Spotify, my top genres were:

  1. House
  2. Electronica
  3. Pop
  4. Afro House
  5. Organic House

All of these make sense to me, except for Organic House, because I don’t know what makes house music organic, unless it’s also grass-fed, locally-sourced, and free range. Perhaps Blond:ish is organic house. 

I don’t have any genre data from Last.fm, since the service only stores user-defined tags for each artist, and those are not included in the data that I collect from Last.fm today. Instead, I have the genres assigned by iTunes for the tracks that I’ve purchased from the iTunes store. 

The top 8 genres of music that I added to my iTunes library in 2020 by purchasing tracks from the iTunes store are:

  1. Dance (124 songs)
  2. Electronic (121 songs)
  3. House (78 songs)
  4. Pop (37 songs)
  5. Alternative (27 songs)
  6. Electronica (12 songs)
  7. Deep House (10 songs)
  8. Melodic House & Techno (9 songs)

Clearly, this is a very selective sample, and is only tied to select purchasing habits, which are roughly correlated to my listening habits.

I shared all of this genre data to essentially look at it and go “wow, that wasn’t very insightful at all”. Let’s move on. 

Time Spent Listening to Music in 2020

The last metric I want to unpack from Spotify’s #wrapped2020 campaign is the minutes listened data insight. According to Spotify, I spent 59,038 minutes listening to music this year. 

According to my own calculations, I spent roughly 81,134 minutes listening to music in 2020.

Let’s talk about how both of these metrics are super flawed!

Spotify counts a song as streamed after you listen to it for more than 30 seconds (per their Spotify for Artists FAQ), so it’s logical to assume that this minutes listened metric likely comes from a calculation of “number of streams for a track” x “length of track”, then rounded and converted to minutes. It could even result from a different type of calculation, “number of total streams” x “average length of track in Spotify library”, but I have no way of knowing if either of these is accurate besides tweeting at Spotify and hoping they’ll pay attention to me. 
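
The two guesses can be written out side by side. To be clear, both are my speculation about how such a metric could be computed, not Spotify’s documented method, and the function names and sample numbers are placeholders.

```python
def minutes_from_track_lengths(streams):
    """Guess 1: sum of (stream count x track length) over every track.

    `streams` is a list of (stream_count, track_length_seconds) pairs.
    """
    total_seconds = sum(count * length for count, length in streams)
    return round(total_seconds / 60)

def minutes_from_average_length(total_streams, average_length_seconds):
    """Guess 2: total stream count x one average track length."""
    return round(total_streams * average_length_seconds / 60)

# Placeholder inputs: two tracks with made-up lengths in seconds.
per_track = minutes_from_track_lengths([(29, 240), (10, 180)])
averaged = minutes_from_average_length(39, 200)

print(per_track, averaged)  # 146 130
```

Even on the same 39 streams the two approaches disagree, which is part of why the metric is hard to verify from the outside.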

Unfortunately for all of us, but mostly me, my own minutes listened metric is just as lazily calculated. I don’t have track length data for all the tracks that I listen to and I don’t know at what point Last.fm counts a track as being worthy of a scrobble. I do have a list of how much time I spent listening to livestreamed DJ sets online, and I do have some excellent estimation skills. I calculated my number of 81,134 minutes so far in 2020 by calculating and assuming the following:

  • An average track length of 4 minutes
  • An average concert length of 3 hours
  • An average DJ set length of 4 hours
  • An average festival length of 8 hours

Using those averages and estimates, I calculated the total amount of time I spent listening to music across Last.fm listening habits, concerts and DJ sets attended (no festivals this year), and livestreams that I watched online, thus arriving at 81,134 minutes. That doesn’t count any DJ sets that I listened to on SoundCloud, and certainly the combination of a 4 minute track length estimate with the uncertainty of what qualifies a track as being scrobbled makes this data insight somewhat meaningless.
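
The estimate above is simple enough to express as a function. This is a sketch of the approach using the four averages listed; the `estimate_minutes` name and its parameters are hypothetical, and the post doesn’t break out the event and livestream inputs, so only the track component is shown with a real number.

```python
TRACK_MINUTES = 4            # assumed average track length
CONCERT_MINUTES = 3 * 60     # assumed average concert length
DJ_SET_MINUTES = 4 * 60      # assumed average DJ set length
FESTIVAL_MINUTES = 8 * 60    # assumed average festival length

def estimate_minutes(track_listens, concerts=0, dj_sets=0, festivals=0,
                     livestream_minutes=0):
    """Estimate total listening time in minutes from counts and averages."""
    return (track_listens * TRACK_MINUTES
            + concerts * CONCERT_MINUTES
            + dj_sets * DJ_SET_MINUTES
            + festivals * FESTIVAL_MINUTES
            + livestream_minutes)

# Track listens only, using the 14,336 listens quoted earlier in this post.
tracks_only = estimate_minutes(14_336)
print(tracks_only)  # 57344
```

If those 14,336 listens are the track count that went into the estimate, tracks alone would account for 57,344 of the 81,134 minutes, with the remainder coming from in-person events and livestreams.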

Regardless, let’s compare this estimated time spent listening in minutes against the total number of minutes in a year.

Total minutes listened (81,134) as a gauge compared with total minutes in a year (525,600)
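
As a rough check of what that gauge represents, the share of the year works out as follows. The 525,600 figure is the one used in the gauge; 2020 was a leap year, so the second line uses 366 days, which barely changes the result.

```python
minutes_listened = 81_134
minutes_in_common_year = 365 * 24 * 60   # 525,600, the figure used in the gauge
minutes_in_leap_year = 366 * 24 * 60     # 527,040, since 2020 had 366 days

print(f"{minutes_listened / minutes_in_common_year:.1%}")  # 15.4%
print(f"{minutes_listened / minutes_in_leap_year:.1%}")    # 15.4%
```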

Beautiful. I still remembered to sleep this year. No matter which dataset I use, however, it’s clear that I’ve listened to more music in 2020 than in 2019. Spotify’s metric for this same time period in 2019 was 35,496 minutes. The less-flawed but less-complete metric I used last year, calculated using the track length stored in iTunes multiplied by the number of listens for that track, indicated that I spent 14,296 minutes listening to music in 2019. 

As one final Spotify examination, let’s dig into the Spotify Top 100 playlist.

Top 100 Songs of 2020 Playlist

Alongside the fancy graphics and data insights in the #wrapped2020 campaign, Spotify also creates a 100 song playlist, likely (but not definitively) the top 100 songs of the time period between January 1st, 2020 and October 31st, 2020. 

I found my playlist this year to be relatively accurate, perhaps because I spent more time listening to Spotify than I might have in previous years, or perhaps they made some internal data improvements, or both! I often spend more time listening to SoundCloud if I’m traveling a lot, listening to offline DJ sets on plane flights; or listening to Apple Music on my iPhone, with songs that I’ve added from my iTunes library. Without much time spent commuting or traveling this year, it’s likely that my listening habits remained fairly consolidated. 

Similarly to what I discovered about my top 10 tracks, I had relatively distributed music interests this year. The 811 total listens for all 100 songs in my Spotify playlist represent just under 6% (811 of 14,336) of my total listens in 2020 so far. 

Despite my overall listening habits being relatively distributed across lots of artists and songs, the Top Songs playlist is somewhat more consolidated, with 69 artists performing the 100 songs on the playlist. Nice. 

It’s clear that I spent most of this year exploring and discovering new artists, given that 83 of my top songs of 2020 according to Spotify were songs that I discovered in 2020. 

Thanks for coming on this journey through my music data with me. I’ll be back at the actual end of the year to dive deeper into my top 10 artists of the year, top 10 consistent artists of the year, my music purchasing activity, as well as some more livestream and concert statistics to round out my 2020 year in music. 

Listening to Music while Sheltering in Place

The world is, to varying degrees, sheltering-in-place during this global coronavirus pandemic. Starting in March, the pandemic started to affect me personally: 

  • I started working from home on March 6th. 
  • Governor Gavin Newsom announced on March 11 that any gatherings over 250 people were strongly discouraged, effectively cancelling all concerts for the month of March. 
  • On March 16th, the mayor of San Francisco, along with several other counties in the area, announced a shelter-in-place order. 

Ever since then, I’ve been at home. Given all these changes in my life, I was curious what new patterns I might see in my music listening habits. 

With large gatherings prohibited, I went to my last concert on March 7th. With gatherings increasingly cancelled nationwide, and touring musicians postponing and cancelling events, Beatport hosted the first livestream festival, “ReConnect. A Global Music Series”, on March 27th. Many more followed. 

Industry-wide studies and data analysis have attempted to unpack various trends in the pandemic’s influence on the music industry. Analytics startup Chartmetric is digging into genre-based listening and geographical listening habits, and Billboard and Nielsen are conducting a periodic entertainment tracker survey.

Because I’m me, and I have so much data about my music listening patterns, I wanted to explore what trends might be emerging in my personal habits. I analyzed the months March, April, and May during 2020, and in some cases compared that period against the same period in 2019, 2018, and 2017. The screenshots of data visualizations in this blog post represent data points from May 15th, so it is an incomplete analysis and comparison, given that May in 2020 is not yet complete. 

Looking at my listening habits during this time period, with key dates highlighted, it’s clear that the very beginning of the crisis didn’t have much of an effect on my listening behavior. However, after the shelter-in-place order, the amount of time I spent listening to music increased. After that increase it’s remained fairly steady.

Screenshot of an area chart depicting listening duration ranging from 100 minutes with a couple spikes of 500 minutes but hovering around a max of 250 minutes per day for much of January and February, then starting in March a new range from about 250 to 450 minutes per day, with a couple outliers of nearly 700 minutes of listening activity, and a couple outliers with only 90 minutes of listening activity.

Key dates such as the first case in the United States, the first case in California, and the first case in the Bay Area are highlighted along with other pandemic-relevant dates.

Listening behavior during March, April, and May over time

When I started my analysis, I looked at my basic listening count from traditional music listening sources. I use Last.fm to scrobble my listening behavior in iTunes, Spotify, and the web from sites like YouTube, SoundCloud, Bandcamp, Hype Machine, and more. 

Chart depicting 2700 total listens for 2017, 2000 total listens for 2018, and 2300 total listens for 2019 during March, April, and May, compared to 3000 total listens in that same period in 2020.

If you just look at 2018 to 2020, it seems like my listening habits are trending upward, maybe with a culmination in 2020. But comparing against 2017, it isn’t much of a difference. I listened to 25% fewer tracks in 2018 compared with 2017, 19% more tracks in 2019 compared with 2018, and 25% more tracks in 2020 compared with 2019. 
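
Chaining those three year-over-year changes shows why 2020 isn’t far off from 2017. This is a small worked calculation using the rounded percentages above, so the result is approximate; the `chain_changes` helper is just for illustration.

```python
def chain_changes(percent_changes):
    """Compound a sequence of percent changes into one net percent change."""
    factor = 1.0
    for pct in percent_changes:
        factor *= 1 + pct / 100
    return (factor - 1) * 100

# 2017 -> 2018: -25%, 2018 -> 2019: +19%, 2019 -> 2020: +25%
net = chain_changes([-25, 19, 25])
print(f"{net:.1f}%")  # 11.6%
```

A 25% drop followed by a 25% gain does not cancel out, which is why three fairly large swings net out to only about 12% more tracks than in 2017.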

Chart depicting total weekday listens during March, April, and May during 2017, 2018, 2019, and 2020 with total weekend listens during the same time. 2017 shows roughly 2400 listens on weekdays and 200ish on weekends, 2000 weekday listens vs 100 weekend listens for 2018, 2100 weekday listens vs 300 weekend listens in 2019, and 2500 weekday listens vs 200 weekend listens in 2020.

If I break that down by when I was listening by comparing my weekend and weekday listening habits from the previous 3 years to now, there’s still perhaps a bit of an increase, but nothing much. 
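
The weekday versus weekend split is a simple bucketing of each listen by its date. Here is a minimal sketch, assuming each listen carries a date; the `split_by_day_type` name is hypothetical and the three sample dates are just convenient examples, not real scrobbles.

```python
from collections import Counter
from datetime import date

def split_by_day_type(listen_dates):
    """Count listens that fall on weekdays versus weekends."""
    counts = Counter()
    for day in listen_dates:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        counts["weekend" if day.weekday() >= 5 else "weekday"] += 1
    return counts

# March 6, 2020 was a Friday, March 7 a Saturday, and March 16 a Monday.
sample = [date(2020, 3, 6), date(2020, 3, 7), date(2020, 3, 16)]
split = split_by_day_type(sample)
print(split["weekday"], split["weekend"])  # 2 1
```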

With just the data points from Last.fm, there aren’t really any notable patterns. But number of tracks listened to on Spotify, SoundCloud, YouTube, or iTunes provides an incomplete perspective of my listening habits. If I expand the data I’m analyzing to include other types of listening—concerts attended and livestreams watched—and change the data point that I’m analyzing to the amount of time that I spend listening, instead of the number of tracks that I’ve listened to, it gets a bit more interesting. 

Chart shows roughly 12000 minutes spent listening in 2017, 10000 in 2018, 12000 in 2019, and 22000 in 2020.

While the number of tracks I listened to from 2019 to 2020 increased only 25%, the amount of time I spent listening to music increased by 74%, a full 150 hours more than the previous year during this time period. And May isn’t even over yet! 
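
The 74% and 150-hour figures can be cross-checked against the chart by backing out the totals they imply. Both inputs are rounded, so treat the outputs as approximate.

```python
extra_minutes = 150 * 60   # 150 additional hours, in minutes
increase = 0.74            # 74% increase over 2019

# If extra = baseline * increase, then baseline = extra / increase.
baseline_2019 = extra_minutes / increase
total_2020 = baseline_2019 + extra_minutes

print(round(baseline_2019), round(total_2020))  # 12162 21162
```

Those implied totals line up with the roughly 12000 and 22000 minutes shown in the chart for 2019 and 2020.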

It’s worth briefly noting that I’m estimating, rather than directly calculating, the amount of time spent listening to music tracks and attending live music events. To make this calculation, I’m using an estimate of 3 hours for each concert attended, 4 hours for each DJ set attended, 8 hours for each festival attended, and an estimate of 4 minutes for each track listened to, based on the average of all the tracks I’ve purchased over the past two years. Livestreamed sets are easier to track, but some of those are estimates as well because I didn’t start keeping track until the end of April.

I spent an extra 150 hours listening to music this year during this time—but when was I spending this time listening? If I break down the amount of time I spent listening by weekend compared with weekdays, it’s obvious:

Chart depicts 10000 weekday minutes and 5000 weekend minutes spent listening in 2017, 9500 weekday minutes and 4500 weekend minutes in 2018, 14000 weekday minutes and 2000 weekend minutes in 2019, and 12000 weekday minutes and 13000 weekend minutes in 2020

Before shelter-in-place, I’d spend most of my weekends outside, hanging out with friends, or attending concerts, DJ sets, and the occasional day party. Now that I’m spending my weekends largely inside and at home, coupled with the number of livestreaming festivals, I’m spending much more of that time listening to music. 

I was curious if perhaps working from home might reveal new weekday listening habits too, but the pattern remains fairly consistent. I also haven’t worked from home for an extended period before, so I don’t have a baseline to compare it with. 

It’s clear that weekends are when I’m doing most of my new listening, and that this new listening likely isn’t coming from my traditional listening habits. If I split the amount of time that I spend listening to music by the type of listening that I’m doing, the source of the added time spent listening is clear.

Depicts 11000 minutes of track listens and 1000 minutes of time spent at concerts in 2017, 8000 minutes spent listening to music tracks and 2000 minutes spent at concerts in 2018, 10000 minutes spent listening to music tracks and 3000 minutes spent at concerts in 2019, and 12000 minutes spent listening to music tracks and 9000 minutes listening to livestreams, with a sliver of 120 minutes spent at a single concert in 2020

Hello, livestreams. If you look closely you can also spy the sliver of a concert that I attended on March 7th.

Livestreams dominate, and so does Shazam

All of the livestreams I’ve been watching have primarily been DJ sets. Ordinarily, when I’m at a DJ set, I spend a good amount of time Shazamming the tracks I’m hearing. I want to identify the tracks that I’m enjoying so much on the dancefloor so I can track them down, buy them, and dig into the back catalog of those artists. 

So I requested my Shazam data to see what’s happening now that I’m home, with unlimited, shameless, and convenient access to Shazam.

For the time period that I have Shazam data for, the correlation of Shazam activity to number of livestreams watched is fairly consistent at roughly 10 successful Shazams per livestream.  

Chart details largely duplicated in surrounding text, but of note is a spike of 6 livestreams with only 30 or so songs shazammed, while the next few weeks show a fairly tight interlock of shazam activity with number of livestreams
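
The ratio behind that observation is Shazams divided by livestreams for each week. In this sketch, the first week mirrors the spike described above (6 livestreams, about 30 Shazams); the other weeks are placeholder values, and the `shazams_per_livestream` name is hypothetical.

```python
def shazams_per_livestream(weeks):
    """Return {week: Shazams per livestream}, or None for weeks without livestreams."""
    ratios = {}
    for week, (livestreams, shazams) in weeks.items():
        ratios[week] = shazams / livestreams if livestreams else None
    return ratios

weeks = {
    "spike week": (6, 30),   # from the chart description
    "week A": (4, 40),       # placeholder
    "week B": (0, 0),        # placeholder: no livestreams watched
}
ratios = shazams_per_livestream(weeks)
print(ratios)  # {'spike week': 5.0, 'week A': 10.0, 'week B': None}
```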

Given the correlation of Shazam data, as well as the continued focus on watching DJ sets, I wanted to explore my artist discovery statistics as well. Especially when it seemed like my listening activity hadn’t shifted much, I was betting that my artist discovery statistics have been increasing during this time. If I look at just the past few years, there seems to be a direct increase during this time period. 

Chart depicts 260ish artists discovered in March, April, and May of 2018, 280 discovered in 2019, and 360 discovered in 2020. Second chart shows the same data but adds 2017, with 390 artists discovered.

However, after I add 2017 into the list as well, the pattern doesn’t seem like much of a pattern at all. Perhaps by the end of May, there will be a correlation or an outsized increase. But at least for now, the added number of livestreams I’ve been watching doesn’t seem to be producing an equivalently high number of artist discoveries, even though they’re elevated compared with the last two years. 

It could also be that the artists I’m discovering in the livestreams haven’t yet had a substantial effect on my non-livestream listening patterns, even if there’s 91 hours of music (and counting) in my quarandjed playlist where I store the tracks that catch my ear in a quarantine DJ set. Adding music to a playlist, of course, is not the same thing as listening to it. 

Livestreaming as concert replacement?

Shelter-in-place brought with it a slew of event cancellations and postponements. My live events calendar was severely affected. As of now, 15 concerts have been affected in the following ways:

Chart depicts 6 concerts cancelled and 9 postponed

The amount of time that I spend at concerts compared with watching livestreams is also starkly different.

Chart depicts 1000 minutes spent at concerts in 2017, 2000 minutes at concerts in 2018, 2500 minutes at concerts in 2019, and 8000 minutes spent watching livestreams, with a topper of 120 minutes at a concert in 2020

I’ve spent 151 hours (and counting) watching livestreams, the rough equivalent of 50 concerts—my entire concert attendance of last year. This is almost certainly because I’m often listening to livestreams, rather than watching them happen.

Concerts require dedication—a period of time where you can’t really do anything else, a monetary investment, and travel to and from the show. Livestreams don’t have any of that, save a voluntary donation. That makes it easier to turn on a stream while I’m doing other things. While listening to a livestream, I often avoid engaging with the streaming experience. Unless the chat is a cozy few hundred folks at most, it’s a tire fire of trolls and not a pleasant experience. That, coupled with the fact that sitting on my couch watching a screen is inherently less engaging than standing in a club with music and people surrounding me, means that I’m often multitasking while livestreams are happening.

The attraction for me is that these streams are live, and they’re an event to tune into, and if you don’t, you might miss it. Because it’s live, you have the opportunity to create a shared collective experience. The chatrooms that accompany live video streams on YouTube, Twitch, and especially Facebook’s Watch Party feature for Facebook Live videos, are what foster this shared experience. For me, it’s about that experience, so much so that I started a chat thread for Jamie xx’s 2020 Essential Mix so that my friends and I could experience and react to the set live. This personal experience is contrary to the conclusion drawn in this article on Hypebot called Our Music Consumption Habits Are Changing, But Will They Remain That Way? by Bobby Owsinski: “Given the choice, people would rather watch something than just listen.” Given the choice, I’d rather have a shared collective experience with music than just sit alone on my couch and listen to it.

Of course, with shelter-in-place, I haven’t been given a choice between attending concerts and watching livestreamed shows. It’s clear that without a choice, I’ll take whatever approximation of live music I can find.

What it takes to get to a concert

Ticket buying in the modern era is pretty brutal. You find out your favorite artist is coming to town, and with any luck, you discover this before the tickets go on sale. Then you start planning to get tickets. You set up a calendar reminder with a link to the site. If there are presales, you ask friends or you check emails to track down the codes (if you’re a dedicated concertgoer, you probably get emails from the promoters, venues, and maybe even your favorite artists’ fan clubs).

Then you get ready, mouse pointer cued up at 9:59, waiting until tickets go on sale. The time flips, it’s 10:00 AM and you click! Prepared to quickly select 2, best available (or GA floor, because who wants a balcony seat), and add to cart. But wait! You see the dreaded message. You’re in a queue. Now all you can do is desperately stare at the webpage, hoping nothing changes. What if a browser extension interferes? What if your browser freezes up? Finally, you’re out of the queue. You go to select your tickets, but wait. GA is all gone. All that’s left is the seated Loge. For a band that you dance to. Or worse, it’s already sold out. All that time, all that anxiety, all that preparation, only to get shut down. 

And that’s just the presale. You’ll do the whole thing over again at the next presale, or during the general onsale, hoping that the artist and the venue were strategic enough to set some tickets aside for each sale. If it comes down to it, you might have to show up to the venue an hour (or more) before the show starts to get one of the limited tickets available at the door.

That’s everything a dedicated concertgoer goes through to get concert tickets. Thing is, according to Kaitlyn Tiffany in The Atlantic, that’s also what modern ticket scalpers do.

This week was a brutal one for ticket sales for me and my friends. A show at a 2000+ capacity venue sold out within a few minutes during the presale, and a second show added later also sold out within minutes. The Format announced their first live dates in years, playing 2 shows each in NYC and Chicago, and 1 show in Phoenix. The presale tickets for all the shows sold out within a minute, or in the case of Phoenix, were plagued by ticket website issues but still managed to sell out by the end of the day. By the time the general ticket sales happened, they’d announced an additional show in each city. The general ticket sales also sold out within minutes, and Phoenix ended up with a third show before the day was up.

How does it happen? And why do we put ourselves through this?! 

It’s important to note that buying concert tickets at all is a privilege. Some people (like me) make it a lifestyle to go to concerts and DJ sets. Others save their money and spend big to get great seats to see favorite artists in arena shows. But it takes money, time, and a bit of luck (or planning) to get tickets and get to a show. 

Whether or not you manage to get tickets to a show depends on several factors: 

  • Did you hear about the show before the tickets went on sale?
  • Did you have enough money at the time tickets went on sale (and in general) to afford the tickets?
  • Is your work schedule stable enough to know that you can go to the show if you buy tickets immediately when they go on sale?

If any one of these factors doesn’t work out, then you don’t have tickets to the show. Whether or not you get the opportunity to see an artist perform in concert at all is up to a whole other set of factors, subject to the careful strategies of the music industry combined with the artistic whims of the performers. 

If an artist doesn’t have a big enough fanbase in your city, and if it isn’t geographically convenient with available music venues, the artist probably won’t stop in your city. Even if they stop, the venue size can play a crucial role in whether or not you’ll get tickets to the show—will they be available, and will you even want them? 

Artists, especially after they’ve “gotten big”, can crave smaller, more intimate shows. But those are the shows that tend to sell out in a minute, especially if the fanbase in a certain city is larger than anticipated or if the artist is only playing a limited number of shows and ends up drawing people from outside the ordinary reach of a venue.

Other times, artists can analyze the size of their fanbase in a city and then choose a venue without considering if the venue size is appropriate for their type of music. Bon Iver toured 20,000+ seat arenas on their last tour, even though they’re famous for their intimate music and have videos on YouTube with hundreds of thousands of views of Justin Vernon playing to just 1 fan. Even if an artist’s fanbase is large enough to fill an arena, the fans still might not want to buy tickets to see them in an arena.

Beyond those considerations, artists can’t always play the venues they want to play due to promoter restrictions or other industry partnerships, sometimes leading to uncharacteristic bookings at oddly-sized or oddly-shaped venues: DJs playing a concert hall, rock bands in a semi-seated venue, or possibly even skipping a city entirely. 

The venue an artist chooses (or is forced to choose) can be a key factor when you’re deciding if you want to get tickets. But the artist (and their tour manager, and others) have still more to do before this concert happens. 

The ticket prices have to be set. Surely venues and promoters have set costs and prices that end up as effective ticket minimums for many shows, but artists certainly have a level of influence as well. Some artists, especially high-profile ones like Taylor Swift, have chosen on past tours to make affordable tickets available to their fans.

And therein lies the rub: artists can price competitively, or highly, knowing they can charge a certain price and still sell out their show (or nearly sell it out). But they can also price affordably, hoping that legitimate fans will be able to snap up tickets when they go on sale, rather than delaying their purchase and being forced to buy from scalpers. 

OK so we’re still trying to buy these concert tickets. You’ve heard about the show, the artist has booked the venue and priced their tickets, you’ve got the money, you’ve got the time, you are ready at 10am on a Friday (or a Wednesday or a Thursday for those sweet sweet presale tickets). Where are you buying your tickets?

Ticket sites range from the homegrown (see: Bottom of the Hill), new kids on the block (Big Neon, Tixr), the budding behemoths (Eventbrite, AXS, Etix) and the (despised) old guard (TicketWeb/Ticketmaster/LiveNation). If you’re rushing to buy online tickets, you also need to prepare for the site experience. 

If it isn’t a site you’ve used before, you might want to consider if it requires an account to buy tickets. If it does, you have to make one and make sure you’re signed in before you try to buy the tickets. You also want to consider if the show is big enough that you’ll end up in a queue to buy the tickets, and if the site is reliable enough to handle the load of a lot of people trying to buy tickets without crashing or throwing an error.

Beyond site reliability, you have to consider your personal threshold for every ticket-buyer’s worst nightmare: fees. Almost every ticket purchase includes fees. How high do the fees need to be before you abandon your ticket purchase entirely? 

Of course, the irony of paying ticket fees is that most fans (myself included) dislike paying them because for so long the fees have been hidden: last-minute additions to your total, spiking the cost of $35 tickets to $60 at times. But it can be argued that transparently disclosed fees are acceptable, and even necessary to provide a resilient, secure, reliable ticketing site, as well as to pay the promoters working hard to make sure your favorite band actually stops in your city.

Artists, promoters, venues, and ticketing sites do a lot to try to prevent ticket scalpers from bombing the market and selling out a show in minutes only to relist the tickets minutes later at unbelievable prices. Their efforts include innovations in ticket technology, new marketplaces, and just plain making it harder to get tickets.

What makes a ticket purchaser legitimate? Probably some degree of purchasing tickets in a specific geographic region and in clusters of genres, likely combined with some fraud analysis. Then I wonder how suspicious my own ticket purchasing habits must look to the algorithms at times. As long as we’re attempting to define what a legitimate ticket purchaser looks like, we can consider who deserves the presale codes for shows.

There’s a notion that only “real fans” deserve first access to presale codes and tickets. But how do you verify and validate true fans? You could use specific digital consumption patterns, such as those that are probably used to give out Spotify presale codes, but those are limited to only those listening habits that are directly observable in digital data. Artists want people to buy tickets to their shows—that’s why often, presale codes are straightforward to track down.

Most often, getting tickets to a show is a matter of knowing the right people at the right time that might have information you don’t have. Songkick is there to fill in the gaps, alongside emails and texts from promoters and venues. But ultimately, nothing beats having a community of fans. And that was the thing that fascinated me about the article in The Atlantic about the modern ticket scalpers. Me and my friends, we use many of the same tactics to buy tickets. It’s a privilege and a challenge to get the tickets we want, but we love going to concerts. And often, it feels like it’s the only way these days we can help artists make money. 

Streaming, the cloud, and music interactions: are libraries a thing of the past?

Several years ago I wrote about fragmented music libraries and music discovery. In light of the overwhelming popularity of Spotify and the dominance of streaming music (Spotify, Apple Music, Amazon Music, Tidal, and others), I’m curious if music libraries even exist anymore. Or, if they exist today, will they continue to exist? 

My guess is that the only people still maintaining music libraries are DJs, fervent music fans (like myself), or people that aren’t using streaming music at all (due to age, lack of interest, or lack of availability due to markets or internet speeds). 

I was chatting with a friend of mine that has a collection of vinyl records, but she only ever listens to vinyl if she’s relaxing on the weekend. Oftentimes she’s just asking Alexa to play some music, without much attention to where that music is coming from. With Amazon Music bundled into Amazon Prime for many members, people can be totally unaware that they’re using a streaming service at all. I’d hazard that this interaction pattern is true for most people, especially those that never enjoyed maintaining a music library but instead collected CDs and records because that was the only way to be able to listen to music at all. 

Even my own habits are changing, perhaps equally due to time constraints as due to current music technology services. I used to carefully curate playlists for sharing with others, listening in the car, mix CDs, and for radio shows. These days I make playlists for many of those same purposes on Spotify, but the songs in my “actual” music library (iTunes) aren’t categorized into playlists at all anymore, and I give the playlists I make on my iPhone random names like “Aaa yay” to make the playlists easier to find, rather than to describe the contents. 

I’m limited by storage size in terms of what I can add to my iPhone, just like I was with my iPod, but that shapes my experience of the music. Since I’m limited to a smaller catalogue, I’m able to sit with the music more and create more distinct memories. There are still songs that remind me of being in Berlin in 2011, limited to the songs that I added to my iPod before I left the United States because the internet I had access to in Germany was too slow to download new music and add it to my iPod. 

Nowadays, I am less motivated to carefully manage my iTunes library because it’s only on one device, whereas I can access my Spotify library across multiple devices. That’s the one I find myself carefully creating folders of playlists for, organizing and sorting tracks and playlists. A primary reason for the success of Spotify for my listening habits is the social and collaborative nature of it. It’s easy to share tracks with others, make a playlist for a DJ set that I attended and share it with others, contribute to a weekly collaborative playlist with a community of fellow music-lovers, or follow playlists created by artists and DJs I love. My local library can give me a lot, but it can’t give me that community interaction.

Indeed, in 2015 that’s something I identified as lacking. I felt that it was harder to feel part of a music culture, writing:

“It’s harder than it used to be to feel connected with music. It’s not a stream or a subculture one is tapped into anymore, because it’s so distributed on the web. There’s so much music, and it lives in so many different services, that the music culture has imploded a bit.”

I feel completely differently these days, thanks to a vibrant live music community in San Francisco. I loathe Facebook, but the groups that I’m a part of on that site enable me to feel connected to a greater music scene and community that supplement my connection to music and music discovery. Ironically, Facebook groups have also helped my music culture experience become more local. The music blogs that I used to be able to tap into are now largely defunct, or have multiple functions (The Burning Ear also running Vinyl Me, Please, or All Things Go also providing news and an annual festival in DC). Instead, yet another way I discover new music is by paying attention to the artists and DJs that people in these Facebook groups are talking about and posting tracks and albums from.

Despite the challenges of a local music library, I keep buying digital music partially because I made a promise to myself when I was younger that I’d do so when I could afford to, partially to support musicians and producers, and partially because I distrust that streaming services will stick around with all the music I might want to listen to. I’d rather “own” it, at least as best as I can when it’s a digital file that risks deletion and decomposition over time. 

Music discovery in the past was equal parts discovery and collection, with a hefty dose of listening after I collected new music.

A flowchart showing Discover -> Collect -> Listen in a triangle, with listen connecting back to discover

I’d do the following when discovering new music:

  • Writing down song lyrics while listening to the radio or while working my retail job, then later looking up the tracks to check out albums from the library to rip to my family computer.
  • Following music blogs like The Burning Ear, All Things Go, Earmilk, Stereogum, Line of Best Fit, then downloading what I liked best from their site from MediaFire or MegaUpload to save to my own library.
  • Trolling through illicit LiveJournal communities or invite-only torrent sites to download discographies for artists I already liked, or might like.

Over time, those music blogs shifted to using SoundCloud, the online communities and torrent sites shuttered, and I started listening to more music on streaming sites instead. The loop stopped going from discovery to collection and instead went from discovery to like and back to discovery again.

Find a new track, listen, click the heart or the plus sign, and move on. Rarely do you remember to go back and listen to your fully-compiled list of saved tracks (or even if you do, trying to listen to the whole thing on shuffle will be limited by the web app, thanks SoundCloud). 

A flowchart showing a cycle from discover to like and back again using arrows.

This type of cycle is faster than the old cycle, and more focused on engagement with the service (rather than the music), with less collecting and more consuming. In some ways, downloading music was like this too. When I accidentally deleted my entire music library in 2012, the tatters of my library that I was able to recover from my iPod were a scant representation of my full collection, but included in that library were discographies that I would likely never listen to. Now that it’s been years, there have been a few occasions where I go back and discover that an artist I listen to now is in that graveyard of deleted songs, but even knowing that, I’m not sure I would’ve gotten to it any sooner. I was always collecting more than I was listening to.

Streaming music lets me collect in the same way, but without the personal risk. It just makes me dependent on a third-party entity that permits me to access the tracks that they store for me. I end up with lists of liked tracks across multiple different services, none of which I fully control. These days my music discovery is largely driven by 3 services: Spotify, Shazam, and SoundCloud. Spotify pushes algorithmic recommendations to me, Shazam enables me to discover what track the DJ is currently playing when I’m out at a DJ set, and SoundCloud lets me listen to recorded DJ sets and has excellent autoplay recommendations. In all of them I have lists of tracks that I may never revisit after saving them. Some of them I’ll never be able to revisit, because they’ve been deleted or the service has lost the rights to the track.

In 2015 I lamented the fragmentation of music discovery, but looking back, my music discovery was always shared across services, devices, and methods. The central iTunes library was what tied the radio songs, the library CDs, the discography downloads, and the music blog tracks together. The real issue is that the primary music discovery modes of today are service-dependent, and each of those services provides its own construct of a music library. I mentioned in 2015 that:

“my library is all over the place. iTunes is still the main home of my music—I can afford to buy new music when I want —but I frequent Spotify and SoundCloud to check out new music. I sync my iTunes library to Google Play Music too, so I can listen to it at work.” 

While this is still largely true, I mostly listen to Spotify when I’m at work, listen to SoundCloud sets or tracks from iTunes when I’m on-the-go with my phone, and listen to Spotify or iTunes when I’m on my personal laptop. That’s essentially 2.5 places that I keep a music library, and while I maintain a purchase pipeline of tracks from Spotify and SoundCloud into my iTunes library, it’s a fraction of my discoveries that make it into my collection for the long term. The days of a true central collection of my library are long since past.

It seems a feat, with all these digital cloud music services streaming music into our ears, to have a local music library. Indeed, what’s the point of holding onto your local files when it becomes so difficult to access them? iTunes is becoming the Apple Music app, with the Apple Music streaming service front and center. Spotify is, well, Spotify. And SoundCloud continues to flounder yet provides an essential service of underground music and DJ sets. Google Play Music exists, but only has a web-based player (no client) to make it easier to access and listen to your local library after you’ve mirrored it to the cloud. Streaming is convenient. But streaming music lets others own your content for you, granting you subscription access to it at best, ruining the quality of your music listening experience at worst.

A recent essay by Dave Holmes in Esquire talks about “The Deleted Years”, the years when we stored music on iPods that we have largely moved on from since the arrival of Spotify and other streaming services. As he puts it,

“From 2003 to 2012, music was disposable and nothing survived.”

Perhaps it’s more true that from 2012 onward, music is omnipresent and yet more disposable. It can disappear into the void of a streaming service, and we’ll never even know we saved it. At least an abandoned iPod gives us a tangible record of our past habits. 

As Vicki Boykis wrote about SoundCloud in 2017:

“I’m worried that, for internet music culture, what’s coming is the loss of a place that offered innumerable avenues for creativity, for enjoyment, for discovery of music that couldn’t and wouldn’t be created anywhere else. And, like everyone who has ever invested enough emotion in an online space long enough to make it their own, I’m wondering what’s next.”

I’ll be here, discovering, collecting, liking, and listening for what’s next.

Music streaming and sovereignty

As the music industry moves away from downloads and toward building streaming platforms, international sovereignty becomes more of a barrier to people listening to music and discussing it with others, because they don’t have access to the same music on the same platforms. As Sean Michaels pointed out in The Morning News several years ago:

one of the undocumented glitches in the current internet is all its asymmetrical licensing rules. I can’t use Spotify in Canada (yet). Whenever I’m able to, there’s no guarantee that Spotify Canada’s music library will match Spotify America’s. Just as Netflix Canada is different than Netflix US, and YouTube won’t let me see Jon Stewart. As we move away from downloads and toward streaming, international sovereignty is going to become more and more of a barrier to common discussions of music.

Location has always been a challenge to music access, but it’s important to keep in mind that the internet and music streaming have not been an equitable boon to music access: it is still controlled.

Planning and analyzing my concert attendance with Splunk

This past year I added some additional datasets to the Splunk environment I use to analyze my music: information about tickets that I’ve purchased, and information about upcoming concerts.

Ticket purchase analysis

I started keeping track of the tickets that I’ve purchased over the years, which gave me good insights about ticket fees associated with specific ticket sites and concert promoters.  

Based on the data that I’ve accumulated so far, Ticketmaster doesn’t have the highest fees for concert tickets. Instead, Live Nation does. This distinction is relatively meaningless when you realize they’ve been the same company since 2010.

However, the ticket site isn’t the strongest indicator of fees, so I decided to split the data further by promoter to identify if specific promoters had higher fees than others.

Based on that data, you can see that the one show I went to promoted by AT&T had fee percentages of nearly 37%, and that shows promoted by Live Nation (through their evolution and purchase by Ticketmaster) also had fees around 26%. Shows promoted by independent venues have somewhat higher fees than others, hovering around 25% for 1015 Folsom and Mezzanine, but shows promoted by organizations whose only purpose is promotion tend to have slightly lower fees, such as Select Entertainment with 18%, Popscene with 16.67%, and KC Turner Presents with 15.57%.

I realized I might want to refine this, so I recalculated this data, limiting it to promoters from which I’ve bought at least two tickets.

It’s a much more even spread in this case, ranging from 11% to 25% in fees. However, you can see that the same patterns exist: for the shows I’ve bought tickets to, the independent venues average 22-25% in fees, while dedicated independent promoters are 16% or less in added fees, with corporate promoters like Another Planet, JAM, and Goldenvoice filling the middle of the data, ranging from 18% to 22%.
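A minimal sketch of this kind of breakdown in Python, for anyone who wants to try it outside of Splunk. This is not the search behind the numbers above: the ticket records and field names are made up, and treating the fee percentage as fees divided by face value is an assumption.

```python
from collections import defaultdict

# Hypothetical ticket records; the promoters and amounts are made up.
tickets = [
    {"promoter": "Promoter A", "face_value": 20.00, "fees": 5.00},
    {"promoter": "Promoter A", "face_value": 40.00, "fees": 8.00},
    {"promoter": "Promoter B", "face_value": 30.00, "fees": 4.50},
    {"promoter": "Promoter C", "face_value": 25.00, "fees": 5.00},
    {"promoter": "Promoter C", "face_value": 35.00, "fees": 7.00},
]


def average_fee_percent(records, min_tickets=2):
    """Average fee percentage per promoter, skipping promoters with few tickets.

    Assumes fee percentage = fees / face value * 100 for each ticket.
    """
    by_promoter = defaultdict(list)
    for record in records:
        percent = record["fees"] / record["face_value"] * 100
        by_promoter[record["promoter"]].append(percent)
    return {
        promoter: round(sum(percents) / len(percents), 2)
        for promoter, percents in by_promoter.items()
        if len(percents) >= min_tickets
    }


fee_summary = average_fee_percent(tickets)
```

Raising or lowering `min_tickets` is the equivalent of the "at least two tickets" refinement: promoters with a single purchase drop out so that one unusual show doesn't dominate the comparison.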

I also attempted to determine how I’m discovering concerts. This data is entirely reliant on my memory, with no other data to back it up, but it’s pretty fascinating to track.

It’s clear that Songkick has become a vital service in my concert-going planning, helping me discover 46 shows, and friends and email newsletters from venues helping me stay in the know as well for 19 and 14 shows respectively. Social media contributes as well, with a Facebook community (raptors) and Instagram making appearances with 10 and 2 discoveries respectively.

Concert data from Songkick

Because Songkick is so vital to my concert discovery, I wanted to amplify the information I get from the service. In addition to tracking artists on the site, I wanted to proactively gather information about artists coming to the SF Bay Area and compare that with my listening habits. To do this, I wrote a Songkick alert action in Python to run in Splunk.

Songkick does an excellent job for the artists that I’m already tracking, but there are some artists that I might have just recently discovered but am not yet tracking. To reduce the likelihood of missing fast-approaching concerts for these newly-discovered artists, I set up an alert to look for concerts for artists that I’ve discovered this year and have listened to at least 5 times.
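The selection logic for that first alert can be sketched in Python. This is only an illustration of the idea, not the saved search itself: the data shape (pairs of artist name and listen date) and the sample artists are hypothetical, and "discovered this year" is assumed to mean that the earliest recorded listen falls in the given year.

```python
from datetime import date


def newly_discovered_artists(listens, year, min_listens=5):
    """Artists first heard in `year` that have at least `min_listens` listens."""
    first_listen = {}
    counts = {}
    for artist, listened_on in listens:
        counts[artist] = counts.get(artist, 0) + 1
        if artist not in first_listen or listened_on < first_listen[artist]:
            first_listen[artist] = listened_on
    return sorted(
        artist
        for artist, first in first_listen.items()
        if first.year == year and counts[artist] >= min_listens
    )


# Hypothetical listening history: (artist, date of listen).
sample_listens = (
    [("New Artist", date(2019, 1, day)) for day in range(1, 7)]
    + [("Old Favorite", date(2015, 3, 1))]
    + [("Old Favorite", date(2019, 2, day)) for day in range(1, 10)]
    + [("Barely Heard", date(2019, 4, 1))] * 2
)

candidates = newly_discovered_artists(sample_listens, 2019)
```

In this made-up history only "New Artist" qualifies: "Old Favorite" was first heard in an earlier year, and "Barely Heard" has too few listens.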

To make sure I’m also catching other artists I care about, I use another alert to call the Songkick API for every artist that is above a calculated threshold. That threshold is based on the average listens for all artists that I’ve seen live, so this search helps me catch approaching concerts for my historical favorite artists.
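As a rough illustration of the threshold idea, here is a small Python sketch. The function names and sample counts are hypothetical, and it assumes the threshold is a plain mean of listen counts across artists seen live, with "above" meaning strictly greater than.

```python
def listen_threshold(listen_counts, seen_live):
    """Mean listen count across artists seen live (assumed definition)."""
    counts = [listen_counts.get(artist, 0) for artist in seen_live]
    return sum(counts) / len(counts) if counts else 0.0


def artists_above_threshold(listen_counts, seen_live):
    """Artists whose listen count is strictly above the threshold."""
    threshold = listen_threshold(listen_counts, seen_live)
    return sorted(
        artist for artist, count in listen_counts.items() if count > threshold
    )


# Hypothetical listen counts and concert history.
listen_counts = {"Artist A": 100, "Artist B": 20, "Artist C": 75, "Artist D": 5}
seen_live = {"Artist A", "Artist B"}

to_query = artists_above_threshold(listen_counts, seen_live)
```

Each artist in `to_query` would then be sent to the concert lookup, whether or not they have been seen live before.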

To be honest, I also did this largely so that I could learn how to write an alert action in Splunk software. Alert actions are essentially bits of custom Python code that you can dispatch with the results of a search in Splunk. The two alert examples I gave are both saved searches that run every day and update an index. I built a dashboard to visualize the results.
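To give a sense of the shape of an alert action script, here is a minimal sketch, not the actual code behind this post. It assumes the documented custom alert action convention in which Splunk runs the script with an `--execute` argument and passes a JSON payload on standard input. The `api_key` setting is a hypothetical configuration parameter, and the Songkick calendar URL pattern is an assumption that should be checked against the Songkick API documentation.

```python
import json
from urllib.parse import quote, urlencode


def parse_payload(raw_json):
    """Pull the alert configuration and the triggering result out of the payload."""
    payload = json.loads(raw_json)
    return payload.get("configuration", {}), payload.get("result", {})


def calendar_url(artist_mbid, api_key):
    """Build an upcoming-events URL for one artist (URL pattern is an assumption)."""
    base = "https://api.songkick.com/api/3.0/artists"
    query = urlencode({"apikey": api_key})
    return f"{base}/mbid:{quote(artist_mbid)}/calendar.json?{query}"


def run(argv, raw_stdin):
    """Return the URL the action would request, or None when not executing."""
    if len(argv) < 2 or argv[1] != "--execute":
        return None
    config, result = parse_payload(raw_stdin)
    # "api_key" is a hypothetical setting; "artist_mbid" comes from the search result.
    return calendar_url(result["artist_mbid"], config["api_key"])
```

A real script would call something like `run(sys.argv, sys.stdin.read())`, make the HTTP request, and write the returned events to an index. Those steps are left out here to keep the sketch free of network calls.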

I wanted to use log data to confirm which artists were being sent to Songkick with my API request, even if no events were returned. To do this I added a logging statement in my Python code for the alert action, and then visualized the log statements (with the help of a lookup to match the artist_mbid with the artist name) to display the artists that had no upcoming concerts at all, or had no SF concerts.
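The logging approach might look something like the sketch below. The logger name, the log message format, and the lookup dictionary are all hypothetical stand-ins for illustration. In Splunk the matching of `artist_mbid` to artist name happens through a lookup, which the plain dictionary here only imitates.

```python
import logging
import re

logger = logging.getLogger("songkick_alert")  # hypothetical logger name

# Hypothetical message format: one line per artist sent to the API.
LOG_PATTERN = re.compile(r"artist_mbid=(?P<mbid>\S+) events=(?P<events>\d+)")


def log_lookup(artist_mbid, event_count):
    """Record every artist sent to the API, even when nothing comes back."""
    logger.info("songkick_request artist_mbid=%s events=%d", artist_mbid, event_count)


def artists_without_events(log_lines, mbid_to_name):
    """Names of artists whose logged requests returned zero events."""
    names = []
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match and int(match.group("events")) == 0:
            mbid = match.group("mbid")
            # Fall back to the raw ID when the lookup has no name for it.
            names.append(mbid_to_name.get(mbid, mbid))
    return names
```

Logging the zero-event requests is what makes it possible to tell "this artist has no upcoming concerts" apart from "this artist was never sent to the API".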

For those artists without concerts in the San Francisco Bay Area, I wanted to know where they were going instead, so that I could identify possible travel locations for the future.

It seems like Paris is the place to be for several of these artists—there might be a festival that LAUER, Max Cooper, George Fitzgerald, and Gerald Toto are all playing at, or they just happen to all be visiting that city on their tours.
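Counting which cities show up most often is a simple aggregation. Here is a hedged Python sketch with made-up events and hypothetical field names (`artist`, `city`, `metro`). As a simplification, it skips individual events in the home metro area rather than first excluding artists who have any local concert.

```python
from collections import Counter


def top_cities(events, home_metro="SF Bay Area", n=3):
    """Count distinct artists per city for events outside the home metro area."""
    artists_by_city = {}
    for event in events:
        if event["metro"] == home_metro:
            continue
        artists_by_city.setdefault(event["city"], set()).add(event["artist"])
    counts = Counter({city: len(artists) for city, artists in artists_by_city.items()})
    return counts.most_common(n)


# Hypothetical upcoming events.
events = [
    {"artist": "Artist A", "city": "Paris", "metro": "Paris"},
    {"artist": "Artist B", "city": "Paris", "metro": "Paris"},
    {"artist": "Artist A", "city": "Berlin", "metro": "Berlin"},
    {"artist": "Artist C", "city": "Oakland", "metro": "SF Bay Area"},
]

travel_candidates = top_cities(events)
```

Counting distinct artists rather than raw events keeps one artist's multi-night run from making a city look more promising than it is.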

I’m planning to publish a more detailed blog post about the alert action code in the future on the Splunk blogs site, but until then I’ll be off looking up concert tickets to these upcoming shows….

Making Concert Decisions with Splunk

The annual Noise Pop music festival starts this week, and I purchased a badge this year, which means I get to go to any show that’s a part of the festival without buying a dedicated ticket.

That means I have a lot of choices to make this week! I decided to use data to assess (and validate) some of the harder choices I needed to make, so I built a dashboard, “Who Should I See?” to help me out.

First off, the Wednesday night show. Albert Hammond, Jr. of the Strokes is playing, but more people are talking about the Baths show the same night. Maybe I should go see Baths instead?

Screen capture showing two inputs, one with Baths and one with Albert Hammond, Jr, resulting in count of listens compared for each artist (6 vs 39) and listens over time for each artist. Baths has 1 listen before 2012, and 1 listen each year for 2016 until this year. Albert Hammond, Jr has 8 listens before 2010, and a consistent yet reducing number over time, with 5 in 2011 and 4 in 2015, but just a couple since then.

If I’m making my decisions purely based on listen count, it’s clear that I’m making the right choice to see Albert Hammond, Jr. It is telling, though, that I’ve listened to Baths more recently than him, which might have contributed to my indecision.

The other night I’m having a tough time deciding about is Saturday night. Beirut is playing, but across the Bay in Oakland. Two other interesting artists are playing closer to home, Bob Mould and River Whyless. I wouldn’t normally care about this so much, but I know my Friday night shows will keep me busy and leave me pretty tired. So which artist should I go see?

3 inputs on a dashboard this time, Beirut, Bob Mould, and River Whyless are the three artists being compared. Beirut has 44 listens, Bob Mould has 21, River Whyless has 3. Beirut has frequent listens over time, peaking at 6 before 2010, but with peaks at 5 in 2011 and 2019. Bob Mould has 6 listens pre-2009, but only 3 in 2010 and after that, 1 a year at most. River Whyless has 1 listen in April, and 2 in December of 2018.

It’s pretty clear that I’m making the right choice to go see Beirut, especially given my recent renewed interest thanks to their new album.

I also wanted to be able to consider if I should see a band at all! This isn’t as relevant this week thanks to the Noise Pop badge, but the dashboard currently evaluates whether the number of listens I have for an artist exceeds a threshold that I calculate from the listens for all artists that I’ve seen live in concert. If it does, I return advice to “Go to the concert!” but if it doesn’t, I recommend “Only if it’s cheap, yo.”
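A minimal Python sketch of that threshold check, assuming the threshold is the average listen count across artists seen live. The function names and all listen counts here are hypothetical, and the real dashboard does this with a Splunk eval rather than Python.

```python
from statistics import mean

# Invented listen counts, for illustration only.
listen_counts = {
    "Seen Live A": 80,
    "Seen Live B": 40,
    "The Rapture": 120,
    "Lane 8": 30,
}
seen_live = ["Seen Live A", "Seen Live B"]


def listen_threshold(listen_counts, seen_live):
    """Average listen count across artists already seen in concert."""
    counts = [listen_counts.get(artist, 0) for artist in seen_live]
    return mean(counts) if counts else 0


def concert_advice(artist, listen_counts, seen_live):
    """Compare one artist's listens against the seen-live threshold."""
    threshold = listen_threshold(listen_counts, seen_live)
    if listen_counts.get(artist, 0) > threshold:
        return "Go to the concert!"
    return "Only if it's cheap, yo."


print(concert_advice("The Rapture", listen_counts, seen_live))
print(concert_advice("Lane 8", listen_counts, seen_live))
```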

Because I don’t need to make this decision for Noise Pop artists, I picked a few that I’ve been wanting to see lately: Lane 8, Luttrell, and The Rapture.

4 dashboard panels, 3 of which ask "Should I go see (artist) at all?" one for each artist, Lane 8, Luttrell, and The Rapture. Lane 8 and Luttrell both say "Only go if it's cheap, yo." and The Rapture says "Go to the concert!". The fourth panel shows frequent listening for The Rapture, especially from 2008-2012, with a recent peak in 2018. Lane 8 spikes at the end of the graph, and Luttrell is a small blip at the end of the graph.

While my interest in Lane 8 has spiked recently, there still aren’t enough cumulative listens to put them over the threshold. Same for Luttrell. However, The Rapture has enough to put me over the threshold (likely due to the fact that I’ve been listening to them for over 10 years), so I should go to the concert! I’m going to see The Rapture in May, so I am gleefully obeying my eval statement!

On a more digressive note, it’s clear to me that this evaluation needs some refinement to actually reflect my true concert-going sentiments. Currently, the threshold averages all the listens for all artists that I’ve seen live. It doesn’t restrict that average to consider only the listens that occur before seeing an artist live, which might make it more accurate. That calculation would also be fairly complex, given that it would need to account for artists that I’ve seen multiple times.
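One way that refinement could look, sketched in Python with made-up dates. The key assumption, which is mine and not something the dashboard does today, is that an artist seen multiple times is handled by counting only listens before the earliest concert.

```python
from datetime import date
from statistics import mean

# Invented sample data: (artist, concert date) and (artist, listen date).
concerts = [
    ("Artist A", date(2015, 6, 1)),
    ("Artist A", date(2018, 3, 1)),  # seen twice; only the first date counts
    ("Artist B", date(2017, 1, 1)),
]
listens = [
    ("Artist A", date(2014, 1, 1)),
    ("Artist A", date(2015, 5, 1)),
    ("Artist A", date(2016, 1, 1)),  # after the first concert, so ignored
    ("Artist B", date(2016, 12, 31)),
    ("Artist B", date(2018, 1, 1)),  # after the concert, so ignored
]


def first_seen_dates(concerts):
    """Map each artist to the earliest date they were seen live."""
    first = {}
    for artist, seen_on in concerts:
        if artist not in first or seen_on < first[artist]:
            first[artist] = seen_on
    return first


def pre_concert_threshold(listens, concerts):
    """Average number of listens logged before first seeing each artist."""
    first = first_seen_dates(concerts)
    counts = {artist: 0 for artist in first}
    for artist, listened_on in listens:
        if artist in first and listened_on < first[artist]:
            counts[artist] += 1
    return mean(list(counts.values())) if counts else 0


print(pre_concert_threshold(listens, concerts))
```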

However, the number of listens over time doesn’t reflect interest in going to a concert on its own. It might be useful to also consider time spent listening, beyond count of listens for an artist. This is especially relevant when considering electronic music, or DJ sets, because I might only have 4 listen counts for an artist, but if that comprises 8 hours of DJ sets by that artist that I’ve listened to, that is a pretty strong signal that I would likely enjoy seeing that artist perform live.

I thought that I’d need to get direct access to the MusicBrainz database in order to get metadata like that, but it turns out that the Last.fm API makes some available through their track.getInfo endpoint, so I just found a new project! In the meantime I am able to at least calculate duration for tracks that exist in my iTunes library.
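As a starting point, here is a hedged Python sketch of building a track.getInfo request and reading the duration out of a response. It assumes the JSON response nests the value under `track.duration` and reports it in milliseconds, which is my understanding of the API rather than something verified here. The sample payload is invented, and `track_info_url` and `duration_minutes` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode

API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def track_info_url(artist, track, api_key):
    """Build a track.getInfo request URL (no network call is made here)."""
    params = {
        "method": "track.getInfo",
        "api_key": api_key,
        "artist": artist,
        "track": track,
        "format": "json",
    }
    return API_ROOT + "?" + urlencode(params)


def duration_minutes(payload):
    """Return track duration in minutes, or None if missing or zero.

    Assumes the duration field is a millisecond count sent as a string.
    """
    raw = json.loads(payload).get("track", {}).get("duration")
    millis = int(raw) if raw else 0
    return millis / 60000 if millis > 0 else None


# Invented response for a two-hour DJ set.
sample_payload = '{"track": {"name": "Example DJ Set", "duration": "7200000"}}'
minutes = duration_minutes(sample_payload)
listen_count = 4
print(minutes * listen_count / 60, "hours across", listen_count, "listens")
```

With a two-hour set, 4 listens work out to 8 hours, which is the kind of signal a raw listen count hides.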

I now have a new avenue to explore with this project, collecting that data and refining this calculation. Reach out on Twitter to let me know what you might consider adding to this calculation to craft a data-driven concert-going decision-making dashboard.

If you’re interested in this app, it is open sourced and available on Splunkbase. I’ll commit the new dashboard to the app repo soon!

My 2018 Year in Music: Data Analysis and Insights

This past year has been pretty eventful in music for me. I’ve attended a couple new festivals, seen shows while traveling, and discovered plenty of new bands. I want to examine the data available to me and contrast it with my memories of the past year.

I’ve been using Splunk to analyze my music data for the past couple years. You can read more about what I’ve learned from that in my other posts: see Reflecting on a Decade of Quantified Music Listening and Best of 2017: Newly-Discovered Music. I also wrote a post for the Splunk blog (I work there) about this: 10 Years of Listens: Analyzing My Music Data with Splunk.

Comparing Spotify’s Data with Mine

Spotify released its #2018wrapped campaign recently, sharing highlights from the year of my listening data with me (and in an ad campaign, aggregate data from all the users). As someone that uses Spotify but not as my exclusive source of music listening, I was curious to compare the results with my holistic dataset that I’ve compiled in Splunk. 

Top Artists are Poolside, The Blaze, Justice, Born Ruffians, and Bob Moses. Top Songs are Beautiful Rain, For the Birds, Miss You, Faces, and Heaven. I listened for 30,473 minutes, and my top genre was Indie.

Spotify’s top artists for me were somewhat different from the results that I found from the data I gather from Last.fm and analyze with Splunk software. Spotify and my holistic listening data agree that I listened to Poolside more than anyone else, and was also a big fan of Born Ruffians, but beyond that they differ. This is probably because I buy music, and when I’m mobile I switch my primary listening from Spotify to song files stored on my phone.

Table showing my top artists and their listens, Poolside with 162 listens, The Vaccines with 136, Young Fathers with 124, Born Ruffians with 102 and Mumford and Sons with 99 listens.

In addition, my top 5 songs of the year were completely different from those listed in Spotify. My holistic top 5 songs of the year were all songs that I purchased. I don’t listen to music exclusively in Spotify, and my favorites go beyond what the service can recognize.

Table showing top songs and the corresponding artist and listen count for the song. Border Girl by Young Fathers with 35 was first, followed by Era by Hubert Kirchner with 32, Naive by the xx with 29, Sun (Viceroy Remix) by Two Door Cinema Club with 27 and There Will Be Time by Mumford & Sons with Baaba Maal also with 27 listens.

Spotify identified that I’ve listened to 30,473 minutes of music, but I can’t make a similarly reliable calculation with my existing data because I don’t have track length data for all the music that I’ve listened to. I can calculate the number of track listens so far this year, and make an approximation based on the track length data that I do have from my iTunes library. That calculation indicates that I’ve spent 21,577 minutes on the 3,878 listens that have track length data, out of the 10,301 total listens I’ve accumulated so far this year (numbers that are literally changing as this post is being written).
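To make the gap concrete, here is a small Python sketch of summing minutes only for listens that have a known track length, then reporting how much of the listening that covers. The sample tracks and lengths are invented and the function names are hypothetical; only the 3,878, 10,301, and 21,577 figures come from the numbers above.

```python
def minutes_listened(listens, track_lengths):
    """Sum minutes for listens whose (artist, track) has a known length.

    listens: list of (artist, track) pairs, one per listen.
    track_lengths: dict mapping (artist, track) to length in seconds.
    Returns (matched listen count, total minutes).
    """
    matched = 0
    total_seconds = 0
    for key in listens:
        length = track_lengths.get(key)
        if length is not None:
            matched += 1
            total_seconds += length
    return matched, total_seconds / 60


def coverage(matched, total):
    """Fraction of all listens that had track length data."""
    return matched / total if total else 0.0


# Invented sample: two listens match the library, one does not.
sample_listens = [
    ("Young Fathers", "Border Girl"),
    ("Hubert Kirchner", "Era"),
    ("Unknown Artist", "Streaming Only Track"),
]
sample_lengths = {
    ("Young Fathers", "Border Girl"): 210,
    ("Hubert Kirchner", "Era"): 390,
}
print(minutes_listened(sample_listens, sample_lengths))

# Applied to the figures quoted above.
print(f"{coverage(3878, 10301):.1%} of listens have a track length")
print(f"{21577 / 3878:.1f} minutes per matched listen on average")
```

With only a bit more than a third of listens covered, the minute total is a floor rather than an estimate comparable to Spotify's number.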

Screen capture showing total listens of 10,301 and total minutes listened to itunes library songs as 21,577 minutes.

I’m similarly lacking data allowing me to determine my top genre of the year, but Indie is a pretty reliable genre for my taste. 

Other Insights from 2018

I was able to calculate my Top 10 artists, songs, and albums of the year, and drill down on the top 10 artists to see additional data about them (if it existed) in my iTunes library, like other tracks, the date it was added, as well as the kind of file (helping me identify if it was purchased or not), and the length of the track.

Screen capture displaying top 10 artists, top 10 songs, top 10 albums of the year, with the artist Hubert Kirchner selected in the top 10 song list, with additional metadata about songs by Hubert Kirchner listed in a table below the top 10 lists, showing 3 songs by Hubert Kirchner along with the album, genre, rating, date_added, Kind, and track_length for the songs. Other highlights described in text.

There are quite a few common threads across the top 10 artists, songs, and albums, with Poolside, Young Fathers, Gilligan Moss, The Vaccines, and Justice making consistent appearances. The top 10 songs display obsessions with particular songs that outweigh an aggregate popularity for the entire album, leading other albums to be the top albums of the year.

Interestingly, the Polo & Pan album makes my top 10 albums while they don’t make it to my top 10 artist or song lists. This is also true for the album Dancehall by The Blaze. I’m not much of an album listener usually, but I know I listened to those albums several times.

The top 10 song list is more dominated by specific songs that caught my attention, and the top 10 artists neatly reflect both lists. The artists that have a bit more of a back catalog also reveal themselves, given that Born Ruffians managed to crack the top 10 despite not having any songs or albums make the top 10 lists, and Hey Rosetta! makes the top artist and album lists, despite having no top songs.

Screen capture that says Songs Purchased in 2018. 285 songs.

I purchased 285 songs this year, an increase of 157 compared to the year before. I think I just bought songs more quickly after first hearing them this year, and there are even some songs missing from this list that I bought on Beatport or Bandcamp because they weren’t available in the iTunes Store. While I caved in to Spotify premium this year, I still kept up an old promise to myself to buy music (rather than acquire it without paying for it, from a library or questionable download mechanisms) now that I can afford it. 

A Year of Concerts

Screen capture of 4 single value data points, followed by 2 bar charts. Single value data points are total spent on concerts attended in 2018 ($1835.04), total concerts in 2018 (48), artists seen in concert in 2018 (116 artists), and total spent on concert tickets in 2018 ($2109). The first bar chart shows the number of concerts attended per month, 2 in January, 3 in February, 2 in March, 6 in April, 4 in May, 2 in June, 3 in July, 8 in August, 4 in September, 6 in October, 5 in November, and 3 so far in December. The last bar chart is the number of artists seen by month: 5 in Jan, 10 in Feb, 3 in March, 14 in April, 8 in May, 3 in June, 8 in July, 18 in August, 9 in Sep, 22 in Oct, 10 in Nov, 6 in December.

I’ve been to a lot of concerts so far this year. 48, to be exact. I spent a lot of money on concert tickets, both for the shows I attended this year and for shows that went on sale during 2018 (but at this point, might be happening in 2019). I often will buy tickets for multiple people, so this number isn’t very precise for my own personal ticket usage.

I managed to go to at least 2 concerts every month. By the time the year is over, I’m on track to go to 51 different shows. Based on the statistics, there are some months where I went to many more than 1 show per week, and others where I didn’t. Especially apparent are the months with festivals—February, August, and October all included festivals that I attended. 

Many of those festivals brought me to new-to-me locations, with the Noise Pop Block Party and Golden Gate Park giving me new perspectives on familiar places, and Lollapalooza after shows bringing me out to Schubas Tavern for the first time in Chicago.  

Screen capture listing venues visited for the first time in 2018, with venue, city, state, and date listed. Notable ones mentioned in text, full list of venue names: Audio, The New Parish, San Francisco Belle, Schubas Tavern, Golden Gate Park, August Hall, Noise Pop Block Party, Bergerac, Great American Music Hall, Cafe du Nord, Swedish American Hall.

If you’re reading this wondering what San Francisco Belle is, it’s a boat. That’s one of several new venues that electronic music brought me to—DJ sets on that boat as part of Goldroom and Gigamesh’s tour, plus a day party in Bergerac and a nighttime set at Audio other times throughout the year.

Some of those new venue locations brought newly-discovered music to me as well.

Screen capture showing top 20 artists discovered in 2018, sorted by count of listens, featuring a sparkline to show how frequently I listened to the artist throughout the year, and a first_discovered date. List: Gilligan Moss, The Blaze, Polo & Pan, Hubert Kirchner, Keita Sano, Jude Woodhead, Ben Böhmer, Karizma, Luxxury, SuperParka, Chris Malinchak, Mumford & Sons and Baaba Maal, Jon Hopkins, Yon Yonson,  Brandyn Burnette and dwilly, Asgeir, The Heritage Orchestra Jules Buckley and Pete Tong, Confidence Man, Bomba Estereo, and Jenn Champion.

The 20th-most-popular artist I discovered this year was Jenn Champion, who opened for We Were Promised Jetpacks at their show at the Great American Music Hall. I started writing this assuming that I hadn’t heard Jenn Champion before that night, but apparently I first discovered them on July 9, and the show wasn’t until October 9.
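The first_discovered date in that table amounts to the earliest listen per artist. A rough Python equivalent, with invented listen rows apart from the July 9 date and a hypothetical function name, might look like this:

```python
from collections import Counter
from datetime import date

# Invented sample listens as (artist, listen date) pairs.
listens = [
    ("Poolside", date(2016, 4, 2)),
    ("Poolside", date(2018, 2, 14)),
    ("Jenn Champion", date(2018, 7, 9)),
    ("Jenn Champion", date(2018, 10, 9)),
    ("Jenn Champion", date(2018, 10, 10)),
]


def discovered_in_year(listens, year):
    """Return (artist, listen count, first listen) for artists first heard in year.

    Rows are sorted by listen count, highest first.
    """
    first_heard = {}
    counts = Counter()
    for artist, listened_on in listens:
        counts[artist] += 1
        if artist not in first_heard or listened_on < first_heard[artist]:
            first_heard[artist] = listened_on
    found = [
        (artist, counts[artist], first)
        for artist, first in first_heard.items()
        if first.year == year
    ]
    return sorted(found, key=lambda row: row[1], reverse=True)


print(discovered_in_year(listens, 2018))
```

An artist with any listen before the year in question drops out, which is why a first listen months before a concert still counts as the discovery date.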

As it turns out, I listened to what is now my favorite song by Jenn Champion that day in July, likely as part of a Spotify algorithm-driven playlist (judging by the listening neighbors around the same time) but it didn’t stick until I saw them play live months later. The vagaries of playlists that refresh once a week can mean fleeting discoveries that you don’t really absorb.

Screen capture showing Splunk search results of artist, track_name, and time from July 9th. Songs near Jenn Champion's song in time include Mcbaise - Le Paradis Du Cuir, Wolf Alice - Don't Delete the Kisses (Tourist Remix) and Champyons - Roaming in Paris.
Other songs I listened to that day in July

Because of how I can search for things in Splunk, I was also curious to see what other songs I heard when I first discovered Hubert Kirchner, a great house artist.

Songs listened to around the same time as I first heard Hubert Kirchner's song Era.... I listened to Dion's song Dream Lover, Deradoorian's song You Carry the Dead (Hidden Cat Remix) followed by Hubert Kirchner, then listened to Miguel's song Sure Thing, How to Dress Well with What You Wanted, then listened to Rihanna, Love on the Brain, Selena Gomez with Bad Liar, and Descendents with I'm the One. I have no idea how I got into this mix of songs.

I really have no idea what playlist I was listening to that might have led to me making jumps from Sofi Tukker, to Tanlines, to Dion, to Deradoorian, then to Hubert Kirchner, Miguel, How to Dress Well, Rihanna, Selena Gomez, and Descendents. Given that August 24th was a Friday, my best guess is perhaps that it was a Release Radar playlist, or perhaps an epic shuffle session.
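Finding those listening neighbors is a time-window search around the first listen. Here is a hedged Python sketch of the idea. The timestamps are invented (only the date and track names come from the description above), the 30-minute window is an arbitrary assumption, and `listening_neighbors` is a hypothetical name.

```python
from datetime import datetime, timedelta

# Track names from the post; the times of day are invented.
listens = [
    ("Dion", "Dream Lover", datetime(2018, 8, 24, 10, 0)),
    ("Deradoorian", "You Carry the Dead (Hidden Cat Remix)", datetime(2018, 8, 24, 10, 4)),
    ("Hubert Kirchner", "Era", datetime(2018, 8, 24, 10, 9)),
    ("Miguel", "Sure Thing", datetime(2018, 8, 24, 10, 16)),
    ("Hubert Kirchner", "Era", datetime(2018, 8, 24, 19, 30)),  # later repeat
]


def listening_neighbors(listens, artist, window_minutes=30):
    """Return listens within the window around the first listen to artist."""
    ordered = sorted(listens, key=lambda row: row[2])
    first = next((row for row in ordered if row[0] == artist), None)
    if first is None:
        return []
    window = timedelta(minutes=window_minutes)
    return [
        row
        for row in ordered
        if row is not first and abs(row[2] - first[2]) <= window
    ]


for neighbor_artist, track, played_at in listening_neighbors(listens, "Hubert Kirchner"):
    print(played_at.time(), neighbor_artist, "-", track)
```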

Repeat of earlier screen capture showing top 20 artists discovered in 2018. Sorted by count of listens, featuring a sparkline to show how frequently I listened to the artist throughout the year, and a first_discovered date. List: Gilligan Moss, The Blaze, Polo & Pan, Hubert Kirchner, Keita Sano, Jude Woodhead, Ben Böhmer, Karizma, Luxxury, SuperParka, Chris Malinchak, Mumford & Sons and Baaba Maal, Jon Hopkins, Yon Yonson,  Brandyn Burnette and dwilly, Asgeir, The Heritage Orchestra Jules Buckley and Pete Tong, Confidence Man, Bomba Estereo, and Jenn Champion

For the top 20 bands I discovered in 2018, many of them I started listening to on Spotify, but not necessarily because of Spotify. Gilligan Moss was a discovery from a collaborative playlist shared with those that are also in a Facebook group about concert-going. I later saw them at one of the festivals I went to this year, and it even turned out that a friend knew one of the band members! Their status as my most-listened-to discovery of this year is very accurate.

Polo & Pan was a discovery from a friend, fully brought to life with a playlist built by Polo & Pan themselves and shared on Spotify. I spent some quality time sitting in a park listening to that playlist and just enjoying life. They were at the same festival as Gilligan Moss, playing the same day, making that day a standout of my concerts this year.

Karizma was a discovery from Jamie xx’s set at Outside Lands. I tracked down the song from the set with the help of several other people on the internet (not necessarily anyone I knew), and then it turned out that the song wasn’t even on Spotify. (Spotify, however, did help me discover more of the artist’s back catalog, like my other favorite song ‘Nuffin Else.) Apparently I was far behind the curve hearing the song from the set, since it came out in 2017 and was featured in a Chromebook ad, but Work It Out still made me lose my mind at that set. (For the record, so did Take Me Higher, a song I did not manage to track down at all, and I have so much thanks for the person that messaged me on Facebook ages later to send me the link!)

Similarly, Luxxury was a DJ I first spotted on a cruise that I went on because it featured other DJs I had heard of from college, Goldroom and Gigamesh, whom I’d discovered through remixes of songs I downloaded from mp3 blogs like The Burning Ear.

~ Finding Meaning in the Platforms ~

Many of these discoveries were deepened by Spotify, or had Spotify as a vector—through a collaborative playlist, algorithmically-generated one, or the quick back-catalog access for a new artist—but don’t rely on Spotify as a platform. I prefer to keep my music listening habits platform-adjacent. 

Spotify, SoundCloud, iTunes, Beatport and other music platforms I use help make my music experiences possible. But the artists making the music, performing live in venues that I have the privilege to live near and afford to visit, they are creating what keeps my mind alive and energized.

The social platforms, too, mediate the music-related experiences I’ve had, whether it’s with the people I share music and concert experiences with in a Facebook group, the people I exchange tracks and banter with in Slack channels, or those of you reading this on yet another platform.

I like to listen to music that moves me, physically, or that arrests my mind and takes me somewhere. More now than ever I realize that musical enjoyment for me is an intense instantiation of the continuous tension-and-release pattern that exists in so many human art forms. The waves of neatness that clash and collide in a house music track, or the soaring crescendos of harmonies. 

It’s become clear to me over the years that I can’t separate my enjoyment of music from the platforms that bring me closer to it. Perhaps supporting the platforms in addition to the musical artists, performers, and venues, is just another element of contributing to a thriving music scene.