My 2018 Year in Music: Data Analysis and Insights

This past year has been pretty eventful in music for me. I’ve attended a couple new festivals, seen shows while traveling, and discovered plenty of new bands. I want to examine the data available to me and contrast it with my memories of the past year.

I’ve been using Splunk to analyze my music data for the past couple years. You can read more about what I’ve learned from that in my earlier posts, Reflecting on a Decade of Quantified Music Listening and Best of 2017: Newly-Discovered Music. I also wrote a post for the Splunk blog (I work there) about this: 10 Years of Listens: Analyzing My Music Data with Splunk.

Comparing Spotify’s Data with Mine

Spotify released its #2018wrapped campaign recently, sharing highlights from the year of my listening data with me (and in an ad campaign, aggregate data from all the users). As someone that uses Spotify but not as my exclusive source of music listening, I was curious to compare the results with my holistic dataset that I’ve compiled in Splunk. 

Top Artists are Poolside, The Blaze, Justice, Born Ruffians, and Bob Moses. Top Songs are Beautiful Rain, For the Birds, Miss You, Faces, and Heaven. I listened for 30,473 minutes, and my top genre was Indie.

Spotify’s top artists for me were somewhat different from the results that I found from the data I gather from Last.fm and analyze with Splunk software. Spotify and my holistic listening data agree that I listened to Poolside more than anyone else, and was also a big fan of Born Ruffians, but beyond that they differ. This is probably because I buy music, and when I’m mobile I switch my primary listening from Spotify to song files stored on my phone.

Table showing my top artists and their listens, Poolside with 162 listens, The Vaccines with 136, Young Fathers with 124, Born Ruffians with 102 and Mumford and Sons with 99 listens.

In addition, my top 5 songs of the year were completely different from those listed in Spotify. My holistic top 5 songs of the year were all songs that I purchased. I don’t listen to music exclusively in Spotify, and my favorites go beyond what the service can recognize.

Table showing top songs and the corresponding artist and listen count for the song. Border Girl by Young Fathers with 35 was first, followed by Era by Hubert Kirchner with 32, Naive by the xx with 29, Sun (Viceroy Remix) by Two Door Cinema Club with 27 and There Will Be Time by Mumford & Sons with Baaba Maal also with 27 listens.

Spotify identified that I’ve listened to 30,473 minutes of music, but I can’t make a similarly reliable calculation with my existing data because I don’t have track length data for all the music that I’ve listened to. I can count the track listens so far this year and, from that, make an approximation using the track length data that I do have from my iTunes library. That calculation indicates that I’ve so far spent 21,577 minutes listening to 3,878 of the 10,301 total listens I’ve accumulated this year (numbers that are changing even as I write this post).

Screen capture showing total listens of 10,301 and total minutes listened to itunes library songs as 21,577 minutes.
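A minimal sketch of that kind of approximation, assuming the listens are available as (artist, track) pairs and the library export gives a track length in seconds. The sample rows, the lengths, and the function name are hypothetical placeholders, not the searches actually used for the dashboard:

```python
# Hypothetical sample data: listens would come from Last.fm, and track
# lengths from an iTunes library export. All values here are made up.
listens = [
    ("Young Fathers", "Border Girl"),
    ("Young Fathers", "Border Girl"),
    ("Hubert Kirchner", "Era"),
    ("Poolside", "Some Streamed-Only Track"),  # no length on file
]

# Track length in seconds, keyed by (artist, track). Hypothetical values.
library_lengths = {
    ("Young Fathers", "Border Girl"): 210,
    ("Hubert Kirchner", "Era"): 420,
}


def estimate_minutes(listens, library_lengths):
    """Sum the lengths of listens that match a library track.

    Returns (minutes, matched_listens, coverage), where coverage is the
    share of all listens that had a known track length.
    """
    matched = [library_lengths[key] for key in listens if key in library_lengths]
    minutes = sum(matched) / 60
    coverage = len(matched) / len(listens) if listens else 0.0
    return minutes, len(matched), coverage


minutes, matched_listens, coverage = estimate_minutes(listens, library_lengths)
print(f"{minutes:.1f} minutes across {matched_listens} of {len(listens)} listens ({coverage:.0%})")
```

With the figures in the post, 3,878 matched listens out of 10,301 is roughly 38% coverage, which is why the 21,577 minutes is a floor rather than a total.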

I’m similarly lacking data allowing me to determine my top genre of the year, but Indie is a pretty reliable genre for my taste. 

Other Insights from 2018

I was able to calculate my Top 10 artists, songs, and albums of the year, and drill down on the top 10 artists to see additional data about them (if it existed) in my iTunes library, like other tracks, the date it was added, as well as the kind of file (helping me identify if it was purchased or not), and the length of the track.

Screen capture displaying top 10 artists, top 10 songs, top 10 albums of the year, with the artist Hubert Kirchner selected in the top 10 song list, with additional metadata about songs by Hubert Kirchner listed in a table below the top 10 lists, showing 3 songs by Hubert Kirchner along with the album, genre, rating, date_added, Kind, and track_length for the songs. Other highlights described in text.

There are quite a few common threads across the top 10 artists, songs, and albums, with Poolside, Young Fathers, Gilligan Moss, The Vaccines, and Justice making consistent appearances. The top 10 songs display obsessions with particular songs that outweigh an aggregate popularity for the entire album, leaving room for other albums to top the album list.

Interestingly, the Polo & Pan album makes my top 10 albums while they don’t make it to my top 10 artist or song lists. This is also true for the album Dancehall by The Blaze. I’m not much of an album listener usually, but I know I listened to those albums several times.

The top 10 song list is more dominated by specific songs that caught my attention, and the top 10 artists neatly reflect both lists. The artists that have a bit more of a back catalog also reveal themselves, given that Born Ruffians managed to crack the top 10 despite not having any songs or albums make the top 10 lists, and Hey Rosetta! makes the top artist and album lists, despite having no top songs.

Screen capture that says Songs Purchased in 2018. 285 songs.

I purchased 285 songs this year, an increase of 157 compared to the year before. I think I just bought songs more quickly after first hearing them this year, and there are even some songs missing from this list that I bought on Beatport or Bandcamp because they weren’t available in the iTunes Store. While I caved in to Spotify premium this year, I still kept up an old promise to myself to buy music (rather than acquire it without paying for it, from a library or questionable download mechanisms) now that I can afford it. 

A Year of Concerts

Screen capture of 4 single value data points, followed by 2 bar charts. Single value data points are total spent on concerts attended in 2018 ($1835.04), total concerts in 2018 (48), artists seen in concert in 2018 (116 artists), and total spent on concert tickets in 2018 ($2109). The first bar chart shows the number of concerts attended per month, 2 in January, 3 in February, 2 in March, 6 in April, 4 in May, 2 in June, 3 in July, 8 in August, 4 in September, 6 in October, 5 in November, and 3 so far in December. The last bar chart is the number of artists seen by month: 5 in Jan, 10 in Feb, 3 in March, 14 in April, 8 in May, 3 in June, 8 in July, 18 in August, 9 in Sep, 22 in Oct, 10 in Nov, 6 in December.

I’ve been to a lot of concerts so far this year. 48, to be exact. I spent a lot of money on concert tickets, both for the shows I attended this year and for shows that went on sale during 2018 (but at this point, might be happening in 2019). I often will buy tickets for multiple people, so this number isn’t very precise for my own personal ticket usage.

I managed to go to at least 2 concerts every month. By the time the year is over, I’m on track to go to 51 different shows. Based on the statistics, there are some months where I went to many more than 1 show per week, and others where I didn’t. Especially apparent are the months with festivals—February, August, and October all included festivals that I attended. 
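The monthly claims can be checked against the counts described in the chart above. This is a small sketch, not the original dashboard search; treating "more than one show per week" as more than 52/12 shows in a month is an assumption made here for illustration:

```python
# Concerts attended per month in 2018, as listed in the chart description.
concerts_per_month = {
    "Jan": 2, "Feb": 3, "Mar": 2, "Apr": 6, "May": 4, "Jun": 2,
    "Jul": 3, "Aug": 8, "Sep": 4, "Oct": 6, "Nov": 5, "Dec": 3,
}

# Assumption: an average month has 52 / 12 (about 4.33) weeks.
WEEKS_PER_MONTH = 52 / 12

total_concerts = sum(concerts_per_month.values())
quietest_month_count = min(concerts_per_month.values())
busy_months = [
    month for month, count in concerts_per_month.items()
    if count > WEEKS_PER_MONTH
]

print(total_concerts, quietest_month_count, busy_months)
```

Under that threshold, April, August, October, and November come out as the months with more than one show per week.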

Many of those festivals brought me to new-to-me locations, with the Noise Pop Block Party and Golden Gate Park giving me new perspectives on familiar places, and Lollapalooza after shows bringing me out to Schubas Tavern for the first time in Chicago.  

Screen capture listing venues visited for the first time in 2018, with venue, city, state, and date listed. Notable ones mentioned in text, full list of venue names: Audio, The New Parish, San Francisco Belle, Schubas Tavern, Golden Gate Park, August Hall, Noise Pop Block Party, Bergerac, Great American Music Hall, Cafe du Nord, Swedish American Hall.

If you’re reading this wondering what San Francisco Belle is, it’s a boat. That’s one of several new venues that electronic music brought me to—DJ sets on that boat as part of Goldroom and Gigamesh’s tour, plus a day party in Bergerac and a nighttime set at Audio other times throughout the year.

Some of those new venue locations brought newly-discovered music to me as well.

Screen capture showing top 20 artists discovered in 2018, sorted by count of listens, featuring a sparkline to show how frequently I listened to the artist throughout the year, and a first_discovered date. List: Gilligan Moss, The Blaze, Polo & Pan, Hubert Kirchner, Keita Sano, Jude Woodhead, Ben Böhmer, Karizma, Luxxury, SuperParka, Chris Malinchak, Mumford & Sons and Baaba Maal, Jon Hopkins, Yon Yonson,  Brandyn Burnette and dwilly, Asgeir, The Heritage Orchestra Jules Buckley and Pete Tong, Confidence Man, Bomba Estereo, and Jenn Champion.

The 20th-most-popular artist I discovered this year was Jenn Champion, who opened for We Were Promised Jetpacks at their show at the Great American Music Hall. I started writing this assuming that I hadn’t heard Jenn Champion before that night, but apparently I first discovered them on July 9, and the show wasn’t until October 9.

As it turns out, I listened to what is now my favorite song by Jenn Champion that day in July, likely as part of a Spotify algorithm-driven playlist (judging by the listening neighbors around the same time) but it didn’t stick until I saw them play live months later. The vagaries of playlists that refresh once a week can mean fleeting discoveries that you don’t really absorb.

Screen capture showing Splunk search results of artist, track_name, and time from July 9th. Songs near Jenn Champion's song in time include Mcbaise - Le Paradis Du Cuir, Wolf Alice - Don't Delete the Kisses (Tourist Remix) and Champyons - Roaming in Paris.
Other songs I listened to that day in July
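Looking up "listening neighbors" amounts to taking one listen as an anchor and keeping everything played within a window around it. A rough sketch in Python, assuming each listen is a record with a timestamp; the timestamps, the 30-minute window, and the placeholder track names are hypothetical, and only the neighboring artists and tracks come from the screenshot description:

```python
from datetime import datetime, timedelta

# Hypothetical timestamps; artist and track names for the neighbors are
# taken from the screenshot description above.
listens = [
    {"time": datetime(2018, 7, 9, 14, 2), "artist": "Mcbaise", "track": "Le Paradis Du Cuir"},
    {"time": datetime(2018, 7, 9, 14, 6), "artist": "Jenn Champion", "track": "(track not named in the post)"},
    {"time": datetime(2018, 7, 9, 14, 10), "artist": "Wolf Alice", "track": "Don't Delete the Kisses (Tourist Remix)"},
    {"time": datetime(2018, 7, 9, 14, 14), "artist": "Champyons", "track": "Roaming in Paris"},
    {"time": datetime(2018, 7, 9, 20, 30), "artist": "Poolside", "track": "(unrelated later listen)"},
]


def neighbors(listens, anchor, window=timedelta(minutes=30)):
    """Return listens within `window` of the anchor listen, oldest first."""
    nearby = [
        listen for listen in listens
        if listen is not anchor and abs(listen["time"] - anchor["time"]) <= window
    ]
    return sorted(nearby, key=lambda listen: listen["time"])


# Use the Jenn Champion listen as the anchor.
anchor = next(listen for listen in listens if listen["artist"] == "Jenn Champion")
nearby = neighbors(listens, anchor)
for listen in nearby:
    print(listen["time"].strftime("%H:%M"), listen["artist"], "-", listen["track"])
```

Widening or narrowing the window is a judgment call: a short window approximates one playlist session, while a long one starts pulling in unrelated listening from later in the day.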

Because of how I can search for things in Splunk, I was also curious to see what other songs I heard when I first discovered Hubert Kirchner, a great house artist.

Songs listened to around the same time as I first heard Hubert Kirchner's song Era.... I listened to Dion's song Dream Lover, Deradoorian's song You Carry the Dead (Hidden Cat Remix) followed by Hubert Kirchner, then listened to Miguel's song Sure Thing, How to Dress Well with What You Wanted, then listened to Rihanna, Love on the Brain, Selena Gomez with Bad Liar, and Descendents with I'm the One. I have no idea how I got into this mix of songs.

I really have no idea what playlist I was listening to that might have led to me making jumps from Sofi Tukker, to Tanlines, to Dion, to Deradoorian, then to Hubert Kirchner, Miguel, How to Dress Well, Rihanna, Selena Gomez, and Descendents. Given that August 24th was a Friday, my best guess is that it was a Release Radar playlist, or perhaps an epic shuffle session.

Repeat of earlier screen capture showing top 20 artists discovered in 2018. Sorted by count of listens, featuring a sparkline to show how frequently I listened to the artist throughout the year, and a first_discovered date. List: Gilligan Moss, The Blaze, Polo & Pan, Hubert Kirchner, Keita Sano, Jude Woodhead, Ben Böhmer, Karizma, Luxxury, SuperParka, Chris Malinchak, Mumford & Sons and Baaba Maal, Jon Hopkins, Yon Yonson,  Brandyn Burnette and dwilly, Asgeir, The Heritage Orchestra Jules Buckley and Pete Tong, Confidence Man, Bomba Estereo, and Jenn Champion

For the top 20 bands I discovered in 2018, many of them I started listening to on Spotify, but not necessarily because of Spotify. Gilligan Moss was a discovery from a collaborative playlist shared with those that are also in a Facebook group about concert-going. I later saw them at one of the festivals I went to this year, and it even turned out that a friend knew one of the band members! Their status as my most-listened-to discovery of this year is very accurate.

Polo & Pan was a discovery from a friend, fully brought to life with a playlist built by Polo & Pan themselves and shared on Spotify. I spent some quality time sitting in a park listening to that playlist and just enjoying life. They were at the same festival as Gilligan Moss, playing the same day, making that day a standout of my concerts this year.

Karizma was a discovery from Jamie xx’s set at Outside Lands. I tracked down the song from the set with the help of several other people on the internet (not necessarily anyone I knew), and then it turned out the song wasn’t even on Spotify. (Spotify did, however, help me discover more of the artist’s back catalog, like my other favorite song, ‘Nuffin Else.) Apparently I was far behind the curve hearing the song from the set, since it came out in 2017 and was featured in a Chromebook ad, but Work It Out still made me lose my mind at that set. (For the record, so did Take Me Higher, a song I did not manage to track down at all, and I have so much thanks for the person that messaged me on Facebook ages later to send me the link!)

Similarly, Luxxury was a DJ I first spotted on a cruise that I went on because it featured other DJs I had heard of from college, Goldroom and Gigamesh, whom I’d discovered through remixes of songs I downloaded from mp3 blogs like The Burning Ear.

~ Finding Meaning in the Platforms ~

Many of these discoveries were deepened by Spotify, or had Spotify as a vector—through a collaborative playlist, algorithmically-generated one, or the quick back-catalog access for a new artist—but don’t rely on Spotify as a platform. I prefer to keep my music listening habits platform-adjacent. 

Spotify, SoundCloud, iTunes, Beatport and other music platforms I use help make my music experiences possible. But the artists making the music, performing live in venues that I have the privilege to live near and afford to visit, they are creating what keeps my mind alive and energized.

The social platforms, too, mediate the music-related experiences I’ve had, whether it’s with the people I share music and concert experiences with in a Facebook group, the people I exchange tracks and banter with in Slack channels, or those of you reading this on yet another platform.

I like to listen to music that moves me, physically, or that arrests my mind and takes me somewhere. More now than ever I realize that musical enjoyment for me is an intense instantiation of the continuous tension-and-release pattern that exists in so many human art forms. The waves of neatness that clash and collide in a house music track, or the soaring crescendos of harmonies. 

It’s become clear to me over the years that I can’t separate my enjoyment of music from the platforms that bring me closer to it. Perhaps supporting the platforms in addition to the musical artists, performers, and venues, is just another element of contributing to a thriving music scene.

Politeness in Virtual Assistant Design

The wave of chatbots and virtual assistants like Cortana, Siri, and Alexa means that we’re engaging in conversations with non-humans more than ever before. Problem is, those non-human conversations can turn inhuman when it comes to social norms.

Interactions with virtual assistants aren’t totally devoid of human interaction. Indeed, they often disguise a true human interaction. Many chatbots aren’t fully automated and rely on humans to pick up the slack from the code. More fully-constructed virtual assistants like you find in Amazon’s Echo or your Apple iPhone are carefully programmed by humans. The programming choices they make also define your interactions with the personalities—and these interactions can redefine how you treat people.

A clear indication that someone is truly polite and kind is treating service people with respect, patience, and kindness. The rise of chatbots and virtual assistants, however, means that you’re never quite sure whether you’re speaking to a human. You might think that people can easily tell the difference between interacting with humans and interacting with a voice inside a smart box, but as the technology behind virtual assistants like Google Assistant and Amazon Alexa, and behind the systems used by call centers, evolves, that will get harder to evaluate. (Even when you’re calling a call center, it can be hard to tell whether you’ve reached a well-programmed intake bot or a real person who’s fully in the groove of their phone voice.)

I find it fascinating (and saddening) that the programmers of Google Assistant’s Duplex chose to program in “umms” and “mmhmms” and did not program in any kindness indicators. Instead the voices come across as impatient and slightly condescending. I listened to the sample clips linked by Ethan Marcotte in his post Kumiho, about Google Duplex. If virtual assistants don’t include programmed kindness, the emotional labor performed by service workers will continue to be too high. 

Programming to add kindness from virtual assistants is important, but so too is programming virtual assistants to expect kindness. We’re starting to be conditioned to treat chatbots as recipients for code-like commands, requiring a specific set of inputs, and those inputs do not acknowledge politeness.

It may seem overly-prescriptive, but in the same way that parents withhold items from their children until they “ask for it nicely”, it might be practical to include a “politeness mode” in virtual assistants. Hunter Walk wrote about how Amazon Alexa interactions are affecting his child, and Ben Hammersley blogged about the fact that there is no reward for politeness when he interacts with Amazon Alexa:

But there’s the rub. Alexa doesn’t acknowledge my thanks. There’s no banter, no trill of mutual appreciation, no silly little, “it is you who must be thanked” line. She just sits there sullenly, silently, ignoring my pleasantries.

And this is starting to feel weird, and makes me wonder if there’s an uncanny valley for politeness. Not one based on listening comprehension, or natural language parsing, but one based on the little rituals of social interaction. If I ask a person, say, what the weather is going to be, and they answer, I thank them, and they reply back to that thanks, and we part happy. If I ask Alexa what the weather is, and thank her, she ignores my thanks. I feel, insanely but even so, snubbed. Or worse, that I’ve snubbed her.

It’s the computing equivalent of being rude to waitresses. We shouldn’t allow it, and certainly not by lack of design. Worries about toddler screen time are nothing, compared to future worries about not inadvertently teaching your child to be rude to robots.

As virtual assistants become more common in day-to-day interactions, if they do not account for politeness, we might become a less kind society. Not only that, but impolite virtual assistants will add to the emotional labor performed by the service workers that don’t find their jobs replaced by technology.

Reflecting on a decade of (quantified) music listening

I recently crossed the 10 year mark of using Last.fm to track what I listen to.

From the first tape I owned (Train’s Drops of Jupiter) to the first CD (Cat Stevens Classics) to the first album I discovered by roaming the stacks at the public library (The Most Serene Republic Underwater Cinematographer) to the college radio station that shaped my adolescent music taste (WONC) to the college radio station that shaped my college experience (WESN), to the shift from tapes, to CDs, (and a radio walkman all the while), to the radio in my car, to SoundCloud and MP3 music blogs, to Grooveshark and later Spotify, with Windows Media Player and later an iTunes music library keeping me company throughout…. It’s been quite a journey.

Some, but not all, of that journey has been captured while using the service Last.fm for the last 10 years. Last.fm “scrobbles” what you listen to as you listen to it, keeping a record of your listening habits and behaviors. I decided to add all this data to Splunk, along with my iTunes library and a list of concerts I’ve attended over the years, to quantify my music listening, acquisition, and attendance habits. Let’s go.

What am I doing?

Before I get any data in, I have to know what questions I’m trying to answer, otherwise I won’t get the right data into Splunk (my data analysis system of choice, because I work there). Even if I get the right data into Splunk, I have to make sure that the right fields are there to do the analysis that I wanted. This helped me prioritize certain scripts over others to retrieve and clean my data (because I can’t code well enough to write my own).

I also made a list of the questions that I wanted to answer with my data, and coded the questions according to the types of data that I would need to answer the questions. Things like:

  • What percentage of the songs in iTunes have I listened to?
  • What is my artist distribution over time? Do I listen to more artists now? Different ones overall?
  • What is my listen count over time?
  • What genres are my favorite?
  • How have my top 10 artists shifted year over year?
  • How do my listening habits shift around a concert? Do I listen to that artist more, or not at all?
  • What songs did I listen to a lot a few years ago, but not since?
  • What personal one hit wonders do I have, where I listen to one song by an artist way more than any other of their songs?
  • What songs do I listen to that are in Spotify but not in iTunes (that I should buy, perhaps)?
  • How many listens does each service have? Do I have a service bias?
  • How many songs are in multiple services, implying that I’ve probably bought them?
  • What’s the lag between the date a song or album was released and my first listen?
  • What geographic locations are my favorite artists from?

As the list goes on, the questions get more complex and require an increasing number of data sources. So I prioritized what was simplest to start, and started getting data in.
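To illustrate how one of the questions above could be turned into a computation, here is a hedged sketch of the "personal one hit wonder" question. The thresholds (at least 10 listens, and at least 3 times the runner-up) and the sample data are assumptions chosen for illustration, not values from the actual analysis:

```python
from collections import Counter, defaultdict


def one_hit_wonders(listens, min_listens=10, ratio=3.0):
    """Find artists whose top track dominates their other tracks.

    `listens` is an iterable of (artist, track) pairs. An artist
    qualifies when their most-played track has at least `min_listens`
    plays and at least `ratio` times the plays of the runner-up track.
    Both thresholds are arbitrary assumptions.
    """
    per_artist = defaultdict(Counter)
    for artist, track in listens:
        per_artist[artist][track] += 1

    wonders = {}
    for artist, counts in per_artist.items():
        ranked = counts.most_common(2)
        top_track, top_count = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if top_count >= min_listens and top_count >= ratio * max(runner_up, 1):
            wonders[artist] = top_track
    return wonders


# Hypothetical listening history.
sample = (
    [("Artist A", "The Hit")] * 12 + [("Artist A", "Deep Cut")] * 2
    + [("Artist B", "One")] * 6 + [("Artist B", "Two")] * 5
    + [("Artist C", "X")] * 12 + [("Artist C", "Y")] * 10
)
wonders = one_hit_wonders(sample)
print(wonders)
```

Only Artist A qualifies here: Artist B never reaches the minimum listen count, and Artist C has two tracks that are nearly tied.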

 

Getting data in…

I knew I wanted as much music data as I could get into the system. However, SoundCloud isn’t providing developer API keys at the moment, and Spotify requires authentication, which is a little bit beyond my skills at the moment. MusicBrainz also has a lot of great data, but has intense rate-limiting so I knew I’d want a strategy to approach that metadata-gathering data source. I was left with three initial data sources: my iTunes library, my own list of concerts I’ve gone to, and my Last.fm account data.

Last.fm provides an endpoint that allows you to get the recent tracks played by a user, which was exactly what I wanted to analyze. I started by building an add-on for Last.fm with the Splunk Add-on Builder to call this REST endpoint. It was hard. When I first tried to do this a year and a half ago, the add-on builder didn’t yet support checkpointing, so I could only pull in data if I was actively listening and Splunk was on. Because I had installed Splunk on a laptop rather than a server in ~ the cloud ~, I was pretty limited in the data I could pull in. I pretty much abandoned the process until checkpointing was supported.

After the add-on builder started supporting checkpointing, I set it up again, but ran into issues. Everything from forgetting to specify the from date in my REST call to JSON path decision-making that meant I was limited in the number of results I could pull back at a time. I deleted the data from the add-on sourcetype many times, triple-checking the results each time before continuing.
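For readers who want to see the moving parts outside of the Add-on Builder, the sketch below shows the same ideas in plain Python: a request to Last.fm's user.getRecentTracks method with a from timestamp, a parser for the JSON path to the tracks, and a checkpoint so the next run only asks for newer listens. It is an illustration under stated assumptions, not the add-on's code. The function names are hypothetical, the API key and user are placeholders, and the response shape is as I understand the Last.fm JSON format (tracks under recenttracks.track, timestamps under date.uts, and the currently playing track carrying no date).

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def build_url(user, api_key, from_uts, page=1, limit=200):
    """Build a user.getRecentTracks request that only asks for listens
    after `from_uts` (a UNIX timestamp). 200 is the largest page size
    the API documents, to my knowledge."""
    params = {
        "method": "user.getrecenttracks",
        "user": user,
        "api_key": api_key,
        "format": "json",
        "from": from_uts,
        "page": page,
        "limit": limit,
    }
    return API_ROOT + "?" + urllib.parse.urlencode(params)


def parse_page(payload):
    """Flatten one page of results into simple event dicts."""
    tracks = payload["recenttracks"]["track"]
    if isinstance(tracks, dict):  # a single result may arrive as one object
        tracks = [tracks]
    events = []
    for track in tracks:
        if "date" not in track:  # "now playing" entry, not scrobbled yet
            continue
        events.append({
            "uts": int(track["date"]["uts"]),
            "artist": track["artist"]["#text"],
            "track": track["name"],
        })
    return events


def fetch_since(user, api_key, checkpoint_uts):
    """Fetch every page newer than the checkpoint (makes network calls).

    Returns the events and the new checkpoint to store for the next run.
    """
    events, page = [], 1
    while True:
        url = build_url(user, api_key, checkpoint_uts + 1, page)
        with urllib.request.urlopen(url) as response:
            payload = json.load(response)
        events.extend(parse_page(payload))
        total_pages = int(payload["recenttracks"]["@attr"]["totalPages"])
        if page >= total_pages:
            break
        page += 1
    new_checkpoint = max((e["uts"] for e in events), default=checkpoint_uts)
    return events, new_checkpoint


# Offline demonstration with a hand-written payload (hypothetical values).
sample_payload = {
    "recenttracks": {
        "track": [
            {"artist": {"#text": "VHS Collection"}, "name": "Waiting on the Summer",
             "@attr": {"nowplaying": "true"}},
            {"artist": {"#text": "Poolside"}, "name": "Example Track",
             "date": {"uts": "1545000000"}},
        ],
        "@attr": {"page": "1", "totalPages": "1"},
    }
}
events = parse_page(sample_payload)
example_url = build_url("example_user", "YOUR_API_KEY", from_uts=1545000000)
print(events)
```

Storing the latest uts value after each run is what makes it safe to stop and restart collection: forgetting the from parameter means every run starts again from the most recent page instead of from where the last run ended.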

I used a Python script (thanks Reddit) to pull my historical data from Last.fm to add to Splunk, and to fill the gap between this initial backfill and the time it took me to get the add-on working, I used an npm module. When you don’t know how to code, you’re at the mercy of the tools other people have developed. Adding the backfill data to Splunk also meant I had to adjust the MAX_DAYS_AGO default in props.conf, because Splunk doesn’t necessarily expect data from 10+ years ago by default. 2 scripts in 2 languages and 1 add-on builder later, I had a working solution and my Last.fm data in Splunk.
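The arithmetic behind the MAX_DAYS_AGO change is simple enough to sketch. The default of 2000 days is what I understand the props.conf documentation to state, so treat it as an assumption and check the spec for your Splunk version. The dates, the safety margin, and the sourcetype name below are all hypothetical:

```python
from datetime import date

# Assumed default from the props.conf spec (about 5.5 years); verify it
# against the documentation for your Splunk version.
DEFAULT_MAX_DAYS_AGO = 2000

# Hypothetical sourcetype name for the backfilled Last.fm data.
SOURCETYPE = "lastfm:backfill"


def days_ago(event_day, today):
    """Age of an event in whole days, relative to the indexing date."""
    return (today - event_day).days


def needed_max_days_ago(oldest_event, today, margin_days=30):
    """Smallest setting that still accepts the oldest event, plus a margin."""
    return days_ago(oldest_event, today) + margin_days


oldest_listen = date(2007, 12, 1)  # hypothetical first scrobble
index_day = date(2018, 1, 1)       # hypothetical day of the backfill

age = days_ago(oldest_listen, index_day)
needed = needed_max_days_ago(oldest_listen, index_day)
stanza = f"[{SOURCETYPE}]\nMAX_DAYS_AGO = {needed}\n"

print(f"Oldest event is {age} days old; default allows {DEFAULT_MAX_DAYS_AGO}.")
print(stanza)
```

Ten years of history is well past the assumed default, which is why timestamps from the oldest listens would not be trusted without raising the setting for that sourcetype.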

To get the iTunes data in, I used an iTunes to CSV script on GitHub (thanks StackExchange) to convert the library.xml file into CSV. This worked great, but again, it was in a language I don’t know (Ruby) and so I was at the mercy of a kind developer posting scripts on GitHub again. I was limited to whatever fields their script supported. This again only did backfill.

I’m still trying to sort out the regex and determine if it’s possible to parse the iTunes Library.xml file in its entirety and add it to Splunk without too much of a headache, and/or get it set up so that I can ad-hoc add new songs added to the library to Splunk without converting the entries some other way. Work in progress, but I’m pretty close to getting that working thanks to help from some regex gurus in the Splunk community.
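As an alternative to regex, the library file is an XML property list, so a plist parser can read it directly. The sketch below uses Python's standard plistlib to turn the Tracks section into CSV rows. It is not the Ruby script mentioned above and not a Splunk-side solution. The field list is a selection of keys I believe iTunes writes (Total Time is in milliseconds), and the sample library is a hypothetical stand-in built in memory so the example runs without a real file:

```python
import csv
import io
import plistlib
from datetime import datetime

# Keys to export; a selection of the per-track keys iTunes writes.
FIELDS = ["Name", "Artist", "Album", "Genre", "Kind",
          "Date Added", "Total Time", "Play Count"]


def library_to_rows(plist_bytes):
    """Parse library XML (as bytes) and return one dict per track."""
    library = plistlib.loads(plist_bytes)
    rows = []
    for track in library.get("Tracks", {}).values():
        row = {field: track.get(field, "") for field in FIELDS}
        if isinstance(row["Date Added"], datetime):
            row["Date Added"] = row["Date Added"].isoformat()
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical one-track library, generated here instead of read from disk.
sample_library = plistlib.dumps({
    "Tracks": {
        "101": {
            "Name": "Era",
            "Artist": "Hubert Kirchner",
            "Kind": "Purchased AAC audio file",
            "Total Time": 420000,  # milliseconds, made-up value
            "Date Added": datetime(2018, 8, 25, 12, 0, 0),
        }
    }
})

rows = library_to_rows(sample_library)
csv_text = rows_to_csv(rows)
print(csv_text)
```

For a real library, the bytes would come from opening the Library.xml file in binary mode. Missing keys, such as Play Count for a never-played song, are written as empty cells here rather than zero, which is a choice worth revisiting depending on the analysis.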

For the concert data, I added the data I had into the Lookup File Editor app and was up and running. Because of some column header choices I made for how to organize my data, and the fact that I chose to maintain a lookup rather than add the information as events, I was up for some more adventures in search, but this data format made it easy to add new concerts as I attend them.

Answer these questions…with data!

I built a lot of dashboard panels. I wanted to answer the questions I mentioned earlier, along with some others. I was spurred on by my brother recommending a song to me to listen to. I was pretty sure I’d heard the song before, and decided to use data to verify it.

Screen image of a chart showing the earliest listens of tracks by the band VHS collection.

I’d first heard the song he recommended to me, Waiting on the Summer, in March. Hipster credibility: intact. Having this dashboard panel now lets me answer the questions “when was the first time I listened to an artist, and which songs did I hear first?”. I added a second panel later, to compare the earliest listens with the play counts of songs by the artist. Maybe the first song I’d heard by an artist was the most listened song, but often not.
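The logic behind those two panels can be sketched as a single pass over the listens: for each track by the artist, keep the earliest listen and a running play count. The years, dates, and the second track name below are hypothetical, and the function name is illustrative rather than anything from the dashboard:

```python
from datetime import date

# Hypothetical listens as (day, artist, track). Only the artist and
# "Waiting on the Summer" (first heard in March) come from the post.
listens = [
    (date(2017, 3, 14), "VHS Collection", "Waiting on the Summer"),
    (date(2017, 3, 20), "VHS Collection", "Waiting on the Summer"),
    (date(2017, 6, 2), "VHS Collection", "Hypothetical Second Track"),
    (date(2017, 6, 3), "VHS Collection", "Hypothetical Second Track"),
    (date(2017, 6, 9), "VHS Collection", "Hypothetical Second Track"),
    (date(2017, 5, 1), "Poolside", "Hypothetical Other Track"),
]


def first_listens_and_counts(listens, artist):
    """Return [(track, (first_listen, play_count)), ...] for one artist,
    ordered by when each track was first heard."""
    stats = {}
    for day, listen_artist, track in listens:
        if listen_artist != artist:
            continue
        first, count = stats.get(track, (day, 0))
        stats[track] = (min(first, day), count + 1)
    return sorted(stats.items(), key=lambda item: item[1][0])


result = first_listens_and_counts(listens, "VHS Collection")
for track, (first, count) in result:
    print(f"{first}  {track}  ({count} plays)")
```

In this made-up data the first track heard is not the most played one, which is the kind of gap the second panel makes visible.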

Another question I wanted to answer was “how many concerts have I been to, and what’s the distribution in my concert attendance?”

Screen image showing concerts attended over time, with peaks in 2010 and 2017.

It’s pretty fun to look at this chart. I went to a few concerts while I was in high school, but never more than one a month and rarely more than a few per year. The pace picked up while I was in college, especially while I was dating someone that liked going to concerts. A slowdown as I studied abroad and finished college, then it picks up for a year as I get settled in a new town. But after I get settled in a long-term relationship, my concert attendance drops off, to where I’m going to fewer shows than I did in high school. As soon as I’m single again, that shifts dramatically and now I’m going to one or more shows a month. The personal stories and patterns revealed by the data are the fun part for me.

I answered some more questions, especially those that could be answered by fun graphs, such as: which states have I attended the most concerts in?

Screen image of a map of the contiguous United States, with Illinois highlighted in dark blue, indicating 40+ concerts attended in that state, California highlighted in a paler blue indicating 20ish shows attended there, followed by Michigan in paler blue, and finally Ohio, Wisconsin, and Missouri in very pale blue. The rest of the states are white, indicating no shows attended in those states.

It’s easy to tell where I’ve spent most of my life living so far, but again the personal details tell a bigger story. I spent more time in Michigan than I have lived in California so far, but I’ve spent more time single in California so far, thus attending more concerts.

Speaking of California, I also wanted to see what my most-listened-to songs were since moving to California. I used a trellis visualization to split the songs by artist, allowing me to identify artists that were more popular with me than others.

Screen image showing a "trellis" visualization of top songs since moving to California. Notable songs are Carly Rae Jepsen "Run Away With Me" and Ariana Grande "Into You" and CHVRCHES with their songs High Enough to Carry You Over and Clearest Blue and Leave a Trace.

I really liked the CHVRCHES album Every Open Eye, so I have three songs from that album. I also spent some time with a four song playlist featuring Adele’s song Send My Love (To Your New Lover), Ariana Grande’s Into You, Carly Rae Jepsen’s Run Away With Me, and Ingrid Michaelson’s song Hell No. Somehow two breakup songs and two love songs were the perfect juxtaposition for a great playlist. I liked it enough to where all four songs are in this list (though only half of it is visible in this screenshot). That’s another secret behind the data.

I also wanted to do some more analytics on my concert data, and decided to figure out what my favorite venues were. I had some guesses, but wanted to see what the data said.

Screen image of most visited concert venues, with The Metro in Chicago taking the top spot with 6 visits, followed by First Midwest Bank Amphitheatre (5 visits), Fox Theater, Mezzanine, Regency Ballroom, The Greek Theatre, and The Independent with 3 visits each.

The Metro is my favorite venue in Chicago, so it’s no surprise that it came in first in the rankings (I also later corrected the data to make it its proper name, “Metro”, so that I could drill down from the panel to a Google Maps search for the venue). First Midwest Bank Amphitheatre hosted Warped Tour, which I attended (apparently) 5 times over the years. Since moving to California it seems like I don’t have a favorite venue based on visits alone, but it’s really The Independent, followed by Bill Graham Civic Auditorium, which doesn’t even make this list. Number of visits doesn’t automatically equate to favorite.

But what does it MEAN?

I could do data analysis like that all day. But what else do I learn by just looking at the data itself?

I can tell that Last.fm didn’t handle the shift to mobile and portable devices very well. It thrives when all of your listening happens on your laptop, and it can grab the scrobbles from your iPod or other device when you plug it into your computer. But as soon as internet-connected devices got popular (and I started using them), listens scrobbled overall dropped. In addition to devices, the rise of streaming music on sites like Grooveshark and SoundCloud, replacing the MediaFire-hosted and MegaUpload-hosted free music shared on music blogs, also meant trouble for my data integrity. Last.fm didn’t handle listens on the web then, and only handles them through a fragile extension now.

Two graphs depicting distinct song listens and distinct artist listens, respectively, with a peak and steady listens through 2008-2012, then it drops down to a trough in 2014 before coming up to half the amount of 2010 and rising slightly.

Distinct songs and artists listened to in Last.fm data.

But that’s not the whole story. I also got a job and started working in an environment where I couldn’t listen to music at work, so wasn’t listening to music there, and also wasn’t listening to music at home much either due to other circumstances. Given that the count plummets to near-zero, it’s possible there were also data issues at play. It’s imperfect, but still fascinating.

What else did I learn?

Screen image showing 5 dashboard panels. Clockwise, the upper left shows a trending indicator of concerts attended per month, displaying 1 for the month of December and a net decrease of 4 from the previous month. The next shows the overall number of concerts attended, 87 shows. The next shows the number of iTunes library songs with no listens: 4272. The second to last shows a pie chart showing that nearly 30% of the songs have 0 listens, 23% have 1 listen, and the rest are a variety of listen counts. The last indicator shows the total number of songs in my iTunes library, or 16202.

I have a lot of songs in my iTunes library. I haven’t listened to nearly 30% of them. I’ve listened to nearly 25% of them only once. That’s the majority of my music library. If I split that by rating, however, it would get a lot more interesting. Soon.
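The shares in that pie chart came out of Splunk, but the same calculation can be sketched in Python. This is a minimal sketch, not the search I actually ran: it assumes the iTunes Library.xml export is a property list with a top-level "Tracks" dictionary, and that a song with no listens simply has no "Play Count" key. The function names and the four-song sample library are hypothetical.

```python
import plistlib
from collections import Counter


def load_tracks(path):
    """Load the "Tracks" dictionary from a Library.xml file.

    Assumes the file is a property list with a top-level "Tracks" key.
    """
    with open(path, "rb") as f:
        return plistlib.load(f)["Tracks"]


def play_count_shares(tracks):
    """Return {play_count: share_of_library} for a dict of track records.

    A track without a "Play Count" key is treated as having 0 listens.
    """
    counts = Counter(track.get("Play Count", 0) for track in tracks.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {plays: n / total for plays, n in sorted(counts.items())}


# Hypothetical four-song library: two unplayed songs, one played once,
# one played five times.
sample = {
    "1": {"Name": "Song A"},
    "2": {"Name": "Song B", "Play Count": 1},
    "3": {"Name": "Song C", "Play Count": 5},
    "4": {"Name": "Song D"},
}

shares = play_count_shares(sample)
print(shares)  # {0: 0.5, 1: 0.25, 5: 0.25}
```

With a real export, something like play_count_shares(load_tracks("Library.xml")) would give the zero-listen and one-listen shares directly, provided the file matches the layout assumed above.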

You can’t see the fallout from my own personal Music-ocalypse in this data, because the Library.xml file doesn’t know which songs don’t point to actual files, or at least my version of it doesn’t. I’ll need higher-fidelity data to determine the “actual” size of my library, and to perform more analyses.

I need more data in general, and more patience, to perform the analyses that would answer my more complex questions, like how my listening habits for particular artists change around a concert. As it is, this is a really exciting start.

If you want more details about the actual Splunking I did for these analyses, I wrote a post for the official Splunk blog, published on January 4th: 10 Years of Listens: Analyzing My Music Data with Splunk.

Yoga Beta for Climbers

As a companion to Finding Myself on the Wall: sometimes what you need while climbing isn’t real beta or advice on what to do, but mental reinforcement. This beta can sound like the mantras that someone might give you in the midst of a yoga class. I call it yoga beta.

  • Do what feels right
  • Don’t forget to breathe
  • Don’t look, just feel
  • You are stronger than you think
  • Just let go 

 

Finding Myself on the Wall

How climbing teaches me to manage my fear and love myself.

Sometimes I find myself on the wall doing something I never thought possible: holding onto something that doesn’t seem to have a place to hold, or reaching something that looks out of reach. Other times it’s like I’m waking up to find myself trapped in what seems to be an inescapable spot: no holds above me, or nowhere to put my feet to push myself higher. In these cases, the problem is clear. The solution isn’t.

In climbing, the problem can be on the wall, or it can be with my confidence, or my fear. Being able to consistently test solutions, push through challenges, and conquer the problem is what makes climbing a perfect mental and physical outlet for me.

For me, climbing is all about managing fear and trusting myself. I have to manage my natural instincts of being afraid of heights and of falling. I also have to learn to trust my abilities and skills while respecting myself and my boundaries in order to avoid getting hurt or endangering myself or others.

In addition, the different types of climbing require different levels of this fear management and self-trust. I first learned top-rope climbing, and as I got better I got more comfortable. Then I learned bouldering, and got more comfortable there, so I learned how to lead climb. Throughout this process, I’ve built my physical strength and climbing technique, but also my self-confidence and my ability to manage fear.

  • Top-roping is the most comfortable form of climbing for me. I can see the rope, and I can sometimes see the anchor keeping the rope secure. I can feel the taut lack of slack in my rope, and lean back from the wall to test it. I can rest at any time as well, so there is time to slow down and take breaks. All of this physical security reinforces a psychological sense of security, which can help me do more challenging moves and climb higher than I might otherwise feel comfortable climbing.
  • Bouldering requires me to stomach my fear and muster my self-confidence to take me to the top of a wall, or over the top of a wall, without a rope. Bouldering routes are typically anywhere from 10-20 feet high in a gym, and in some incredible outdoor routes, 40 or more feet high. Without the physical security of a rope or an anchor, I have to know my physical and psychological strengths and limits before I start. This forces me to scope out the route before I start climbing, and prepare myself to jump or fall to the ground if I feel uncomfortable. Bouldering forces me to get used to this discomfort and either overcome it or recognize when it is valid and listen to it.
  • Lead climbing takes the height of top-roping and combines it with the mental aspects of bouldering. No longer do I have the visible anchor or a taut rope to help me feel safe—it’s just me and the wall. I’m conquering the problem while also taking all the necessary steps to keep myself safe: clip properly, climb safely around the rope, and rest when I can. There’s little to no room for fear.

Each type of climbing removes an element of physical security and further challenges my psychological security as I progress. In this way, I’ve been forced to progressively confront and challenge my limits at the same time that I learn to respect and recognize them.

The dangers of climbing are real. It’s an extreme sport. Though it doesn’t always feel dangerous in a gym, any time that you are high up in the air relying on humans and equipment, something can fail and you can die. It’s also easy to get injured due to bad technique: over-gripping holds, inadequately engaging muscles, straining hand muscles and tendons on hard-to-grip holds. If anything, these risks force me to prioritize muscle recovery and rest days, allowing me to recognize that just as physical self care is important, so too is psychological self care.

Despite these risks, climbing lets me get more in tune with myself than anything else that I’ve tried. It’s a wall of problems, but each one is recognizable and each one is solvable, and I can try them again and again. I can learn by watching someone else solve it, but I can’t solve it the same way because we have different skill sets, physical strength, and body types. I still have to solve the problem myself in my own way.

Climbing with other people has also been key to my mental strength. Climbing partners are vital to my safety, but also to my confidence level. They can encourage me to try new routes, and give me beta when I start to falter on a route. Beta, typically defined as information about a route, can also involve encouragement. Everything from the tactical “there’s a foothold by your right knee” to the encouraging “you can reach it!” to the calming “don’t look, just feel” is great beta that has helped me succeed. (I’ve named that last type Yoga Beta). Even so, sometimes the best beta is silence so that I can focus on the problem.

Climbing as a method for teaching myself that I can succeed and iterating my way through problem-solving helps me overcome my fear of failure. I’m learning to trust myself to get through each move, and find something to (physically, psychologically) support myself along the way. I have to trust myself, and the rock, every step of the way.

Sofar Sounds: So far from DIY

I attended my first Sofar Sounds show on Friday night. It was a great night, attending a show with a friend and making a few new friends among those we were sharing a couch with. Sofar Sounds hosts shows in secret locations that its community offers up: people’s houses, apartments, or even offices.

As someone who went to a lot of DIY shows in college, the show was a bit surreal. The Sofar show had all the trappings of a DIY show: a crowd of people who care about music, who’ve all traveled there to see a show, a show organized by someone they know or another friend knows. Except in this case, the someone they know is instead a company that they’ve paid money to, and they don’t know the artists or the promoters or even the location until the day of the show.

The DIY shows I’ve been to were characterized by familiar faces, familiar locations, but also hardworking dedicated musicians and music lovers doing the promotion, organization, and crowd-wrangling. Shows in living rooms, basements, kitchens, and garages. Realizing years later that you should’ve worn earplugs. A network of people that once you break into, you can start going to even more shows in more and more places over the years as people move in or away.

The crux of it being, of course, breaking into the network. How do you find a community of like-minded people who have the resources to host and promote house shows, and are doing just that? Sofar Sounds takes people who have the resources to host house shows and connects them with bands and a predefined, curated audience. Sofar shows are like the Lyft of DIY shows, and the commercialization feels somewhat awkward. Sofar Sounds takes the DIY show model and tries to “solve” it with a business model.

Emma Silvers covers that business model in depth in her article A New Guest at Your House Show: The Middleman for KQED Arts. No longer do the musicians and music lovers have to do the promotion, organization, and crowd-wrangling on their own. Instead, they operate as volunteer “ambassadors” for Sofar Sounds, but don’t get paid and still have to do crowd-wrangling. The musicians, on the other hand, might not get paid anything at all. The audience is largely formed of strangers, selected based on applications for tickets by Sofar Sounds.

Perhaps because of all this, the community at the Sofar show felt constructed. Our “ambassador” made me feel like I was alternately at a sporting event or at a team-building exercise with his efforts to pump up the crowd and get us to bond with each other at the same time. Thankfully no one tried to fist bump me. We were all united in our love of music and our willingness to obey the rules about when we were supposed to leave or when we were supposed to talk. Overall, it felt distinctly constructed, rather than a true community of repeated faces like the DIY experimental/punk scene I’d known before.

At the show, I met some people that I enjoyed talking to and would want to see again. Therein lies another problem with the constructed community—it’s building a network behind Sofar Sounds, not behind the bands themselves. I may never see the people I met at the show again, because the network and the community of people attending the show is so far removed from those promoting and organizing it.

Nevertheless, the DIY network has limited reach, and Sofar might help bands break out of that network. If your DIY shows are always performed for your friends and your family, how do you attract new people? Building an organic community takes time, dedication, commitment, but doesn’t exactly pay you quickly. Sofar might act as a kind of shortcut to getting your music in front of new people that might not otherwise stumble into your community. Even if it also doesn’t pay you quickly, and even if they hear your one show and you never see them again.

What does this business model mean for local bars that host music? How many of the artists actually make money from the $15 cover that we’re willing to commit to this ~experience~? Perhaps next time I’ll spend my money at Hotel Utah Saloon or another local venue without the secrecy or the middleman. Maybe the next Sofar show should take you by surprise by happening at an existing venue, with local bands that get a full cut of the ticket cost. Of course, then we’re back at venues full of people that ostensibly don’t care about the music but rather the night they’re trying to have despite it.

As another alternative to Sofar Sounds, Group Muse also operates on the more commercialized house show model, but hearkens back to an even earlier method of hosting house shows: the era of chamber music. Capitalizing on the goodwill of hosts, the shows happen in the same sorts of venues as Sofar Sounds shows, and feature classical chamber music instead of more mainstream singer-songwriter music. However, musicians are paid by the audience, so Group Muse operates more as a platform than as a true middleman.

I’ve similarly been to one Group Muse event, and found it a great exposure to a type of music I wouldn’t have sought out otherwise. And that seems to be the real fun behind the secret show atmosphere. Someone has brought you to a place, you have already paid, and you have pretty low expectations of what you might be listening to. The openness that makes any sort of house show a success is already there. So overall, I can’t really complain too much.

The music industry has been disrupted a lot by technological advances, but artists continue to not get paid enough for what they do for us. So go to secret shows. Find house shows, find DIY shows, go to Sofar shows, go to Group Muses, and check out the acts playing at your local bar. Go out, dance hard, and pay up. It’s worth it.

 

Data as a Gift: Implications for Product Design

The idea of data as a gift, and the act of sharing data as an exchange of a gift, has data ethics and privacy implications for product and service design.

Work published in the last year by Kadija Ferryman and Nick Seaver on data as a gift addressed this concept more broadly and brought it to my attention. Ferryman, in her piece Reframing Data as a Gift, took the angle of data sharing in the context of health data and open data policies. Seaver, in his piece Return of the Gift, approached it from the angle of the gift economy and big data. Both make great points that are relevant in the context of data collection and ethics, especially as it relates to data security and privacy more generally.

Ferryman introduces the concept brilliantly:

What happens when we think about data as a gift? Well, first, we move away from thinking about data in the usual way, as a thing, as a repository of information and begin to think of it as an action. Second, we see that there is an obligation to give back, or reciprocate when data is given. And third, we can imagine that giving a lot of data has the potential to create tension.

When you frame the information that we “voluntarily” share with services as a gift, the dynamics of the exchange shift. We can’t truly share data with digital services, because sharing implies that we retain ultimate ownership over the data. You can take back something after you share it with someone. But you can’t do that with your personal data. Because you can’t take back your data after you share it, you can more accurately conceptualize the exchange of data with digital services as a gift. Something you give, and which cannot be returned to you (at least not in its original form).

Data as a gift creates an expectation or obligation for a return, Seaver makes clear. Problem is, when we’re sharing data on the internet, we don’t always know exactly what we’re giving and what we’re getting.

The gift exchange might be based on the expectation that your data is used to provide the service to you. And the more data, the better the service (you might expect). For this reason, it seems easier to share specific types of data with specific services. For example, it’s easier for me to answer questions about my communication or sexual preferences with a company if I think I’m going to get a boyfriend out of the exchange, and sharing that data might make it more likely.

But what happens if a company stops seeing (or doesn’t ever see) an exchange of data as a gift exchange, and starts using the data you gift it for whatever it wants in order to make a profit? By violating the terms of the gift exchange, the company violates the implicit social contract you made with the company when you gifted your data. This is where privacy comes in. Gifting information for one purpose and having it used for other unexpected purposes feels like a violation of privacy. Because it is.

A violation of the gift exchange of data is a privacy violation, but it feels like the norm now. It’s common for terms of service to inform you that after you gift your data to a service, it is no longer yours and the company can do with it what it wants.

Products and services are designed so that you can’t pay for them even if you want to. You must share certain amounts of data, and if you don’t, the product doesn’t work. As Andrew Lewis put it, “If you are not paying for it, you’re not the customer; you’re the product being sold.” We didn’t end up there because we are that dedicated to free things on the Internet. We were lured into gifting our data in exchange for specific, limited services, and the companies realized later that the data was the profitable part of the exchange.

Nick Seaver refers to this as “The obligation to give one’s data in exchange for the use of “free” services,” and it is indeed an obligation. If you want to avoid gifting your data to services that you’d rather not enter into that type of exchange with, you have very few ways to interact with the modern Internet. You’d likely also need a lot of money, in order to enter into a paid transaction rather than a gift exchange with a company in return for services.

For those of us working in product or service development, we can use this perspective and consider the social contract of the exchange of data gifts.

  • Consider whether the service you offer is on par with the amount of data you ask people to gift to you.
    • Do I really need to share my Facebook likes with Tinder to get a superior match?
  • Consider whether the service you offer can deliver on the obligations and expectations created by the gift exchange.
    • Is your service rewarding enough and trustworthy enough that I’ll save my credit card information?
  • Consider whether you can design your service to allow people to choose the data that they want to gift to you.
    • What is the minimum-possible data gift that a person could exchange with your service, and still feel as though their gift was reciprocated?
  • Consider the type of gift exchange that you design if you force people to gift you a specific type or amount of data.
    • Is that an expectation or obligation that you want to create?

When you view each piece of information that a person shares with you as a gift, it’s harder to misuse that information.

 

Note: Thanks to Clive Thompson for bringing Kadija Ferryman’s piece to my attention, and Nick Seaver for sharing his piece Return of the Gift with me on Twitter. 

Feature Names Matter

When someone starts using your software, they need to build an understanding of how it works and how the pieces interact. The UI text you write and the feature names you choose can build or break a mental model.

From a marketing perspective, the importance of the name is clear. You want something catchy, marketable, searchable, memorable, all these things. But most importantly, a feature name must help a user build a mental model of what your feature does.

The mental model helps the user understand why they might use this feature, and what for. One of the riskiest parts of shipping something new is adoption. If people don’t know what it does or how it works, they won’t use it. A crucial element of that understanding is what you call the new thing and how you describe it in the product. If I can’t guess based on the name what it does, I might not click on it at all and explore it.

Let’s look at some feature names…

  • Google+ vs Google Docs. One of these is pretty opaque, and the other is pretty clear. I might think that Google Docs is Google FOR docs, but as soon as I click into it, I’ll see what it is and understand that it’s for writing docs. I might never click into Google+ because I have no idea what it is based on the name.
  • Dropbox vs Box. There’s a reason both of these companies are named practically the same thing. Because you put things in boxes that you want to share and store. It’s a super evocative mental model, so it gets a bit overused, perhaps.
  • Slack vs HipChat. HipChat is a bit more descriptive, so you know automatically that it’s a chat app. Slack turns a verb into a noun, and hopes that you start using it and understand that you slack off while using it… kind of.

It’s harder to come up with examples in software of things that truly failed, because they aren’t very well known. But the example that brought this to life for me is from a card game I learned how to play recently. Red7 uses the concept of a “canvas” and a “palette” to tie the metaphor of color across the game. But combining those concepts with the established mental model that you have in a card game with a discard pile and a hand of cards took quite a bit of work. In reality, the clever metaphor broke down and impeded what could have been quick understanding by burdening an existing card game mental model with a mental model of painting ephemera. It was marketable, but not intuitive because it didn’t help people build a mental model to understand how the game works.

The simplest way to pick a good feature name is to test candidate names out. Do some word association exercises with your team, but also with people that don’t work on your team and don’t even work in software. Diverse teams matter a lot in this exercise. This can help identify names that build mental models, break them, or are irrevocably associated with irrelevant mental models.

Another way to pick good feature names is to rely on scenarios when building features. That way, you’re less likely to conceptualize a feature based on its architecture, or your internal team structures, and more likely to think of it from a problem-solving perspective. If you know exactly what the feature is doing, and for whom, it’s easier to pick a useful name.

 

 

Tips for live tweeting an event

If you use Twitter and are attending an event that you want to share with your Twitter followers, you can live tweet it as it’s happening. While you can live tweet basically any event, these tips focus mainly on talks that you might attend as part of a conference, a meetup, a sponsored speaker series, or another presentation.

I’ve live tweeted several conferences (two as part of a job, such as #SUMIT14), talks, and series of talks as @smorewithface.

First, the basics on live tweeting an event, then some pro tips and best practices to follow before and during the event.


2015 Resolutions, 2016 Music, and 2017

In 2015 I made some resolutions. I haven’t followed up on them since. In 2016 I made no resolutions, but I listened to a lot of music.

How the 2015 resolutions fared in 2015 and 2016

I did okay.

1. Stay off Twitter more, read fewer articles on the web, and create more.

I’ve continued to use Twitter over the past couple years. My use of it waxes and wanes depending on the news. I periodically delete it from my phone to get a break. My Pocket queue and a change in my commute mean that I’ve certainly shifted my web-based reading habits. I read 73 books in 2016 (that I bothered to add to Goodreads, so let’s round to 75). I’ve taken a few vacations without my laptop, and have spent a bit more time working with my hands—whether at the gym, climbing, or making things like jewelry.

2. Learn JavaScript

I still have not learned JavaScript. I gave up on it. Part of this is because semicolons are rude, and part of this is because I joined a company where Python is the primary backend language. I realized Python would be easier to learn and my interests shifted more toward data analysis (and maybe some digital mapping?) and further from interactive web content, so I worked on learning Python instead. I ran into the same issue learning Python as I did JS though… it requires a lot of time, and a lot of practice. Side projects are hard to maintain, especially when they’re similar to your day job. I still have some tabs open about learning Python, and I took more Python tutorials, so I have at least reading-level knowledge of both languages.

3. Read something huge, and write something huge.

Still haven’t read Gödel, Escher, Bach: An Eternal Golden Braid. Still haven’t written up a magnum opus on the reification of geographical and political and linguistic borders on the web. I’m hoping to break it apart and start publishing bits and pieces here this year. We’ll see.

Goals for 2017

I’m meeting with friends this weekend to work through some. I’m setting some in terms of reading material, to improve my knowledge of key (also depressing) moments in history. I can’t be a history nerd without being fascinated by dictators, oppression, and systemic discrimination. So I’m branching out to dictators and oppression in different parts of the world this year. Beyond that, we’ll see what goals I end up with.

2016 in Music

2016 was a music-filled year, after a multi-year hiatus of drifting away from it being as central in my life as it deserves. These songs stuck with me for various reasons. They helped me regain some hipster credibility and branch out into less-explored genres, and they reminded me how important it is to support artists you care about.

Songs that stuck

This year I bought the CHVRCHES album, the Bon Iver album, the Frank Ocean album, and The Little Prince soundtrack. In addition to a bunch of one-off songs, because shuffle is how I roll.

I listened to Jason Derulo a lot, and listened to a four song playlist of Carly Rae Jepsen, Ariana Grande, Ingrid Michaelson, and Adele for longer than I’d like to admit.

Shows I saw

I saw the following artists play live:

  • Justice and Sebastian (he might still go by SebastiAn)
  • Broods and Two Door Cinema Club
  • Daughter
  • Still Flyin’, Annie Hart of Au Revoir Simone, and Slow Club
  • Tallest Man on Earth and The Head and the Heart
  • Cold War Kids
  • CHVRCHES

If I had to rank them, CHVRCHES would be first.

I started the new year by binging on SoundCloud recommendations from The Burning Ear and the related tracks that SoundCloud plays. A great way to fall into a rabbit hole of discovery, and a totally different experience from Spotify’s more carefully-constructed and curated experience of the Discover Weekly and other playlists.

Stats that were gathered for me, passively, by Last.fm

Last.fm tells me things about last year in music too.

 

My listening increased after I moved. CHVRCHES was a continual favorite. I listened to a lot of different types of music, but mostly stuck to indie. Hey Rosetta! – Trish’s Song is a great song to listen to if you’re trying to fall asleep.

2017 in Music

In 2017 I have tickets to see these artists in concert…

  • Less Than Jake
  • Mike Doughty and Wheatus
  • Jens Lekman
  • Radical Face
  • Matt Pond PA
  • Gibbz
  • Knox Hamilton and Colony House

And that’s just the first three months. City life suits me. Forging ahead into 2017 suits me. Here’s to more reading, more music, more learning, more blogging, and more self care in the year ahead.