Music trends and data errors: 2022 in music

In 2022, I had no true “obsessions” in my music listening, unlike last year. Instead of any standout artists, I flitted from artist to artist as they released new albums or other things prompted me to rediscover how much I enjoyed their music.

This was a year for breadth, rather than depth, and also for discovering the limits of my music data collection mechanisms.


Top artists of 2022

My top artists of 2022 had a lot of familiar names.

Stacked column chart showing my top 10 artists for 2022 by month. Top 10 artists were Caribou, Fred Again.., Frightened Rabbit, HAAi, Jacques Greene, Logic1000, Monkey Safari, salute, Totally Enormous Extinct Dinosaurs, and TSHA. Notable patterns described in text.

Rather than any one consistent artist, the real pattern was one of shifting obsessions. Let’s zoom in.

February: Caribou

Stacked column chart showing my top 10 artists by listens for the year, with only the bars from January, February, and March visible.

When 2022 kicked off, I barely listened to Caribou. I listened to 3 songs in January. In February, I went to their concert on a Wednesday at the Fox Theater.

Area graph showing intermittent small number of listens for January, then a vertical line indicating that I went to a concert, then a spike of sustained listens for several days immediately after the concert, then a break, then another spike that peters off through April, when the graph ends.

Their live show blew me out of the water. I was entranced, captive to the flow of the music as they played Sun for what felt like twice as long as the studio track duration. In an attempt to recapture that feeling, I listened to Caribou a lot in the weeks after the show, and as a result they ended up as one of my most consistently listened-to artists of 2022.

June: HAAi

Stacked column chart showing my top 10 artists by listens for the year, with only the bars from May and June and part of July visible. The line for 50 listens is in the middle of the image, and the line for 100 listens is just below the top of the bar for June.

I discovered HAAi on October 08, 2018, listening to her track Be Good. I have no recollection of this, but the track is familiar to me. I didn’t listen to any new tracks by her until 2020, but the next one I heard, FEELS, was one I’ve since listened to 26 times.

Nevertheless, when I heard that she was coming out with a debut album, and it was produced with Jon Hopkins, I was thrilled.

Column chart showing barely any listening activity in April or May, then a huge spike in listens to nearly 75 in June, a drop below 25 in July, and then roughly 10 or less for the remaining months of the year.

The audio for her single from the eponymous album, Baby We’re Ascending, was released on May 4, 2022, and the music video was released on June 1st.

HAAi’s debut album came out on May 27, 2022, and on June 1st and June 2nd, I listened to the whole album all the way through.

I added her album to my library on June 8, 2022, and on June 18th, I listened to HAAi tracks 34 times. Judging by the frequency of listens on that day, there was also a data error afflicting those results, but I listened to the album all the way through at least once that day.

September and October: TSHA

Multicolored stacked column chart showing listening patterns for top 10 artists in September and October. Two teal bars at the top in September and October represent Fred Again.. listens, a brown bar in the middle of September represents listens to Totally Enormous Extinct Dinosaurs, and a smaller purple bar in September and one twice the size in October represent the listens for TSHA.

Although TSHA wasn’t as prominent as my obsessions in other months, I still spent a lot of time listening to her in September and October this year.

TSHA is another artist that I’ve listened to quite a bit over the last couple of years, but who didn’t yet have a full album out. She released her highly anticipated (at least by me) debut album Capricorn Sun in October, and it quickly went into strong rotation for me.

Three of my top 10 songs of the year are TSHA tracks, and she was one of the artists that I listened to at least once every month this year. I finally saw her DJ for a bit at Portola Music Festival, but I can’t wait to see her throw down for a headlining set.

November: Frightened Rabbit

Stacked column chart showing top 10 artist listens for November and December this year, completely dominated by blue bars representing 65 and 50 listens for each month of Frightened Rabbit, out of a total 75 and 110 listens for each month.

Frightened Rabbit is a band I’ve listened to for a long time.

The first song of theirs that I added to my iTunes library was Last Tango in Brooklyn, off a demo album, on October 1, 2009. That was also the first track I heard of theirs, on Sunday, November 15, 2009. However, I haven’t listened to them in a while.

Column chart depicting listening patterns for Frightened Rabbit this year. There are no columns until November, with roughly 60 listens, and December, with 50 listens.

So what prompted me to start listening to them again in November, and with such fervor?

I updated my podcast feed.

A friend of mine had recommended the Object of Sound podcast, with Hanif Abdurraqib, a poet and essayist that he was a fan of. I’d added the podcast to my feed but hadn’t yet listened to it, and in early November I updated my podcast feed and saw the episode for When It’s All Gone, Something Carries On (A Tribute to Scott Hutchison).

That episode was released on November 4, 2022, and seeing it in my feed prompted me to revisit their music.

Column chart depicting listening patterns for Frightened Rabbit during November and December, with 34 listens on November 15, 17 listens on November 21, 12 on November 22, 3 on November 29 (explained in surrounding text), 25 on December 1st, 19 on December 2nd, and 7 on December 4th.

Subsequently, I listened to Frightened Rabbit intensely throughout November. I listened to Spotify albums downloaded offline for the entire 6+ hour flight back from my Thanksgiving travels, so that isn’t reflected in the data for November 29.

I finally listened to the podcast on the bus to the airport on November 29, and it moved me so much that I cried.

I kept up the intensity of my listening for a few days into December, but since then, the intensity of my listening has dwindled.

For reference and comparison, I saw Frightened Rabbit live in 2013, long before their frontman died, and my post-concert peak listening was 13 times in one day.

Area graph showing all time listens of Frightened Rabbit, from 2009 until 2013, with a date in 2013 marked by a vertical line to indicate that I saw them live that day. Surrounding text describes relevant patterns.

All of those days of listens are overshadowed by my listening patterns in November and December this year.

Another mini obsession: Hadiya George

Beyond those four notable artist-of-the-month interludes, I had another intense listening stretch that wasn’t notable enough to make it to the top 10 artists, but did crack the top 10 songs.

Table showing the top 10 songs of 2022, led by Beyonce’s Break My Soul with 17 listens, followed by salute’s track Honey with 16 listens, tied with Hadiya George’s track described here, also with 16 listens, followed by TSHA’s track Water featuring Oumou Sangaré with 15 listens, TSHA’s track Giving Up featuring Mafro with 14 listens, Griff’s track Say it Again remixed by TSHA with 14 listens, Bonobo’s track ATK with 13 listens, Fred Again..’s track Jungle with 13 listens, TSHA’s track Moon with 13 listens, and Prospa’s track WANT NEED LOVE also with 13 listens.

Hadiya George’s track Hot Flavor, remixed by Godmode Smash Brothers, was one of my top tracks of the year.

I discovered the track on August 21, 2022, and then listened to it 14 times the next day. Over the next couple of months I listened to that track, or the extended remix, once a day on 4 occasions. And that’s it!

Column chart showing listens to Hadiya George in 2022. Roughly 16 listens in August and 3 in September, and no other listens shown.

As far as I can discern, I discovered the track on SoundCloud, liking it and reposting it the first day I heard it. August 21st is a Sunday, and SoundCloud Weekly refreshes late on Sunday night for me, so presumably that’s where I heard it for the first time.

It’s also possible that the track ended up in the autoplay recommendations when I listened to Carly Rae Jepsen’s track Beach House at the suggestion of a friend. I find that SoundCloud’s autoplay is pretty unmatched, at least for my music taste.

I listened to the track mostly on YouTube, because that’s the only place I could find the track, forgetting that I had discovered it on SoundCloud.

Eventually I purchased the extended version, and that’s where the listens at the end of September came from.

Consistent favorites

In addition to the moments when I dug deep into an artist’s catalog, I had some standby favorites of the year.

salute

The producer and artist salute was one of my favorite artists of 2022. He was my eighth-most-listened-to artist of the year, and those listens were concentrated on just 15 tracks, making him an outlier among my top 10 artists in terms of listening intensity.

Table showing distribution of listens across tracks for that artist for the top 10 artists. Most artists have an average number of listens per track from about 2 to 3, with TSHA having 4.31 average listens per track, HAAi having 4.19 average listens per track, while salute has 5.07 listens per track, more than any of the top 10 artists this year.

In addition to making an appearance on my top 10 artists, he also made an appearance on my top songs, with his track Honey being my second-most-listened-to track of the year. He doesn’t have an album out yet, otherwise he might have shown up there as well.

Column chart showing listening patterns to salute this year, with no listens in January, about 4 in February, 5 in March, about 8 in April, none in May, 1 in June, 18 in July, 16 in August, 10 in September, 7 in October, none in November, and 7 in December.

I didn’t listen to salute every month of the year, but I enjoyed his music all the same.

I first discovered him on June 5, 2017 (at 15:28:03), with a track called Weigh it Up (featuring Krrum). I have to imagine it was on Spotify, because while salute is great, this track isn’t really my style. Spotify put another track from salute on my Discover Weekly the next month, his track Light Up, which has a similar high-production vibe that is more typical of standard EDM.

The first track of his to really kick, for me, was Want U There, which I first heard last summer, on June 29, 2021 (at 10:35:05), and which has much more of a UK garage backbeat and vibe.

Fred Again..

My top artist of last year, Fred was also my top artist this year—but by a much slimmer margin than last year.

Table showing top 10 artists of the year by listens, Fred Again.. is listed first with 191 listens, TSHA next with 136 listens, Caribou with 122, Frightened Rabbit with 117, HAAi with 109, Monkey Safari with 79, salute with 75, Jacques Greene with 74, Totally Enormous Extinct Dinosaurs with 74, and Logic1000 with 65.

He was one of six artists that I listened to every month this year, joining Caribou, DJ Seinfeld, Logic1000, TSHA, and warner case.

Table showing the top 10 most consistently listened to artists of 2022, with Caribou all 12 months and 125 total listens, DJ Seinfeld in all 12 months and 59 total listens, Fred Again.. for all 12 months and 192 total listens, Logic1000 for all 12 months and 66 total listens, TSHA for all 12 months and 138 total listens, and warner case for all 12 months and 17 total listens.

On the other hand, I only saw him live once despite him playing San Francisco 3 times this year. I was underwhelmed by his first live appearance the weekend before Coachella, despite being stoked enough to show up early and stake out a front row spot at Great American Music Hall.

Having 3 albums out certainly helped him beat out the other contenders for top artist in terms of sheer output—the 193 total listens for this year are spread across 69 different tracks.

Column chart showing listening patterns for Fred Again.. during 2022, with 16 or 17 listens in January and February, 5 in March, 25 in April, 6 in May, 7 in June, 15 in July, 19 in August, 39 in September, 35 in October, 3 in November, and 9 in December.

Most of my listening activity was concentrated in September and October, when his third album, Actual Life 3, was released.

Logic1000

Logic1000 made a stealth top 10 appearance. I didn’t consciously listen to her this year, but she squeaked in as one of my most consistently listened-to artists of the year as well.

Column chart showing listening patterns for Logic1000 throughout the year, with 8 listens in January, 6 in February, 4 in March, 3 in April, 2 in May, 4 in June, 11 in July, 13 in August, 4 in September, 3 in October, 2 in November, and 5 in December.

She came out with a few new tracks this year that I enjoyed, notably Rush and Can’t Stop Thinking About.

I also added 2 of my favorite songs of hers, YourLove and Safe in My Arms, to my library this year. Given that I moved away from listening to music on Spotify this year, that helped as well.

She joins salute on the list of artists that I’m hoping will release an album soon, but in the meantime I’m enjoying her singles.

I had plans to see her play earlier this year, but she cancelled her entire tour hours before I was supposed to see her. I hope she’s well.

Listening habits over time

As I mentioned in my Spotify Wrapped post this year, I made a concerted effort to diversify where I was listening to music. I also took some time off from working, and that contributed to changes in my listening behavior.

In general, I spent far less time listening to music this year, and most of that reduction came out of the workday.

Three clusters of column charts, one cluster for 2020 showing nearly 40K listens during workday hours, and roughly 20K listens during evening hours, with much lower volumes for other times. The 2021 cluster shows 48K listens during workday hours, 16K during evening hours, and evenly low volumes for other times. The 2022 cluster shows about 18K listens during workday hours, about 11K listens during evening hours, and roughly the same low volumes during other hours as in 2021.

Overall, my daily listening habits flattened, no longer spiking on weekdays, when I used to fill my time with music while working.

Three clusters of column charts, one cluster for 2020 showing listening habits by day of the week, where weekday listening habits are clustered around 12K per day, and weekend listens around 5k or less. The 2021 cluster has a little more variability but the same rough pattern. The 2022 cluster shows all days at roughly 5K or below, with saturday and sunday being the lowest around 4K each, but wednesday is the highest with roughly 6K listens.

While my overall listening volume is lower, you can also see the dip mid-year, when I was working less. That coincided with moving in with my partner, a pattern that in the past has led me to listen to less music in general.

Three clusters of column charts, this time showing monthly listening habits for each year. The months in 2020 all hovered around 5K listens per month, give or take 1K, with a spike in December at 8K. The months in 2021 similarly hovered around 5K listens per month, with nearly 8K in January and a low point of around 3K in July. For 2022, the clusters all hover around 2.5K listens per month, with a high point of about 4K in August and a low point of roughly 1K listens in May.

Where did my money go?

This year, I wanted to put some more effort into figuring out who gets paid when I’m listening to music. It’s still a rough estimate, since I don’t know how much of my listening activity happened in which platform, but it’s something.

If I assume a $1 average price paid per song, this chart shows the amount that artists earned for purchases that I made on Bandcamp, and for purchases that I made on the iTunes store.

Two stacked bars showing amount spent at Bandcamp, $175, and the share sent to artists as roughly $140, and another bar showing roughly $60 spent at iTunes, with roughly $50 going to artists.

While Bandcamp is extremely transparent about how much money they pass on to artists, every other platform is more opaque and steeped in layers of “it depends”. For the purchase calculations, I’m assuming 82% of the price is passed on to the artists when purchased from Bandcamp.

For iTunes, I’m assuming that 70% of the payment is passed to artists when I buy a track from the iTunes store. I relied on an article from The Guardian How much do musicians really make from Spotify, iTunes and YouTube?, published in 2015, for the iTunes rate details.
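As a rough check of the purchase figures, here is a minimal Python sketch of the arithmetic. The 82% and 70% shares are the assumptions stated above, the dollar totals are the approximate values read off the chart (so the results are only ballpark figures), and the function name is purely illustrative.

```python
def artist_share(amount_spent: float, share: float) -> float:
    """Estimate the amount passed on to artists for a given spend."""
    return round(amount_spent * share, 2)


# Assumed shares from the text: 82% for Bandcamp, 70% for iTunes.
BANDCAMP_SHARE = 0.82
ITUNES_SHARE = 0.70

# Approximate totals read off the chart, so the estimates are rough too.
bandcamp_to_artists = artist_share(175, BANDCAMP_SHARE)
itunes_to_artists = artist_share(60, ITUNES_SHARE)

print(f"Bandcamp: ${bandcamp_to_artists:.2f}, iTunes: ${itunes_to_artists:.2f}")
```

Because both the totals and the shares are estimates, the output is best read as an order-of-magnitude comparison between stores rather than an exact payout.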

In an attempt to determine how much money artists earned from my streaming behavior, I charted out the amount artists would earn if all of my listens for the year occurred on a given service.

Three different bars, one showing roughly $27 earned by artists if all of my listening activity occurred on SoundCloud, roughly $37 if it all occurred on Spotify, and roughly $6 if it all occurred on YouTube.

For the royalty rates from streaming services, I relied on the Streaming Royalty Calculator from Omari MC, which gave the following rates:

All streaming rates end up being approximations, because the actual payout depends on the revenue for the streaming service, the share of an artist’s listens compared to total listens at the streaming service, as well as any negotiations made by the artist’s record label with the streaming service, if applicable.

While streaming revenue is recurring revenue, it’s wildly lower than the revenue that artists get when someone buys their music. Given that I still use streaming services, I think it’s a great idea to continue purchasing music.

Data errors and my top song of the year

When I first went through this analysis of my music habits, I discovered that my top song of the year was Quiet Little Voices by We Were Promised Jetpacks.

Table showing top 10 songs of the year, raw data, with Quiet Little Voices by We Were Promised Jetpacks listed first with 74 total listens, followed by Crush Club’s song Louder (Lefti Remix) with 41 listens, Fred Again..’s track Eazi (Do It Now) with 29 listens, Lokua Kanza’s song Mbiffé with 23 listens, Ingrid Michaelson’s song Hell No with 18 listens, Jamie xx’s song I Know There’s Gonna Be (Good Times) featuring Young Thug and Popcaan with 18 listens, then Beyonce with Break My Soul with 17 listens, salute’s song Honey with 16 listens, Hadiya George’s track with 16 listens, and Caribou with Never Come Back with 16 listens.

This was a bit of a surprise, because while I love that song, it isn’t one I had a strong memory of listening to frequently, let alone more than any other song of the year. So I dug a bit deeper, and that’s where things started to fall apart.

It turned out that all the scrobbles for that track were on June 8, 2022. On that Wednesday morning, I took the bus to an appointment, walked to the gym for a workout, then took the bus home.

According to my data, while I did that I listened exclusively to Quiet Little Voices on repeat from 4:30AM until 8:20AM, and then I rotated Fred Again..’s track Eazi (Do it Now) and Jamie xx’s track I Know There’s Gonna Be (Good Times) [feat. Young Thug and Popcaan] into the mix from 8:30AM until 10:00AM.

Starting at 11:00AM until 2:30PM I followed a similar pattern of listening to 7 different songs at impossible frequencies.

Why was this impossible?

In order to count a track as a listen (scrobble, in Last.fm parlance), the following needs to be true 1:

  • The track must be longer than 30 seconds.
  • And the track has been played for at least half its duration, or for 4 minutes (whichever occurs earlier.)

That means that according to the data, I was sometimes listening to as many as 5 songs in a 5-minute period. However, the shortest song I was ostensibly listening to is nearly 3 minutes long, and most tracks were 4 minutes or longer 2.
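To make the impossibility concrete, here is a minimal Python sketch of the scrobble rule as quoted above. The function names are my own for illustration and are not part of any Last.fm library; the 2:55 duration is the shortest track from that session.

```python
from typing import Optional

FOUR_MINUTES = 4 * 60


def min_seconds_to_scrobble(duration_s: float) -> Optional[float]:
    """Shortest playback time that counts as a scrobble, or None if the track is too short."""
    if duration_s <= 30:
        return None  # tracks must be longer than 30 seconds
    # Half the duration or 4 minutes, whichever occurs earlier.
    return min(duration_s / 2, FOUR_MINUTES)


def counts_as_scrobble(duration_s: float, played_s: float) -> bool:
    """Apply the quoted rule to a single playback."""
    needed = min_seconds_to_scrobble(duration_s)
    return needed is not None and played_s >= needed


# Best case: every one of the 5 plays is the shortest track (Hell No, 2:55)
# and each is skipped the moment it qualifies as a scrobble.
shortest = 2 * 60 + 55
needed_for_five = 5 * min_seconds_to_scrobble(shortest)
print(needed_for_five, "seconds needed, but only", 5 * 60, "available")
```

Even in that best case, 5 scrobbles need more than 7 minutes of playback, so 5 scrobbles inside a 5-minute window can't all be real listens.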

For those songs, I had a number of impossible track frequencies throughout the time period in the morning, and the second time period in the afternoon:

track_name | listens
Quiet Little Voices | 73
Louder (Lefti Remix) | 35
Eazi (DoItNow) | 23
Mbiffé | 20
Hell No | 15
I Know There’s Gonna Be (Good Times) [feat. Young Thug & Popcaan] | 13
Never Come Back | 12
Start Again | 7
Inyani Feat.Oluhle & Aaaron | 3
O’Flynn - SGD (Soundbwoy Killah Remix) | 3
Sorry (Greene Edit) | 3
exe.cute | 3
Mir a nero (Original Mix) | 2
For Sarah (Live DJ Mix) | 1

After I discovered this, I identified and excluded the data in Splunk 3 so that I could choose on a search-by-search basis whether to exclude this anomalous day from my data.

I thought that was the end of it: discover a specific anomaly in the data on one day, remove that day from the data, and move on.

More errors than I realized

A month later, I discovered that the problems went deeper than I thought. There were more errors beyond just this one day. As part of analyzing my music data to compare it with Spotify Wrapped, I determined whether I was habitually more loyal to songs (listening to the same songs frequently in a given time period) or sought out variety.

To do this, I collected my listening behavior into 20 minute bins and evaluated the number of repeated tracks in each bin, then counted the number of bins with repeated tracks (loyal) and different tracks (variety):

`lastfm`
| eval exclude=if(uts>=1654688186 AND uts<=1654723907, "true", "false")
| search exclude=false
| bin _time span=20m
| stats count by track_name, _time
| eval habit=if(count>1, "loyal", "variety")
| stats count by habit
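For readers who don't use Splunk, the same binning logic can be sketched in plain Python. This is an illustration rather than the search I actually ran: it assumes each scrobble is a (uts, track_name) pair, reuses the June 8 exclusion window from the search, and aligns bins to the Unix epoch, which may differ slightly from how Splunk aligns them.

```python
from collections import Counter

BIN_SECONDS = 20 * 60
# The anomalous June 8, 2022 window excluded in the search.
EXCLUDE_START, EXCLUDE_END = 1654688186, 1654723907


def classify_habits(scrobbles):
    """Count (bin, track) pairs as 'loyal' (repeated in the bin) or 'variety'.

    scrobbles: iterable of (uts, track_name) tuples.
    """
    per_bin_track = Counter()
    for uts, track_name in scrobbles:
        if EXCLUDE_START <= uts <= EXCLUDE_END:
            continue  # skip the known-bad window
        bin_start = uts - (uts % BIN_SECONDS)
        per_bin_track[(bin_start, track_name)] += 1

    habits = Counter()
    for count in per_bin_track.values():
        habits["loyal" if count > 1 else "variety"] += 1
    return habits


# Made-up sample data, purely for illustration.
sample = [(1000, "A"), (1100, "A"), (1150, "B"), (2500, "A")]
print(classify_habits(sample))
```

As in the search, a track counts as "loyal" once per bin in which it repeats, no matter how many times it repeats within that bin.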

After separating my listening habits into those 20 minute bins and assessing loyalty compared to variety, I wanted to know which tracks I was ostensibly loyal to:

`lastfm`
| eval exclude=if(uts>=1654688186 AND uts<=1654723907, "true", "false")
| search exclude=false
| bin _time span=20m
| stats count by track_name,_time
| eval habit=if(count>1, "loyal", "variety")
| search habit=loyal
| sort -count, _time

And this is where the data errors show up. A lot of my “most loyal” tracks were seemingly double or triple scrobble submissions, almost certainly by the mobile scrobbling app, based on the times and days when the data collection errors occurred. I did a deep dive into why this might be happening 4, but I still wanted to see if I could consistently identify the data errors.

Based on an average track length of about 4 minutes, I first isolated the number of 20-minute bins that had more than 5 songs in them.

`lastfm`
| bin _time span=20m
| stats count, list(track_name) as tracks by _time
| search count >5
| sort -count, _time

There were a number of legitimate-seeming track patterns in those bins, so I narrowed it further to isolate the more improbable time periods wherein I listened to at least 7 songs in 20 minutes:

`lastfm`
| bin _time span=20m
| eval year = strftime(_time,"%Y")
| stats count, list(track_name) as tracks list(artist) as artist by _time, year
| search count >7
| sort -count,_time

I figured that any instance of 7 tracks scrobbled in a 20-minute period is almost certainly a data error. Given that most tracks I listen to are about 4 minutes long, 7 tracks in 20 minutes would mean that the tracks averaged less than 3 minutes each (20 / 7 ≈ 2.9).

If I use an improbably high estimate of 8 songs per 20-minute bin, I can track down the worst of the worst bins over the years. If I focus on the last few years specifically, there are quite a lot of issues:

year count
2020 61
2021 84
2022 24
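The per-year count of suspicious bins can also be sketched in Python. This is a hedged approximation of the search above, not a port of it: the function name is hypothetical, it takes a flat list of Unix timestamps, and it assigns years in UTC, whereas Splunk would use the search-time timezone.

```python
from collections import Counter
from datetime import datetime, timezone

BIN_SECONDS = 20 * 60


def suspicious_bins_by_year(timestamps, threshold=7):
    """Count 20-minute bins holding more than `threshold` scrobbles, grouped by year."""
    per_bin = Counter(uts - (uts % BIN_SECONDS) for uts in timestamps)
    by_year = Counter()
    for bin_start, count in per_bin.items():
        if count > threshold:
            # Assumption: UTC is close enough for assigning a bin to a year.
            year = datetime.fromtimestamp(bin_start, tz=timezone.utc).year
            by_year[year] += 1
    return dict(by_year)


# Made-up timestamps: 8 scrobbles crammed into one bin on June 8, 2022.
sample = [1654688400 + i * 10 for i in range(8)]
print(suspicious_bins_by_year(sample))
```

Raising or lowering `threshold` is the same trade-off as in the searches: a lower value catches more errors but also flags legitimate runs of short songs.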

Oh no. Let’s see what the worst bin offenders were:

Looking at just the bins from 2022, the worst ones are part of the time period I identified on June 8:

This is disheartening, but it’s also a relief that, for this year, I’ve already excluded the most egregious outliers from my analysis.

Can I trust the data?

Back to the topic at hand—why am I digging into these data anomalies and errors at all? Because I use my Last.fm data to prop up my shoddy memory, I rely on it to reinforce or remind me about my favorite songs in a year.

If I can identify issues with the data based on statistical anomalies and, better yet, understand how the data is created, then I can trust my data, improve its quality going forward, and clean up past data errors as well.

Last.fm recognizes the importance of high-quality data, and filters API requests sent to the scrobbling endpoint. Some relevant message codes:

They recognize that there are logical improbabilities in the data that would indicate errors upstream, and throw that data out.

In the case of my music data, I have two or three sources for listening data, or play count data, for a track:

Each of these records a listen, or a play, slightly differently.

I don’t have any single ground truth measure to validate the data in Last.fm, but I can cobble something together from iTunes.

If I return to the first outlier track that led me down this rabbit hole, I can see that the overall play count for We Were Promised Jetpacks' track Quiet Little Voices according to iTunes is 73. However, Last.fm lists the plays for that track for just this year alone as 74 plays, with 136 total plays.

This track has been in my iTunes library since 2009, so let’s go back in time. If I refer to a Library.XML file for my iTunes library from 2016, I can see that the play count for that track was 61:

    <key>Play Count</key><integer>61</integer>

That’s more listens than Last.fm was aware of through that time, but Last.fm’s ability to scrobble music was much more rudimentary then. The bar to count something as a play in iTunes is also much higher than the threshold to count a track as scrobbled.
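If you want to pull the same number out of your own export, here is a minimal Python sketch using the standard library's plistlib, since the iTunes library XML is a property list. The sample document and the track ID 1001 are stand-ins I made up to keep the example self-contained, and the helper name is hypothetical; real library files hold many more keys per track.

```python
import plistlib

# A tiny stand-in for an iTunes Library.xml export (track ID is made up).
SAMPLE_LIBRARY = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>Tracks</key>
    <dict>
        <key>1001</key>
        <dict>
            <key>Name</key><string>Quiet Little Voices</string>
            <key>Artist</key><string>We Were Promised Jetpacks</string>
            <key>Play Count</key><integer>61</integer>
        </dict>
    </dict>
</dict>
</plist>
"""


def play_count(library_xml: bytes, name: str, artist: str) -> int:
    """Return the play count for the first track matching name and artist, else 0."""
    library = plistlib.loads(library_xml)
    for track in library.get("Tracks", {}).values():
        if track.get("Name") == name and track.get("Artist") == artist:
            # Assumption: a missing Play Count key means the track was never played.
            return track.get("Play Count", 0)
    return 0


print(play_count(SAMPLE_LIBRARY, "Quiet Little Voices", "We Were Promised Jetpacks"))
```

For a real export, you would read the file in binary mode and pass its contents to the same function.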

What caused this data error?

There are patterns in the data discrepancies, or some likely attributions that I can come up with, like:

Ultimately, the more I dig, the more it feels like the listening data I have is like the points on the improvisational comedy show Whose Line Is It Anyway?: made up, and they don’t matter.

I track my music data for fun. From a data analysis perspective, it’s a challenge that there is no single source of truth that can provide “ground truth” accuracy for my music listening data, but it’s a great reminder that the same is true for most data sources in the world.

We rely on imperfect data collection methods to identify patterns and draw conclusions. It’s only by having a deep understanding of how the data came to exist, and audits throughout the analysis process, that we can be confident in the results of the data analysis.

2022 in music, and what’s in store for 2023?

2022 was a unique year in music listening. My listening volume was down, my listening habits were spread across a lot of different services, but I still spent time with artists whose music I treasure. Here’s to more UK garage in 2023, some new albums from upcoming artists, and even more discoveries.


  1. When is a scrobble a scrobble? in the Last.fm API documentation. ↩︎

  2. The following table outlines the tracks and durations that I listened to during that time period:

    track_name | artist | duration
    Quiet Little Voices | We Were Promised Jetpacks | 4:21
    Louder (Lefti Remix) | Crush Club | 4:07
    Eazi (DoItNow) | Fred again.. | 3:37
    Mbiffé | Lokua Kanza | 4:02
    Hell No | Ingrid Michaelson | 2:55
    I Know There’s Gonna Be (Good Times) [feat. Young Thug & Popcaan] | Jamie xx | 3:34
    Never Come Back | Caribou | 5:05
    Start Again | Kidnap | 4:00
    Inyani Feat.Oluhle & Aaaron | Re.You | 7:15
    O’Flynn - SGD (Soundbwoy Killah Remix) | Hundred Flowers Records | 7:36
    Sorry (Greene Edit) | Jacques Greene | 6:00
    exe.cute | Marc DePulse | 6:55
    Mir a nero (Original Mix) | Michel Cleis | 12:15
    For Sarah (Live DJ Mix) | Tourist | 4:48
     ↩︎
  3. I wrote the following evaluation statement, and then searched for the data that didn’t match it:

    | eval exclude=if(uts>=1654688186 AND uts<=1654723907, "true", "false")
    | search exclude=false
    

    This approach meant I could identify the erroneous data without automatically excluding it. ↩︎

  4. I dug into alternate scrobbling apps, in case that was the issue, and discovered one called Eavescrob, which has an FAQ that includes the following note:

    Due to the limitation of iOS system, iOS only keeps the last time point of a song you’ve played, so currently the repeated plays are timestamped based on your last played date. Let’s hope a better solution would come to iOS eventually.

    And thus I had a kernel of information to attempt to track down. In an attempt to locate the Apple Music API endpoint or framework this information came from, I dug into the open source code for Finale, another Last.fm scrobbling app. I couldn’t find any leads, so I reached out to the developer on Twitter, and he was kind enough to point me to another repo, where he has the code for retrieving the information.

    It turns out that the Apple Music API endpoint, get recently played tracks, is not the main source of information. Instead, it’s a Media Player framework which permits queries of specific media items that have been played. That lastPlayedDate variable confirms that the data that is collected is, in fact, the most recent play date for the item (rather than an array with the history). You can perform a query with any elements of an MPMediaItem. In the case of the Finale scrobbling app, the developer queries the songs and then filters them for those where the lastPlayedDate is after a specific time.

    It’s always a challenge to use an API or a framework in a way that is only adjacent to its intended purpose. This is likely a case where, if the framework had been designed to communicate all playback activity, that would be possible. But in the context of, say, writing a shuffle algorithm, or meeting clearer internal product needs, the framework would only need to know how recently the track had been played, not the entire track playback history.

    And so we hang out in the niche and acquire subpar data for our purposes.

    Presuming that the Last.fm scrobbling app for iOS works the same way as the Finale app, or at least uses the same framework, it was likely an issue with cached data, a malformed query, something else, or some combination of those that caused the data corruption on June 8 and other dates. ↩︎

  5. “Plays are recorded when a user initiates song playback in Apple Music for more than 30 seconds.”, according to Understand your analytics in the Apple Music for Artists documentation. ↩︎

  6. This is according to a discussion on Reddit, and another on StackExchange. The conversation on StackExchange from 2011 about What does “plays” really count in iTunes? points out:

    play count == number of times the file played right to the very end.

    The conversation on Reddit on the Apple Music subreddit, What counts as a “play count”? from 3 years ago affirms that: listening “Up to less than :10 of the song ending.” is what counts as a play in iTunes.

    The official documentation for playCount in the Media Player framework and the MusicKit Library documentation is, as anticipated, vague and useless:

    The number of times the user plays the media item.

    or alternately:

    The number of times the user played the song.

    In my own experience, tested while writing this, I can confirm that the denizens of the internet forums are correct, and the play count on Apple Music, formerly known as iTunes, increments only when a song has been played in its entirety. ↩︎

  7. “Song stream: Counted when someone listens for 30 seconds or more”, according to How we count streams in the Spotify for Artists documentation. ↩︎

  8. See the earlier definition of what a scrobble is, according to Last.fm. It occurred to me as well that since Last.fm will send the “currently playing” track as well as the history of tracks played, that it could be an anomaly there, but I discard the “currently playing” events in the Last.fm Add-on for Splunk configuration, and I further deduplicate the data at search time by unix timestamp (uts) so if the timestamp matched exactly, I wouldn’t count duplicates. ↩︎

  9. That last explanation is probably the case with a 20-minute bin from earlier this month, 2022-12-04 23:20:00, where I listened to these 8 tracks in a row:

    • Orbiting
    • Can’t Go Back Now
    • Takes so Long
    • Antarctica
    • Wish I Could Forget
    • Thunder
    • Hideaway
    • Orbiting

    Many or most of these songs are under 3 minutes long, which can explain why they nearly fit in a 20 minute time block:

    track_name | duration
    Orbiting | 2:55
    Can’t Go Back Now | 2:18
    Takes so Long | 3:07
    Antarctica | 3:15
    Wish I Could Forget | 2:57
    Thunder | 3:24
    Hideaway | 2:47

    All of these songs are by The Weepies, except for Thunder, which is a track by Imagine Dragons. I must’ve been playing Heardle in the middle of listening to a bunch of mopey folk songs by The Weepies, and listened to that on Spotify, which has a lower threshold for counting something as played than iTunes does. ↩︎