Listening to Music while Sheltering in Place

The world is, to varying degrees, sheltering in place during this global coronavirus pandemic. In March, the pandemic started to affect me personally: 

  • I started working from home on March 6th. 
  • Governor Gavin Newsom announced on March 11th that gatherings of more than 250 people were strongly discouraged, effectively cancelling all concerts for the month of March. 
  • On March 16th, the mayor of San Francisco, along with officials from several other Bay Area counties, announced a shelter-in-place order. 

Ever since then, I’ve been at home. Given all these changes in my life, I was curious what new patterns I might see in my music listening habits. 

I went to my last concert on March 7th, just before large gatherings were prohibited. With events increasingly cancelled nationwide, and touring musicians postponing and cancelling shows, Beatport hosted the first livestream festival, “ReConnect. A Global Music Series”, on March 27th. Many more followed. 

Industry-wide studies and data analysis have attempted to unpack various trends in the pandemic’s influence on the music industry. Analytics startup Chartmetric is digging into genre-based and geographical listening habits, while Billboard and Nielsen are conducting a periodic entertainment tracker survey.

Because I’m me, and I have so much data about my music listening patterns, I wanted to explore what trends might be emerging in my personal habits. I analyzed March, April, and May of 2020, and in some cases compared that period against the same months in 2019, 2018, and 2017. The screenshots of data visualizations in this blog post represent data through May 15th, so the analysis and comparison are incomplete, given that May 2020 is not yet over. 

Looking at my listening habits during this time period, with key dates highlighted, it’s clear that the very beginning of the crisis didn’t have much of an effect on my listening behavior. However, after the shelter-in-place order, the amount of time I spent listening to music increased. After that increase it’s remained fairly steady.

Screenshot of an area chart depicting listening duration: for much of January and February it ranges from 100 minutes to a maximum of around 250 minutes per day, with a couple of spikes to 500 minutes; starting in March there is a new range of about 250 to 450 minutes per day, with a couple of outliers of nearly 700 minutes of listening activity, and a couple of outliers with only 90 minutes of listening activity.

Key dates such as the first case in the United States, the first case in California, and the first case in the Bay Area are highlighted along with other pandemic-relevant dates.

Listening behavior during March, April, and May over time

When I started my analysis, I looked at my basic listening count from traditional music listening sources. I use Last.fm to scrobble my listening behavior in iTunes, Spotify, and the web from sites like YouTube, SoundCloud, Bandcamp, Hype Machine, and more. 
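As an aside, if you want to pull your own listening history out of Last.fm, a minimal sketch against its public REST API might look like the following; the username and API key are placeholders:

```python
import requests

# Placeholder credentials: register for an API key at last.fm/api.
params = {
    "method": "user.getrecenttracks",
    "user": "YOUR_USERNAME",
    "api_key": "YOUR_API_KEY",
    "format": "json",
    "limit": 200,
}
resp = requests.get("https://ws.audioscrobbler.com/2.0/", params=params)
resp.raise_for_status()

# Each scrobble includes the artist, the track name, and a timestamp.
for track in resp.json()["recenttracks"]["track"]:
    print(track["artist"]["#text"], "-", track["name"])
```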

Chart depicting 2700 total listens for 2017, 2000 total listens for 2018, and 2300 total listens for 2019 during March, April, and May, compared to 3000 total listens in that same period in 2020.

If you just look at 2018 to 2020, it seems like my listening habits are trending upward, maybe with a culmination in 2020. But compared against 2017, the difference isn’t so remarkable. I listened to 25% fewer tracks in 2018 compared with 2017, 19% more tracks in 2019 compared with 2018, and 25% more tracks in 2020 compared with 2019. 

Chart depicting total weekday listens during March, April, and May of 2017, 2018, 2019, and 2020, alongside total weekend listens during the same period. It shows roughly 2400 weekday listens and 200ish weekend listens for 2017, 2000 weekday listens vs 100 weekend listens for 2018, 2100 weekday listens vs 300 weekend listens in 2019, and 2500 weekday listens vs 200 weekend listens in 2020.

If I break that down by when I was listening, comparing my weekend and weekday listening habits from the previous 3 years to now, there’s still perhaps a bit of an increase, but nothing much. 

With just the data points from Last.fm, there aren’t really any notable patterns. But the number of tracks listened to on Spotify, SoundCloud, YouTube, or iTunes provides an incomplete perspective of my listening habits. If I expand the data I’m analyzing to include other types of listening—concerts attended and livestreams watched—and change the data point that I’m analyzing to the amount of time that I spend listening, instead of the number of tracks that I’ve listened to, it gets a bit more interesting. 

Chart shows roughly 12000 minutes spent listening in 2017, 10000 in 2018, 12000 in 2019, and 22000 in 2020.

While the number of tracks I listened to from 2019 to 2020 increased only 25%, the amount of time I spent listening to music increased by 74%, a full 150 hours more than the previous year during this time period. And May isn’t even over yet! 

It’s worth briefly noting that I’m estimating, rather than directly calculating, the amount of time spent listening to music tracks and attending live music events. To make this calculation, I’m using an estimate of 3 hours for each concert attended, 4 hours for each DJ set attended, 8 hours for each festival attended, and an estimate of 4 minutes for each track listened to, based on the average of all the tracks I’ve purchased over the past two years. Livestreamed sets are easier to track, but some of those are estimates as well because I didn’t start keeping track until the end of April.
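In code, that estimate is just a weighted sum. Here’s a minimal sketch with hypothetical counts plugged in rather than my real numbers:

```python
# Per-event duration estimates, in minutes, from the assumptions above.
ESTIMATES = {"concert": 180, "dj_set": 240, "festival": 480, "track": 4}

# Hypothetical counts for March through May.
counts = {"concert": 1, "dj_set": 0, "festival": 0, "track": 3000}

total_minutes = sum(ESTIMATES[kind] * n for kind, n in counts.items())
print(f"{total_minutes} minutes, or about {total_minutes / 60:.0f} hours")
```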

I spent an extra 150 hours listening to music this year during this time—but when was I spending this time listening? If I break down the amount of time I spent listening by weekend compared with weekdays, it’s obvious:

Chart depicts 10000 weekday minutes and 5000 weekend minutes spent listening in 2017, 9500 weekday minutes and 4500 weekend minutes in 2018, 14000 weekday minutes and 2000 weekend minutes in 2019, and 12000 weekday minutes and 13000 weekend minutes in 2020

Before shelter-in-place, I’d spend most of my weekends outside, hanging out with friends, or attending concerts, DJ sets, and the occasional day party. Now that I’m spending my weekends largely inside and at home, coupled with the number of livestreaming festivals, I’m spending much more of that time listening to music. 

I was curious if perhaps working from home might reveal new weekday listening habits too, but the pattern remains fairly consistent. I also haven’t worked from home for an extended period before, so I don’t have a baseline to compare it with. 

It’s clear that weekends are when I’m doing most of my new listening, and that this new listening likely isn’t coming from my traditional listening habits. If I split the amount of time that I spend listening to music by the type of listening that I’m doing, the source of the added time spent listening is clear.

Depicts 11000 minutes of track listens and 1000 minutes of time spent at concerts in 2017, 8000 minutes spent listening to music tracks and 2000 minutes spent at concerts in 2018, 10000 minutes spent listening to music tracks and 3000 minutes spent at concerts in 2019, and 12000 minutes spent listening to music tracks and 9000 minutes listening to livestreams, with a sliver of 120 minutes spent at a single concert in 2020

Hello, livestreams. If you look closely you can also spy the sliver of a concert that I attended on March 7th.

Livestreams dominate, and so does Shazam

All of the livestreams I’ve been watching have primarily been DJ sets. Ordinarily, when I’m at a DJ set, I spend a good amount of time Shazamming the tracks I’m hearing. I want to identify the tracks that I’m enjoying so much on the dancefloor so I can track them down, buy them, and dig into the back catalog of those artists. 

So I requested my Shazam data to see what’s happening now that I’m home, with unlimited, shameless, and convenient access to Shazam.

For the time period that I have Shazam data for, the ratio of Shazam activity to the number of livestreams watched is fairly consistent, at roughly 10 successful Shazams per livestream. 

Chart details are largely duplicated in the surrounding text, but of note is a spike of 6 livestreams with only 30 or so songs Shazammed, while the next few weeks show a fairly tight interlock of Shazam activity with the number of livestreams.

Given the correlation with Shazam data, as well as my continued focus on watching DJ sets, I wanted to explore my artist discovery statistics as well. Especially because it seemed like my listening activity hadn’t shifted much, I was betting that my artist discovery statistics had been increasing during this time. If I look at just the past few years, there seems to be a direct increase during this time period. 

Chart depicts 260ish artists discovered in March, April, and May of 2018, 280 discovered in 2019, and 360 discovered in 2020. A second chart shows the same data but adds 2017, with 390 artists discovered.

However, after I add 2017 into the list as well, the pattern doesn’t seem like much of a pattern at all. Perhaps by the end of May, there will be a correlation or an outsized increase. But at least for now, the added number of livestreams I’ve been watching doesn’t seem to be producing an equivalently high number of artist discoveries, even though discoveries are elevated compared with the last two years. 

It could also be that the artists I’m discovering in the livestreams haven’t yet had a substantial effect on my non-livestream listening patterns, even though there are 91 hours of music (and counting) in my quarandjed playlist, where I store the tracks that catch my ear in a quarantine DJ set. Adding music to a playlist, of course, is not the same thing as listening to it. 

Livestreaming as concert replacement?

Shelter-in-place brought with it a slew of event cancellations and postponements. My live events calendar was severely affected. As of now, 15 concerts have been affected in the following ways:

Chart depicts 6 concerts cancelled and 9 postponed

The amount of time that I spend at concerts compared with watching livestreams is also starkly different.

Chart depicts 1000 minutes spent at concerts in 2017, 2000 minutes at concerts in 2018, 2500 minutes at concerts in 2019, and 8000 minutes spent watching livestreams, with a topper of 120 minutes at a concert in 2020

I’ve spent 151 hours (and counting) watching livestreams. At my estimate of 3 hours per concert, that’s the rough equivalent of 50 concerts: my entire concert attendance of last year. This is almost certainly because I’m often listening to livestreams, rather than watching them happen.

Concerts require dedication—a period of time where you can’t really do anything else, a monetary investment, and travel to and from the show. Livestreams don’t have any of that, save a voluntary donation. That makes it easier to turn on a stream while I’m doing other things. While listening to a livestream, I often avoid engaging with the streaming experience. Unless the chat is a cozy few hundred folks at most, it’s a tire fire of trolls and not a pleasant experience. That, coupled with the fact that sitting on my couch watching a screen is inherently less engaging than standing in a club with music and people surrounding me, means that I’m often multitasking while livestreams are happening.

The attraction for me is that these streams are live: they’re an event to tune into, and if you don’t, you might miss it. Because a stream is live, you have the opportunity to create a shared collective experience. The chatrooms that accompany live video streams on YouTube and Twitch, and especially Facebook’s Watch Party feature for Facebook Live videos, are what foster this shared experience. For me, it’s about that experience, so much so that I started a chat thread for Jamie xx’s 2020 Essential Mix so that my friends and I could experience and react to the set live. This personal experience runs contrary to the conclusion drawn in the Hypebot article Our Music Consumption Habits Are Changing, But Will They Remain That Way? by Bobby Owsinski: “Given the choice, people would rather watch something than just listen.” Given the choice, I’d rather have a shared collective experience with music than just sit alone on my couch and listen to it. 

Of course, with shelter-in-place, I haven’t been given a choice between attending concerts and watching livestreamed shows. It’s clear that without a choice, I’ll take whatever approximation of live music I can find.

 

Problems with Indexing Datasets like Web Pages

Google has created a dataset search for researchers or the average person looking for datasets. On the one hand, this is a cool idea. Datasets can be hard to find, and this ostensibly makes datasets and their accompanying research easier to find.
On the other hand, in my opinion this dataset search is problematic for two main reasons.

1. Positioning Google as a one-stop-shop for research is risky.

There’s consistent evidence that many people (especially college students who don’t work with their library) start and end their research with Google, rather than using scholarly databases, limiting the potential quality of their research. (There’s also something to be said here about the limiting of access to quality research behind exploitative and exclusionary paywalls, but that’s for another discussion).
Google’s business goal of being the first and last stop for information hunts makes sense for them as a company. But such a goal doesn’t necessarily improve academic research, or the knowledge that people derive based on information returned from search results.

2. Datasets without datasheets easily lead to bias.

The dataset search is clearly focused on indexing, and making more available, as many datasets as possible. The cost of that focus is the continuation of sloppy data analysis and research, due to the lack of standardized documentation, such as Datasheets for Datasets, that fully exposes the contents and limitations of datasets.
The existing information about these datasets is constructed based on the schema defined by the dataset author, or perhaps more specifically, by the site hosting the dataset. It’s encouraging that datasets have dates associated with them, but I’m curious where the descriptions for the datasets are coming from.
Only the description and the name fields for the dataset are required before a dataset appears in the search. As such, the dataset search has limitations. Is the description for a given dataset any higher quality than the Knowledge Panels that show up in some Google search results? How can we as users independently validate the accuracy of the dataset schema information?
The quality of, and details provided in, the description field vary widely across datasets (I did a cursory scan of datasets resulting from a keyword search for “cheese”), indicating that a required plain text field doesn’t do much to assure quality and valuable information.
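For context, dataset search engines like Google’s crawl structured data, such as schema.org/Dataset markup, embedded on the sites hosting the datasets. Here’s a rough sketch, in Python, of the minimal JSON-LD a site might embed; the field values are hypothetical:

```python
import json

# A minimal schema.org/Dataset entry: only "name" and "description" carry
# human-readable context. Provenance, collection method, and derived-field
# information are all optional, which is exactly the problem.
minimal_dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Cheese production by region",  # hypothetical dataset
    "description": "Annual cheese production figures.",  # free text, unvalidated
}

# The JSON-LD a site would embed in a script tag for crawlers to index.
print(json.dumps(minimal_dataset, indent=2))
```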
When datasets are easier to find, that can lead to better data insights for data analysts. However, it can just as easily lead to off-base analyses if someone misuses data that they found based on a keyword search, either intentionally or, more likely, because they don’t fully understand the limitations of a dataset.
Some vital limitations to understand when selecting a dataset for use in data analysis include:
  • What does the data cover?
  • Who collected the data?
  • For what purpose was the data collected?
  • What features exist in the data?
  • Which fields were collected and which were derived?
  • If fields were derived, how were they derived?
  • What assumptions were made when collecting the data?

Without these limitations being made as visible as the datasets themselves, I struggle to feel overly encouraged by this dataset search in its current form.

Ultimately, making information more easily accessible while removing or obscuring indicators that can help researchers assess the quality of the information is risky and creates new burdens for researchers.

Unbiased data analysis with the data-to-everything platform: unpacking the Splunk rebrand in an era of ethical data concerns

Splunk software provides powerful data collection, analysis, and reporting functionality. The new slogan, “data is for doing”, alongside taglines like “the data-to-everything platform” and “turn data into answers”, aims to bring the company to the forefront of data powerhouses, where it rightly belongs (I’m biased; I work for Splunk).

There is nuance in those phrases that can’t be adequately expressed in marketing materials, but that is crucial for doing ethical and unbiased data analysis, helping you find ultimately better answers with your data and do even better things with it.

Start with the question

If you start attempting to analyze data without an understanding of a question you’re trying to answer, you’re going to have a bad time. This is something I really appreciate about moving away from the slogan “listen to your data” (even though I love a good music pun). Listening to your data implies that you should start with the data, when in fact you should start with what you want to know and why you want to know it. You start with a question.

Data analysis starts with a question, and because I’m me, I want to answer a fairly complex question: what kind of music do I like to listen to? This overall question, also called an objective function in data science, can direct my data analysis. But first, I want to evaluate my question. If I’m going to turn my data into doing, I want to consider the ethics and the bias of my question.

Consider what you want to know, and why you want to know it, so that you can weigh the ethics of the question. 

  • Is this question ethical to ask? 
  • Is it ethical to use data to answer it? 
  • Could I ask a different question that would be more ethical and still help me find useful, actionable answers? 
  • Does my question contain inherent bias? 
  • How might the biases in my question affect the results of my data analysis? 

Questions like “How can we identify fans of this artist so that we can charge them more money for tickets?” or “What’s the highest fee that we can add to tickets at which people will still buy them?” could be good for business, or help increase profits, but they’re unethical. You’d be using data to take actions that are unfair, unequal, and unethical. Just because Splunk software can help you bring data to everything doesn’t mean that you should. 

Break down the question into answerable pieces

If my question is something that I’ve considered ethical to use data to help answer, then it’s time to consider how I’ll perform my data analysis. I want to be sure I consider the following about my question, before I try to answer it:

  • Is this question small enough to answer with data?
  • What data do I need to help me answer this question?
  • How much data do I need to help me answer this question?

I can turn data into answers, but I have to be careful about the answers that I look for. If I don’t consider the small questions that make up the big question, I might end up with biased answers. (For more on this, see my .conf17 talk with Celeste Tretto).

So if I consider “What kind of music do I like to listen to?”, I might recognize right away that the question is too broad. There are many things that could change the answer to that question. I’ll want to consider how my subjective preferences (what I like listening to) might change depending on what I’m doing at the time: commuting, working out, writing technical documentation, or hanging out on the couch. I need to break the question down further. 

A list of questions that might help me answer my overall question could be: 

  • What music do I listen to while I’m working? When am I usually working?
  • What music do I listen to while I’m commuting? When am I usually commuting?
  • What music do I listen to when I’m relaxing? When am I usually relaxing?
  • What are some characteristics of the music that I listen to?
  • What music do I listen to more frequently than other music?
  • What music have I purchased or added to a library? 
  • What information about my music taste isn’t captured in data?
  • Do I like all the music that I listen to?

As I’m breaking down the larger question of “What kind of music do I like to listen to?”, the most important question I can ask is “What kind of music do I think I like to listen to?”. This question matters because data analysis isn’t as simple as turning data into answers. That can make for catchy marketing, but the nuance here lies in using the data you have to reduce uncertainty about what you think the answer might be. The book How to Measure Anything by Douglas Hubbard covers this concept of data analysis as uncertainty reduction in great detail, but essentially the crux is that for a sufficiently valuable and complex question, there is no single objective answer (or else we would’ve found it already!). 

So I must consider, right at the start, what I think the answer (or answers) to my overall question might be. Since I want to know what kind of music I like, I therefore want to ask myself what kind of music I think I might like. Because “liking” and “kind of music” are subjective characteristics, there can be no single true answer that is objective truth. Very few, if any, complex questions have objectively true answers, especially those that can be found in data. 

So I can’t turn data into answers for my overall question, “What kind of music do I like?”, but I can turn it into answers for simpler questions that are rooted in fact. The questions I listed earlier are much easier to answer with data, with relative certainty, because I broke up the complex, somewhat subjective question into many objective questions. 

Consider the data you have

After you have your questions, look for the answers! Consider the data that you have, and whether or not it is sufficient and appropriate to answer the questions. 

The flexibility of Splunk software means that you don’t have to consider the questions you’ll ask of the data before you ingest it. Structured or unstructured, you can ask questions of your data, but you might have to work harder to fully understand the context of the data to accurately interpret it. 

Before you analyze and interpret the data, you’ll want to gather context about the data, like:

  • Is the dataset complete? If not, what data is missing?
  • Is the data correct? If not, in what ways could it be biased or inaccurate?
  • Is the data similar to other datasets you’re using? If not, how is it different?

This additional metadata (data about your datasets) can provide crucial context necessary to accurately analyze and interpret data in an unbiased way. For example, if I know there is data missing in my analysis, I need to consider how to account for that missing data. I can add additional (relevant and useful) data, or I can acknowledge how the missing data might or might not affect the answers I get.

After gathering context about your datasets, you’ll also want to consider if the data is appropriate to answer the question(s) that you want to answer. 

In my case, I’ll want to assess the following aspects of the datasets: 

  • Is using the audio features API data from Spotify the best way to identify characteristics in music I listen to? 
  • Could another dataset be better? 
  • Should I make my own dataset? 
  • Does the data available to me align with what matters for my data analysis? 

You can see one small way that the journalist Matt Daniels of The Pudding considered which data was relevant to answer the question “How popular is male falsetto?” for the Vox YouTube series Earworm, starting at 1:45 in this clip. For about 90 seconds, Matt and the host of the show, Estelle Caswell, discuss the process of selecting the right data to answer their question, eventually choosing a smaller, but more relevant, dataset. 
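On the first of those questions, here’s a minimal sketch of what pulling audio features from Spotify can look like, using the spotipy client library; the credentials and track ID are placeholders:

```python
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Placeholder credentials: register an app at developer.spotify.com.
auth = SpotifyClientCredentials(client_id="YOUR_ID", client_secret="YOUR_SECRET")
sp = spotipy.Spotify(auth_manager=auth)

# Hypothetical track IDs; in practice these would come from my listening data.
track_ids = ["4uLU6hMCjMI75M1A2tKUQC"]

for features in sp.audio_features(track_ids):
    # Spotify describes each track with characteristics like
    # danceability, energy, and tempo.
    print(features["danceability"], features["energy"], features["tempo"])
```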

Is more data always better? 

Data is valuable when it’s in context and applied with consideration for the problem that I’m trying to solve. Collecting data about my schedule may seem overly intrusive or irrelevant, but if it’s applied to the broader question of “what kind of music do I like to listen to?”, it can add valuable insights and possibly shift the overall answer, because I’ve applied that additional data with consideration for the question that I’m trying to answer.

Splunk published a white paper to accompany the rebranding, and it contains some excellent points. One of them that I want to explore further is the question:

“how complete, how smart, are these decisions if you’re ignoring vast swaths of your data?” 

On the one hand, having more data available can be valuable. I am able to get a more valuable answer to “what kind of music do I like” because I’m able to consider additional, seemingly irrelevant data about how I spend my time while I’m listening to music. However, there are many times when you want to ignore vast swaths of your data. 

The most important aspect to consider when adding data to your analysis is not quantity, but quality. Rather than focusing on how much data you might be ignoring, I’d suggest focusing on which data you might be ignoring, for which questions, and affecting which answers. You might have a lot of ignored data, but put your focus on the small amount of data that can make a big difference in the answers you find.

As the academics in “I got more data, my model is more refined, but my estimator is getting worse! Am I just dumb?” make clear with their crucial finding:

“More data lead to better conclusions only when we know how to take advantage of their information. In other words, size does matter, but only if it is used appropriately.”

The most important aspect of adding data to an analysis is exactly as the academics point out: it’s only more helpful if you know what to do with it. If you aren’t sure how to use additional data you have access to, it can distract you from what you’re trying to answer, or even make it harder to find useful answers because of the scale of the data you’re attempting to analyze. 

Douglas Hubbard in the book How to Measure Anything makes the case that doing data analysis is not about gathering the most data possible to produce the best answer possible. Instead, it’s about measuring to reduce uncertainty in the possible answers and measuring only what you need to know to make a better decision (based on the results of your data analysis). As a result, such a focused analysis often doesn’t require large amounts of data — rough calculations and small samples of data are often enough. More data might lead to greater precision in your answer, but it’s a tradeoff between time, effort, cost, and precision. (I also blogged about the high-level concepts in the book).

If I want to answer my question “What kind of music do I like to listen to?” I don’t need the listening data of every user on the Last.fm service, nor do I need metadata for songs I’ve never heard to help me identify song characteristics I might like. Because I want to answer a specific question, it’s important that I identify the specific data that I need to answer it—restricted by affected user, existence in another dataset, time range, type, or whatever else.

If you want more evidence, the notion that more data is always better is also neatly upended by the Nielsen Norman Group in Why You Only Need to Test with 5 Users and the follow-up How Many Test Users in a Usability Study?.

Keep context alongside the data

Indeed, the white paper talks about bringing people to a world where they can take action without worrying about where their data is, or where it comes from. But it’s important to still consider where the data comes from, even if you don’t have to worry about it because you use Splunk software. It’s relevant to data analysis to keep context about the data alongside the data.

For example, it’s important for me to keep track of the fact that the song characteristics I might use to identify the type of music I like come from a dataset crafted by Spotify, or that my listening behavior is tracked by the service Last.fm. Last.fm can only track certain types of listening behavior on certain devices, and Spotify has their own biases in creating a set of audio characteristics.

If I lose track of this seemingly-mundane context when analyzing my data, I can potentially incorrectly interpret my data and/or draw inaccurate conclusions about what kind of music I like to listen to, based purely on the limitations of the data available to me. If I don’t know where my data is coming from, or what it represents, then it’s easy to find biased answers to questions, even though I’m using data to answer them.

Having more data than you need also makes keeping context close to your data more difficult. The more data, the more room for error when trying to track contextual meaning. Splunk software includes metadata fields that can help you keep some context with the data, such as where it came from, but other types of context you’d need to track yourself.

More data can not only complicate your analysis, but it can also create security and privacy concerns if you keep a lot of data around and for longer than you need it. If I want to know what kind of music I like to listen to, I might be comfortable doing data analysis to answer that question, identifying the characteristics of music that I like, and then removing all of the raw data that led me to that conclusion out of privacy or security concerns. Or I could drop the metadata for all songs that I’ve ever listened to, and keep only the metadata for some songs. I’d want to consider, again, how much data I really need to keep around. 

Turn data into answers—mostly

So I’ve broken down my overall question into smaller, more answerable questions, I’ve considered the data I have, and I’ve kept the context alongside the data I have. Now I can finally turn it into answers, just like I was promised!

It turns out I can take a corpus of my personal listening data and combine it with a dataset of my personal music libraries to weight the songs in the listening dataset. I can also assess the frequency of listens to further weight the songs in my analysis and formulate a ranking of songs in order of how much I like them. I’d probably also want to split that ranking by what I was doing while I was listening to the music, to eliminate outliers from the dataset that might bias the results. All the small questions that feed into the overall question are coming to life.
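A minimal sketch of that weighting and ranking step, assuming hypothetical CSV exports of my scrobbles and library; the file names, columns, and weights are invented for illustration:

```python
import pandas as pd

# Hypothetical exports: one row per listen, and one row per library track.
listens = pd.read_csv("scrobbles.csv")  # columns: artist, track, activity
library = pd.read_csv("library.csv").drop_duplicates()  # columns: artist, track

# Count how often I listened to each track.
ranking = (listens.groupby(["artist", "track"])
           .size().rename("listen_count").reset_index())

# Weight tracks that also appear in my library (the 2x weight is arbitrary).
merged = ranking.merge(library, on=["artist", "track"], how="left", indicator=True)
ranking["in_library"] = merged["_merge"] == "both"
ranking["score"] = ranking["listen_count"] * ranking["in_library"].map({True: 2.0, False: 1.0})

# The top of this ranking approximates the songs I like most.
print(ranking.sort_values("score", ascending=False).head(20))
```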

After I have that ranking, I could use additional metadata from another source, such as the Spotify audio features API, to identify the characteristics of the top-ranked songs, and ostensibly then be able to answer my overall question: what kind of music do I like to listen to?

By following all these steps, I turned my data into answers! And now I can turn my data into doing, by taking action on those characteristics. I can of course seek out new music based on those characteristics, but I can also book the ideal DJs for my birthday party, create or join a community of music lovers with similar taste in music, or even delete any music from my library that doesn’t match those characteristics. Maybe the only action I would take is self-reflection, and see if what the data has “told” me is in line with what I think is true about myself.

It is possible to turn data into answers, and turn data into doing, with caution and attention to all the ways that bias can be introduced into the data analysis process. But there’s still one more way that data analysis could result in biased outcomes: communicating results. 

Carefully communicate data findings

After I find the answers in my data, I need to carefully communicate them to avoid bias. If I want to tell all my friends that I figured out what kind of music I like to listen to, I want to make sure that I’m telling them that carefully so that they can take the appropriate and ethical action in response to what I tell them. 

I’ll want to present the answers in context. I need to describe the findings with the relevant qualifiers: I like music with these specific characteristics, and when I say I like this music I mean this is the kind of music that I listen to while doing things I enjoy, like working out, writing, or sitting on my couch. 

I also need to make clear what kind of action might be appropriate or ethical to take in reaction to this information. Maybe I want to find more music that has these characteristics, or I’d like to expand my taste, or I want to see some live shows and DJ sets that would feature music that has these characteristics. Actions that support those ends would be appropriate, but can also risk being unethical. What if someone learns of these characteristics, and chooses to then charge me more money than other people (whose taste in music is unknown) to see specific DJ sets or concerts featuring music with those characteristics? 

Data, per the white paper, “must be brought not only to every action and decision, but to every department.” Because of that, it’s important to consider how that happens. Share relevant parts of the process that led to the answers you found from the data. Communicate the results in a way that can be easily understood by your audience. This Medium post by Cecelia Shao, a product manager at Comet.ml, covers important points about how to communicate the results of data analysis. 

Use data for good

I wanted to talk through the data analysis process in the context of the rebranded slogans and marketing content so that I could unpack additional nuance that marketing content can’t convey. I know how easy it is to introduce bias into data analysis, and how easily data analysis can be applied to unethical questions, or used to take unethical actions.

As the white paper aptly points out, the value of data is not merely in having it, but in how you use it to create positive outcomes. You need to be sure you’re using data safely and intelligently, because with great access to data comes great responsibility. 

Go forth and use the data-to-everything platform to turn data into doing…the right thing. 

Disclosure: I work for Splunk. Thanks to my colleagues Chris Gales, Erica Chen, and Richard Brewer-Hay for the feedback on drafts of this post. While colleagues reviewed this post and provided feedback, the content is my own and represents my own views rather than those of Splunk the company. 

Detailed data types you can use for documentation prioritization

Data analysis is a valuable way to learn more about which documentation tasks to prioritize above others. My post (and talk) Just Add Data, presented at Write the Docs Portland in 2019, covers this broadly. In this post I want to cover in detail a number of different data types that can lead to valuable insights for prioritization.

This list of data types is long, but I promise each one contains value for a technical writer. These types of data might come from your own collection, a user research organization, the business development department, marketing organization, or product management organization:

  • User research reports
  • Support cases
  • Forum threads and questions
  • Product usage metrics
  • Search strings
  • Tags on bugs or issues
  • Education/training course content and questions
  • Customer satisfaction survey

More documentation-specific data types:

  • Documentation feedback
  • Site metrics
  • Text analysis metrics
  • Download/last accessed numbers
  • Topic type metrics
  • Topic metadata
  • Contribution data
  • Social media analytics

Many of these data types are best used in combination with others.

User research reports

User research reports can contain a lot of valuable data that you can use for documentation. 

  • Types of customers being interviewed
  • Customer use cases and problems
  • Types of studies being performed

These reports can give you insight into what the company finds valuable to study (and thus some insight into internal priorities), as well as direct customer feedback about things that are confusing or the ways that customers use the product. The types of customers that are interviewed can provide valuable audience or persona-targeting information, allowing you to better calibrate the information in your documentation. See How to use data in user research when you have no web analytics on the Gov.UK site for more details about what you can do with user research data.

Support cases

Support cases can help you better understand customer problems. Specific metrics include:

  • Number of cases
  • Frequency of cases
  • Categories of questions
  • Customer environments and licenses

With these you can compile metrics about specific customer problems, the frequency of problems, and the types of customers and customer environments that are encountering specific problems, allowing you to better understand target customers, or customers that might be using your documentation more than others. Support cases are also rich data for common customer problems, providing a good way to gather new use cases and subjects for topics. 

Forum threads and questions

These can be internal forums (like Splunk Answers for Splunk) or external ones, like Reddit or StackOverflow.

  • Common questions
  • Common categories
  • Frequently unanswered questions
  • Post titles

If you’re trying to understand what people are struggling with, or get a better sense of how people are using specific functionality, forum threads can help you understand. The types of questions that people ask and how they phrase them can also help make it clear what kinds of configuration combinations might make specific functions harder for customers. Based on the question types and frequencies that you see, you might be able to fine-tune existing documentation to make it more user-centric and easily findable, or supplement content with additional specific examples. 

Product usage metrics

Some examples of product usage metrics are as follows:

  • Time in product
  • Intra-product clicks
  • Types of data ingested
  • Types of content created
  • Amount of content created

Even if you don’t have usage data from inside the product itself, you can gather metrics about how people are interacting with the purchase and activation process, and extrapolate accordingly.

  • Number of downloads and installs
  • License activations and types
  • Daily and monthly active users

You can use this type of data to better understand how people are spending their time in your product, and what features or functionality they’re using. Beyond knowing whether a customer has purchased or installed the product, it’s even more valuable to find out if they’re actually using it, and if so, how.

If your product is only in beta, and you want more data to help you prioritize an overall documentation backlog, such as topics that are tied to a specific release, you can use some product usage data to understand where people are spending more of their time, and draw conclusions about what to prioritize based on that.

Maybe the under-utilized features could use more documentation, or more targeted documentation. Maybe the features themselves need work.  Be careful not to draw overly-simplistic conclusions about the data that you see from product usage metrics. Keep context in mind at all times. 

Search strings

You can gather search strings from HTTP referer data from web searches performed on external search sites such as Google or DuckDuckGo, or from internal search services. It’s pretty unlikely that you’ll be able to gather search strings from external sites given the widespread implementation of HTTPS, but internal search services can be vital and valuable data sources for this.

Look at specific search strings to find out what people are looking for, and what people are searching that’s landing them on specific documentation pages. Maybe they’re searching for something and landing on the wrong page, and you can update your topic titles to help.
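As a sketch, summarizing an internal search log could look like the following; the log format and file name are hypothetical:

```python
import csv
from collections import Counter

# Hypothetical export from an internal search service:
# one row per search, with the query and the page the reader landed on.
searches = Counter()
with open("site_search_log.csv", newline="") as f:
    for row in csv.DictReader(f):  # columns: query, landing_page
        searches[(row["query"], row["landing_page"])] += 1

# The most common query/landing-page pairs show what people look for
# and where the search engine sends them.
for (query, page), count in searches.most_common(20):
    print(f"{count:5d}  {query!r} -> {page}")
```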

JIRA or issue data

You can use metrics from your issue tracking services to better understand product quality, as well as customer confusion.

  • Number of issues/bugs
  • Categories/tags/components of issues/bugs
  • Frequency of different types of issues being created/closed

Issue tags or bug components can help you identify categories of the product where there are lots of problems or perhaps customer confusion. This is especially useful data if you’re an open source product and want to get a good understanding of where there are issues that might need more decision support or guidance in the documentation. 

Training courses

If you have an education department, or produce training courses about your product, these are quite useful to gather data from. Some examples of data you might find useful:

  • Questions asked by customers
  • Questions asked by course developers
  • Use cases covered by content in courses
  • Enrollment in courses
  • Categories of courses offered

It’s also useful to correlate this data with other sources to help identify verticals of customers interested in different topics. Because education and training courses cover more hands-on material, they can be an excellent source of use case examples, as well as of occasions where decision support and guidance are needed. 

Customer surveys

Customer surveys include satisfaction surveys and sentiment analysis surveys. By reviewing the qualitative statements and the types of questions asked in the surveys, you can gain valuable insights and information like:

  • What do people think about the product?
  • What do people want more help with?
  • How do people think about the product?
  • How do people feel about the product?
  • What does the company want to know from customers? 
  • What are the company priorities?

This can also help you think about how the documentation you write has a real effect on people’s interactions with the product, and can shift sentiment in one way or another.

Documentation feedback

Direct feedback on your documentation is a vital source of data if you can get it. 

  • Qualitative comments about the documentation
  • Usefulness votes (yes/no)
  • Ratings

Even if you don’t have a direct feedback mechanism on your website, you can collect documentation feedback from internal and external customers by paying attention in conversations with people and even asking them directly if they have any documentation feedback. Qualitative comments and direct feedback can be vital for making improvements to specific areas. 

Site metrics

If your documentation is on a website, you can use web access logs to gather important site metrics, such as the following:

  • Page views
  • Session data like time on page
  • Referer data
  • Link clicks
  • Button clicks
  • Bounce rate
  • Client IP

Site metrics like page views, session data, referer data, and link clicks can help you understand where people are coming to your docs from, how long they are staying on the page, how many readers there are, and where they’re going after they get to a topic. You can also use this data to understand better how people interact with your documentation. Are readers using a version switcher on your page? Are they expanding or collapsing information sections on the page to learn more? Maybe readers are using a table of contents to skip to specific parts of specific topics.  

You can split this data by IP address to understand groups of topics that specific users are clustering around, to better understand how people use the documentation.
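As a rough sketch, here’s one way to pull page views and referers out of a standard combined-format access log; the log path is a placeholder:

```python
import re
from collections import Counter

# Matches the common "combined" access log format; fields beyond the
# client IP, request path, status, and referer are ignored here.
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)

page_views = Counter()
referers = Counter()
with open("access.log") as f:
    for line in f:
        m = LINE.match(line)
        if m and m.group("status") == "200":
            page_views[m.group("path")] += 1
            referers[m.group("referer")] += 1

print(page_views.most_common(10))  # most-viewed docs pages
print(referers.most_common(10))    # where readers come from
```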

Text analysis metrics

Data about the actual text on your documentation site is also useful to help understand the complexity of the documentation on your site.

  • Flesch-Kincaid readability score
  • Inclusivity level
  • Length of sentences and headers
  • Style linter

You can assess the readability or usability of the documentation, or even the grade level score for the content to understand how consistent your documentation is. Identify the length of sentences and headers to see if they match best practices in the industry for writing on the web. You can even scan content against a style linter to identify inconsistencies of documentation topics against a style guide.
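A sketch of the readability piece, using the textstat library; the file name is a placeholder:

```python
import textstat

# Hypothetical export of a single documentation topic as plain text.
with open("topic.txt") as f:
    text = f.read()

# Flesch-Kincaid grade level: roughly the US school grade needed
# to comfortably read the text.
print("Grade level:", textstat.flesch_kincaid_grade(text))

# Average sentence length as a quick check against web-writing guidance.
words = textstat.lexicon_count(text)
sentences = textstat.sentence_count(text)
print("Average sentence length:", words / max(sentences, 1))
```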

Download metrics

If you don’t have site metrics for your documentation site, because the documentation is published only via PDF or another medium, you can still use metrics from that. 

  • Download numbers 
  • Download dates and times
  • Download categories and types

You can use these metrics to gather interest about what people want to be reading offline, or how frequently people are accessing your documentation. You can also correlate this data with product usage data and release cycles to determine how frequently people access the documentation compared with release dates, and the number of people accessing the documentation compared with the number of people using a product or service.

Topic type metrics

If you use strict topic typing at your documentation organization, you can use topic type metrics as an additional metadata layer for documentation data analysis. Even if you don’t, you can manually categorize your documentation by type to gather this data.

  • What are the topic types?
  • How many topic types are there?
  • How many topics are there of each type?

Understanding topic types can help you understand how reader interaction patterns can vary for your documentation by type, or whether your developer documentation has predominantly different types of documentation compared with your user documentation, and better understand what types of documentation are written for which audiences.

Topic metadata

Metadata about documentation topics is also incredibly valuable as a correlation data source. You can correlate topic metadata like the following information:

  • Topic titles
  • Average topic length
  • Last updated and creation dates
  • Versions that different topics apply to

You can correlate it with site metrics, to see if longer topics are viewed less-frequently than shorter topics, or identify outliers in those data points. You can also manually analyze the topic titles to identify if there are patterns (good or bad) that exist.
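A sketch of that correlation, assuming you’ve already assembled topic metadata and page views into CSV files; the names and columns are hypothetical:

```python
import pandas as pd

topics = pd.read_csv("topic_metadata.csv")  # columns: topic, word_count
views = pd.read_csv("page_views.csv")       # columns: topic, views

merged = topics.merge(views, on="topic")

# Pearson correlation between topic length and page views: a strongly
# negative value would suggest longer topics are viewed less frequently.
print(merged["word_count"].corr(merged["views"]))
```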

Contribution data

If you have information about who is writing documentation, and when, you can use these types of data:

  • Last updated dates
  • Authors/contributors
  • Amount of information added or removed

Contribution data can tell you how frequently specific topics were updated to add new information, and by whom, and how much information was added or removed. You can identify frequency patterns, clusters over time, as well as consistent contributors.

It’s useful to split this data by other features, or correlate it with other metrics, especially site metrics. You can then identify things like:

  • Last updated dates by topic
  • Last updated dates by product
  • Last updated dates over time

to see if there are correlations between updates and page views. Perhaps more frequently updated content is viewed more often.
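If your documentation lives in version control, much of this contribution data is already recorded. Here’s a sketch of pulling the last update for a single topic out of git; the path is a placeholder:

```python
import subprocess

# Last-updated date and author for one topic, straight from git history.
out = subprocess.run(
    ["git", "log", "-1", "--format=%cI %an", "--", "docs/topic.md"],
    capture_output=True, text=True, check=True,
)
print(out.stdout.strip())  # e.g. "2019-05-21T10:02:00-07:00 Some Author"
```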

Social media analytics

  • Social media referers
  • Link clicks from social media sites

If you publicize your documentation using social media, you can track interest in the documentation from those sites. You can check whether social media referers are leading people to your documentation, and see whether or not people are getting to your documentation in that way. Maybe your support team is responding to people on Twitter with links to your documentation, and you want to better understand how frequently that happens and how frequently people click through those links to the documentation.

You can also identify whether or not, and how, people are sharing your documentation on social media by using data crawled or retrieved from those sites’ APIs, and looking for instances of links to your documentation. This can help you get a better sense of how people are using your documentation, how they’re talking about it, how they feel about it, and whether or not you have an organic community out there on the web sharing your documentation. 

Beyond documentation data

I hope that this detail has given you a better understanding of the different types of data, beyond documentation data, that are available to you as a technical writer to draw valuable conclusions from. By analyzing these types of data, you are not only prepared to prioritize your documentation task list, but also better able to understand the customers of your product and documentation. Even if only some of these are available to you, I hope they are useful. Be sure to read Just Add Data: Using data to prioritize your documentation for the full explanation of how to use data in this way. 

Just Add Data: Using data to prioritize your documentation

This is a blog post adaptation of a talk I gave at Write the Docs Portland on May 21, 2019. The talk was livestreamed and recorded, and you can view the recording on YouTube: Just Add Data: Make it easier to prioritize your documentation – Sarah Moir

Prioritizing documentation is hard. How do you decide what to work on if there isn’t a deadline looming? How do you decide what not to work on when your list of work just keeps growing? How do you identify what new content you might want to add to your documentation?

By adding data to the process, it’s possible to prioritize your documentation tasks with confidence!

Prioritizing without data

Prioritizing a backlog without data can involve asking yourself some questions, like what will take the least amount of time? Or, what did someone most recently request? If I’m doing this, I might ask my product manager what to work on, or do whatever task seems easiest at the time. I might even focus on whichever task I can complete without talking to other people, because I’m tired. 

Based on the answers to those questions, I’ll end up with a prioritized backlog, but lack confidence that what I’ve chosen to work on will actually bring the most value to customers and the documentation. Especially if I’m choosing not to do work, it can be a challenge to keep ignoring an item in the backlog that doesn’t fit with what I think I need to be working on, without some sort of “proof” that it’s okay to ignore. To make this process easier, I add data.

Why prioritize with data?

Using data to prioritize a documentation backlog can help give you more confidence in your decisions and help you justify why you’re not working on something. It can challenge your assumptions about what you should be working on, or validate them. Adding data can help improve your overall understanding of how customers are using your product and the documentation, leading to benefits beyond the backlog.

Data types for prioritization

What kinds of data am I talking about? All kinds of data! If you skim the following list, you’ll notice that this data goes beyond quantitative sources. When I talk about data, I’m including all kinds of information: qualitative comments, usage metrics, metadata, website access logs, survey results, database records; all of these and more fit my definition of data. Here’s the full list:

  • User research reports
  • Support cases
  • Forum threads and questions
  • Product usage metrics
  • Search strings
  • Tags on bugs or issues
  • Education/training course content and questions
  • Customer satisfaction survey
  • Documentation feedback
  • Site metrics
  • Text analysis metrics
  • Download/last accessed numbers
  • Topic type metrics
  • Topic metadata
  • Contribution data
  • Social media analytics

Some of these data types are more relevant to different types of organizations and documentation installations. For example, open source projects might have more useful issue tags, or organizations that use DITA will have easier access to topic type information.

This list of data types demonstrates the different types of information that can help you prioritize documentation, but I don’t want you to think that you need to do large-scale collection or implementation work to get valuable data worth incorporating into your prioritization process.

I’ll cover a couple of these data types in more detail here, but I talk about all of them in another post: Detailed data types you can use for documentation prioritization.

Product usage data

You can use usage data for products (also called telemetry) to find out where people are spending their time. What features or functionality are they using? Even if they’ve purchased or installed the product, are they actually using it?

Some examples of product usage data include:

  • Time in product
  • Intra-product clicks
  • Types of data ingested
  • Types of content created (e.g., dashboards, playlists)
  • Amount of content created (e.g., dashboards, playlists)

In addition to data about how people are interacting with the product, you can also gather product usage data without actual introspection into how people are using it. If you have information about how many people have downloaded a product or are logging in to a service:

  • Number of downloads and installs
  • License activations and types
  • Daily and monthly active users

I mostly talk about using data to help you prioritize the more ambiguous parts of a backlog that might not be tied to a release, but with the help of product usage data, you can better prioritize release-focused documentation as well. If your product is in beta, and you want more data to help you prioritize your overall documentation backlog, you can use some product usage data to understand where people are spending more of their time, and draw conclusions about what to spend more or less time on, or what level of detail to include in the documentation, to achieve your overall documentation goals for the release. 

Site metrics

Site metrics like page views, session data, HTTP referer data, and link clicks can help you understand where people are coming to your docs from, how long they’re staying on the page, how many readers there are, and what they’re doing after they get to a topic. Here are some example site metrics:

  • Page views
  • Session data like time on page
  • Referer data
  • Link clicks
  • Button clicks
  • Bounce rate
  • Client IP

You can also use this data to understand better how people interact with your documentation, like whether they’re using a version switcher on your page or expanding/collapsing more information hidden on the page. 

You can also split this data by IP address to understand groups of topics that specific users are clustering around, to better understand how people use the documentation.

Identify questions based on your backlog

The process of adding data to your documentation prioritization strategy is all about making do with what you have to answer what you want to know. What you want to know depends on your backlog.

Data analysis is focused on a goal. You don’t want to collect a lot of data and then just stare at it, or get stressed out by the amount of “insights” you could be gathering while you’re not really sure what to do with the information. If you consider the questions that you want to answer in advance, you can focus your data collection and analysis in a more valuable way. 

Some example questions that you might identify based on your task list:

  • What are people looking for? Are they finding what they’re looking for?
  • Are people looking for information about <thing I’ve been told to document>?
  • What do people want more help with?
  • What market groups are we targeting that don’t see their use cases represented?

Tie questions to data types

After you’ve identified questions relevant to your task list, you can tie those questions to data types that can help you answer the questions.

For example, the question: What are people looking for and not finding?

To answer this, you can look where people are looking for information, namely search keywords that they’re typing into search engines, common questions being posted on forums, or the topics of support cases filed by customers.

For example, I looked at some data and was able to identify specific search terms people are using on the documentation site that are routing customers to a company-managed forum site.  I can then use that data to identify cases where people are looking for documentation about something, but are not finding the answers in the documentation.

Another example question: What do people want more help with? 

You could answer this by looking at the topics of support cases again, but also at the types of questions asked in training courses and unanswered questions on forums.

As a final example: Which market groups are we targeting that don't see their use cases represented?

To answer this, you could look at data about sales leads, questions from the field that mention specific use cases for various market verticals, and questions asked in training courses.

Find questions from data

If you don’t have much of a task list to work with, or if you aren’t able to get access to data that can help you answer your questions, you can still make use of the data that is available to you and draw valuable insights from it.

You can identify interest in content that you weren't aware of, then plan to write new content or modify existing content to address it. Maybe there are a bunch of forum threads about how to do something, but nothing authoritative in the documentation. That information hasn't reached the docs writers through any other channel, but because you're looking at the available data, you can see that it's important.

Even if you have no data specifically relevant to the documentation or customer questions, you can still find ways to identify documentation work to add to a task list. You could create datasets by performing text analysis on all or specific documentation topics, and identify complexity issues, or topics that don’t adhere to a style guide. You could use customer satisfaction surveys to identify places where documentation architecture or linking strategies could be improved.
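
As a hedged sketch of creating your own dataset, you could approximate a complexity check by computing the average sentence length of each topic. The directory and file pattern here are hypothetical stand-ins for wherever your source files live.

    import glob
    import re

    # Rough complexity proxy: average words per sentence for each topic file.
    for path in glob.glob("docs/*.txt"):
        with open(path, encoding="utf-8") as f:
            text = f.read()
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        if sentences:
            avg = len(text.split()) / len(sentences)
            print(f"{path}: {avg:.1f} words per sentence")

Topics with unusually long sentences might be candidates for a readability pass, even before anyone asks for one.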

Working with the data

Now you hopefully have a better understanding of the different types of data available to you, and how to identify valuable data sources based on the questions you want to answer. But how much data do you need to collect? How do you get the data? And most importantly, how do you analyze it to answer your questions?

How much data?

How much data do you need to collect? You don’t need to collect data forever. You don’t need ALL the data. You just need enough data to point you in a direction and reduce uncertainty.

You can use a small sample of users, or a small sample of time, so long as it helps you answer your question and reduce uncertainty about what the answer could be. Collecting larger amounts of data doesn’t mean that you reduce uncertainty by an equally large degree. The amount of data you collect doesn’t correlate directly to what you’re able to learn from it. However, if the question you’re trying to answer with data concerns all the documentation users over a long period of time, you will be collecting more data than if you just want to know what a specific subset of readers found interesting on a Friday afternoon.

Try for representative samples that are relevant for the questions you’re trying to answer. If you can’t get representative data, try for a random sample. If you can’t get representative or random samples, acknowledge the bias that is inherent in the data you’re using. Add context to the data wherever possible, especially about who the data represents and why the data is still valuable if it isn’t representative.
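
If you can only work with a sample, something as simple as Python's random.sample can get you started. The log file and sample size here are assumptions for illustration.

    import random

    # Take a random sample of 200 page-view records instead of analyzing everything.
    with open("page_views.log", encoding="utf-8") as f:
        records = f.readlines()

    sample = random.sample(records, k=min(200, len(records)))
    print(f"Sampled {len(sample)} of {len(records)} records")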

You might find that collecting a small amount of data leaves you with more questions than answers, and that’s okay too. It’s an opportunity to continue exploring and learning more about your customers and your documentation tasks. But how do you even get any data at all?

How do you get the data?

You’ll either be collecting your own data, or asking others for the data you need.

If it’s data about the documentation site or its content, you might own that data yourself, and already have access to it. If it’s other types of data, like sales leads or user research data, it’s time to talk to the departments or people that manage those areas.

  • A business development department might have reporting on internal tools like sales leads or support cases.
  • Product managers can share direct customer data and product usage data if you don’t have direct access.
  • Project managers can share data related to internal development processes.

The teams managing different datasets vary by organization, and in many cases might even be you. Some teams may be reluctant to share data. With that in mind, remember that you don't need persistent access to all the data you want. Focus on getting some access to some data that helps answer your questions. You can then use that data to make your work more efficient and informed, communicate that value, and hopefully get more access in the future.

What to use for data analysis?

What do you use to analyze that data after you get it? How do you transform data into a report of useful information?

Some tools, like Google Analytics, have analytics and reporting built in. That can certainly make the data easier to analyze!

For other types of data that you need to analyze yourself, use the tools available to you. Think about what you already know how to use, or have access to:

  • Know how to use Excel? Perfect! Get started collecting and processing data in spreadsheets and with macros.
  • Know how to write scripts in R/Python to analyze data? Great! You can write scripts to collect, process, and visualize this data.
  • Is your organization using a tool like Splunk, Elasticsearch, or Tableau? Good news! You're ready for data analysis.

You don't have to spend a long time learning a new tool to analyze data for these purposes. If you keep incorporating data analysis into your work, learning one might eventually make sense, but it isn't necessary to get started.
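
For example, if you go the scripting route, a few lines of Python with pandas can replicate much of what you'd do in a spreadsheet. The CSV and column names here are hypothetical.

    import pandas as pd

    # Load an export of page views and summarize it, spreadsheet-style.
    df = pd.read_csv("page_views.csv", parse_dates=["timestamp"])
    summary = df.groupby("topic")["timestamp"].count().sort_values(ascending=False)
    print(summary.head(10))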

Tools aren’t magic

It's also important to note that tools aren't magic. Some degree of data analysis will involve manually collecting, categorizing, or cleaning the data. If your organization doesn't have strict topic types, you might need to perform manual topic-typing. If the information you want to analyze isn't in a machine-readable format, you might have to sit at your desk copy-pasting for hours.

Depending on your skills, the current state of the data that you want to analyze, and the tools available to you, the amount of time it takes to analyze data and get results can vary widely. I have spent 3 days manually processing data in Excel, and I’ve spent 2 hours creating searches in mostly-clean datasets in Splunk to get answers to various questions. Keep that in mind when you’re analyzing data.

How to perform data analysis

When you analyze data, what are you actually looking at? 

Top, rare, outlying values

Find out which values are most common, and which are least common. You can establish these by counting the instances of each value.

Also look for values that differ from the others by a large margin. You can use the standard deviation to find these: values several standard deviations from the mean stand out as outliers.
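
A minimal sketch of both steps, assuming you've already tallied views per topic (the numbers here are made up):

    from statistics import mean, stdev

    views = {"install": 950, "upgrade": 400, "search": 700,
             "alerts": 650, "glossary": 35, "cron-syntax": 5000}

    counts = list(views.values())
    mu, sigma = mean(counts), stdev(counts)

    # Most and least common values, by simple counting.
    ranked = sorted(views.items(), key=lambda kv: kv[1], reverse=True)
    print("Top:", ranked[:3])
    print("Rare:", ranked[-3:])

    # Outliers: values more than 2 standard deviations from the mean.
    outliers = {t: v for t, v in views.items() if abs(v - mu) > 2 * sigma}
    print("Outliers:", outliers)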

Patterns and clusters across data

You can also look for patterns and clusters in your data.

If you're working with qualitative data, you might need to categorize, or code, the data so that you can sort it and look for patterns in the results. You can identify these patterns by counting instances of categories, or by looking at clusters of behavior. For example, if you look at documentation topic visits over time, a spike in visits at a particular time is a cluster of behavior.

Split by different features

You also want to segment data by different features. That is, you can better understand the most common values if you split them by other types of information. For example, you can look at the most commonly visited topics in your documentation set over the last 3 months as a whole, or on a week-by-week basis. That additional split can help you understand how those values change over time. If you identify a spike in a particular topic or category of topics, you can then interpret the data. Maybe a new product release led to a spike of interest in the release notes topic that wasn't easy to identify until you split the results by week. This is also a good opportunity to point out to a product team that people really do read your documentation!

That's an example of splitting by time, but you can split by any other field available in your data. To use the same data type: looking at the most common topics by product, by client IP, or by other factors can also lead to valuable insights.
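
As a hedged pandas sketch of that week-by-week split (the CSV and column names are assumptions):

    import pandas as pd

    df = pd.read_csv("page_views.csv", parse_dates=["timestamp"])

    # Most visited topics over the whole period.
    overall = df["topic"].value_counts().head(10)

    # The same data split week by week, to surface spikes over time.
    weekly = (
        df.groupby([pd.Grouper(key="timestamp", freq="W"), "topic"])
          .size()
          .unstack(fill_value=0)
    )
    print(overall)
    print(weekly.tail())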

Combine data types

You can combine different types of data to understand approximately how many people are using the product versus how many of them are using the documentation. Comparing sales leads, product usage data, and existing page views could help you approximate the number of potential and existing customers alongside the number of distinct documentation readers.

When you combine data across datasets, keep track of units and time ranges, and make sure you compare like with like. For example, be careful not to mix data about potential customers with data about existing customers; without that context, the results can be misleading.
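
A minimal sketch of that kind of combination, assuming both datasets have already been aggregated to the same unit (counts) and the same time range (months). All file and column names are hypothetical.

    import pandas as pd

    product_usage = pd.read_csv("monthly_active_users.csv")  # columns: month, active_users
    doc_readers = pd.read_csv("monthly_doc_readers.csv")     # columns: month, distinct_readers

    # Inner join on month, so we only compare periods present in both datasets.
    combined = product_usage.merge(doc_readers, on="month", how="inner")
    combined["reader_ratio"] = combined["distinct_readers"] / combined["active_users"]
    print(combined)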

Interpreting results

When you interpret the results of your data analysis, make sure that you add context to the data. Especially when dealing with outliers, but even when reviewing data like rarely-viewed or frequently-viewed topics, keep in mind the additional context that could explain the results.

Add context from expertise

Use your expertise and knowledge of the documentation to add context. For example, topics concerning a specific functionality are likely to be more popular at a specific time if that functionality was recently changed.

Pursue alternate explanations

Whenever you're interpreting data, make sure you gut-check it against what you already know. If a relatively mundane topic has wildly out-of-the-ordinary page views, there are likely alternate explanations for that interest. Maybe your topic ended up being a great resource about cron syntax in general, even for people who don't use your product.

Draw realistic conclusions

Draw realistic conclusions based on the data available to you. You might not be able to get access to or combine specific datasets due to privacy concerns. If you carefully identify what problems you’re trying to solve, and select only the data sources that can help you solve those problems, you can reduce the potential that you’ll introduce bias into your data analysis, and improve the conclusions that you’re able to draw.

Don’t trust data blindly

Don't trust the data blindly. When you see data that looks out of the ordinary, or like an outlier, examine the different reasons why it could look that way. Who does the data represent? What does it represent? Make sure you interpret data in context, so that you understand exactly what it represents. It can be tempting to ignore data that doesn't match your biases or expectations.

Above all, remember to use data to complement your research and writing, and validate or challenge assumptions about your audience.

Your turn to add data

  1. Identify the questions you’re trying to answer
  2. Use the data available to you
  3. Use the tools available to you
  4. Analyze and interpret the data
  5. Take action and prioritize accordingly

Additional resources

The Concepts Behind the Book: How to Measure Anything

I just finished reading How to Measure Anything: Finding the Value of Intangibles in Business by Douglas Hubbard. It discusses fascinating concepts about measurement and observability, but they are tendrils that you must follow among mentions of Excel, statistical formulas, and somewhat dry consulting anecdotes. For those of you who might want to focus mainly on the concepts rather than the literal statistics and formulas behind implementing his framework, I wanted to share the concepts that resonated with me. If you want to read a more thorough summary, I recommend the summary on Less Wrong, also titled How to Measure Anything.

The premise of the book is that people undertake many business decisions and large projects with the idea that their success can't be measured, and thus they don't measure it. It seems a large waste of money and effort if you can't measure the success of such projects and decisions, and so he developed a consulting business and a framework, Applied Information Economics (AIE), to prove that you can measure such things.

Near the end of his book on page 267, he summarizes his philosophy as six main points:

1. If it’s really that important, it’s something you can define. If it’s something you think exists at all, then it’s something that you’ve already observed somehow.

2. If it’s something important and something uncertain, then you have a cost of being wrong and a chance of being wrong.

3. You can quantify your current uncertainty with calibrated estimates.

4. You can compute the value of additional information by knowing the “threshold” of the measurement where it begins to make a difference compared to your existing uncertainty.

5. Once you know what it’s worth to measure something, you can put the measurement effort in context and decide on the effort it should take.

6. Knowing just a few methods for random sampling, controlled experiments, or even just improving on the judgment of experts can lead to a significant reduction in uncertainty.

To restate those points:

  1. Define what you want to know. Consider ways that you or others have measured similar problems. What you want to know might be easier to see than you thought.
  2. It’s valuable to measure things that you aren’t certain about if they are important to be certain about.
  3. Make estimates about what you think will happen, and calibrate those estimates to understand just how uncertain you are about outcomes.
  4. Determine a level of certainty that will help you feel more confident about a decision. Additionally, determine how much information will be needed to get you there.
  5. Determine how much effort it might take to gather that information.
  6. Understand that it probably takes less effort than you think to reduce uncertainty.

The crux of the book is reframing measurement from "answering a specific question" to "reducing uncertainty based on what you know today".

Measure to reduce uncertainty

Before reading this book, I thought about data analysis as a way to find an answer to a question. I'd go in with a question, I'd find data, and thanks to that data, I'd magically know the answer. However, that approach only works with specifically-defined questions and perfect data. If I want to know "how many views did a specific documentation topic get last week", I can answer that straightforwardly with website metrics.

However, if I want to know "Was the guidance about how to perform a task more useful after I rewrote it?", there's really no way to know the answer. Or so I thought.

Hubbard’s book makes the crucial distinction that data doesn’t need to exist to directly answer that question. It merely needs to make you more certain of the likely answer. You can make a guess about whether or not it was useful, carefully calibrating your guess based on your knowledge of similar scenarios, and then perform data analysis or measurement to improve the accuracy of your guess. If you’re not very certain of the answer, it doesn’t take much data or measurement to make you more certain, and thus increase your confidence in an outcome. However, the more certain you are, the more measurement you need to perform to increase your certainty.

Start by decomposing the problem

If you think what you want to measure isn't measurable, Hubbard encourages you to think again, and decompose the problem. To use my example, I want to measure whether or not a documentation topic was more useful after I rewrote it. As his first point suggests, the problem is likely more observable than it might seem at first.

“Decompose the measurement so that it can be estimated from other measurements. Some of these elements may be easier to measure and sometimes the decomposition itself will have reduced uncertainty.”

I can decompose the question that I'm trying to answer and consider how I might measure the usefulness of a topic. Maybe something is more useful if it's viewed more often, if people share the link to the topic more frequently, or if qualitative comments in surveys or forums refer to it. I can think about how I might tell someone that a topic is useful, and what factors or information about the topic I might point to. Does it come up first when you search for a specific customer question? Maybe search rankings for relevant keywords are an observable metric that could help me measure the utility of a topic.

You can also perform extra research to think of ways to measure something.

“Consider your findings from secondary research: Look at how others measured similar issues. Even if their specific findings don’t relate to your measurement problem, is there anything you can salvage from the methods they used?”

Is it business critical to measure this?

Before I invest a lot of time and energy performing measurements, I want to make sure (per Hubbard's second point) that the question I'm attempting to answer, what I'm trying to measure, is important enough to merit measurement. This is also tied to points four, five, and six: does the importance of the knowledge outweigh the difficulty of the measurement? It often does, especially because (per his sixth point) the measurement is often easier to obtain than it might seem at first.

Estimate what you think you’ll measure

To Hubbard's third point, a calibrated estimate is important when you perform a measurement. I need to be able to estimate what "success" might look like, and what reasonable bounds of success I might expect.

Make estimates about what you think will happen, and calibrate those estimates to understand just how uncertain you are about outcomes.

To continue with my question about a rewritten topic’s usefulness, let’s say that I’ve determined that added page views, elevated search rankings, and link shares on social media will mean the project is a success. I’d then want to estimate what number of each of those measurements might be meaningful.

To use page views as an example for estimation: if page views increase by 1%, it might not be meaningful. But maybe 5% is a meaningful increase? I can use that as a lower bound for my estimate. I can also think about a likely upper bound. A 1000% increase would be unreasonable, but maybe I could hope that page views would double, a 100% increase! I can use that as an upper bound. By considering and dismissing the 1% and 1000% numbers, I'm also doing some calibration of my estimates, essentially gut-checking them against my expertise and existing knowledge. The summary of How to Measure Anything that I linked in the first paragraph addresses calibration of estimates in more detail, as does the book itself!

After I've settled on a range of measurement outcomes, I can assess how confident I am that the outcome will land in that range. Hubbard calls this a confidence interval. I might only be 60% certain that page views will increase by at least 5% but by no more than 100%. That leaves a lot of uncertainty to reduce when I start measuring page views.

One way to start reducing my uncertainty about these percentage increases is to look at the topic's past page views, to understand its regular fluctuation over time. Looking at the past 3 months, week by week, I might discover that 5% is too low to be meaningful, and that a more reasonable signifier of success would be a 10% or higher increase in page views.

Estimating gives me a number that I'm attempting to reduce uncertainty about, and performing that initial historical measurement can already reduce some of it. Now I can be 100% certain that a successful change to the topic should show more than a 5% week-over-week increase in page views, and maybe 80% certain that a successful change would show a 10% or greater increase.
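
As a sketch of that historical baseline check, assuming a CSV of weekly page view totals (file and column names invented for illustration):

    import pandas as pd

    weekly = pd.read_csv("weekly_views.csv")  # columns: week, views
    weekly["pct_change"] = weekly["views"].pct_change() * 100

    # Typical week-over-week fluctuation, to judge whether 5% or 10% is meaningful.
    print(weekly["pct_change"].describe())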

When doing this, keep in mind another of Hubbard's points:

“a persistent misconception is that unless a measurement meets an arbitrary standard….it has no value….what really makes a measurement of high value is a lot of uncertainty combined with a high cost of being wrong.”

If you’re choosing to undertake a large-scale project that will cost quite a bit if you get it wrong, you likely want to know in advance how to measure the success of that project. This point also underscores his continued emphasis on reducing uncertainty.

For my (admittedly mild) example, it isn't valuable for me to declare that I can't learn anything from page view data unless 3 months have passed. I can likely reduce uncertainty enough with two weeks of data to learn something valuable, especially if my level of certainty is relatively low (in this example, in the 40-70% range).

Measure just enough, not a lot

Hubbard talks about the notion of a Rule of Five:

There is a 93.75% chance that the median of a population is between the smallest and largest values in any random sample of five from that population.

Knowing the median value of a population can go a long way in reducing uncertainty. Even a seemingly tiny sample can be incredibly valuable for reducing uncertainty about a likely value. You don't have to know all of something to know something important about it.
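
The arithmetic behind that 93.75% is simple: each randomly sampled value independently has a 1/2 chance of falling below the median, so the chance that all five land on the same side of it is 2 × (1/2)^5 = 1/16 = 6.25%, leaving a 93.75% chance that the median sits between the sample's minimum and maximum. A quick simulation in Python bears this out:

    import random

    # Simulate the Rule of Five against a known population median.
    population = list(range(10_000))
    median = 4999.5
    trials = 100_000
    hits = sum(
        min(s) < median < max(s)
        for s in (random.sample(population, 5) for _ in range(trials))
    )
    print(hits / trials)  # approximately 0.9375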

Do something with what you’ve learned

After you perform measurements or do some data analysis and reduce your uncertainty, it's time to do something with what you've learned. Given my example, maybe my rewrite increased page views of the topic by 20%, an increase I'm now fairly certain is significant, and the topic now ranks higher in search results. I've sufficiently reduced my uncertainty about whether the changes made this topic more useful, and I can rewrite similar topics to use a similar content pattern with confidence. Or at least, with more confidence than I had before.

Overall summary

My super abbreviated summary of the book would then be to do the following:

  1. Start by decomposing the problem
  2. Ask is it business critical to measure this?
  3. Estimate what you think you’ll measure
  4. Measure just enough, not a lot
  5. Do something with what you’ve learned

I recommend the book (with judicious skimming), especially if you need some conceptual discussion to help you unravel how best to measure a specific problem. As I read the book, I took numerous notes about how I might be able to measure something like support case deflection with documentation, or how to prioritize new features for product development (or documentation). I also considered how customers might better be able to identify valuable data sources for measuring security posture or other events in their data if they followed many of the practices outlined in this book.

Planning and analyzing my concert attendance with Splunk

This past year I added some additional datasets to the Splunk environment I use to analyze my music: information about tickets that I’ve purchased, and information about upcoming concerts.

Ticket purchase analysis

I started keeping track of the tickets that I've purchased over the years, which gave me good insights about the ticket fees associated with specific ticket sites and concert promoters.

Based on the data that I’ve accumulated so far, Ticketmaster doesn’t have the highest fees for concert tickets. Instead, Live Nation does. This distinction is relatively meaningless when you realize they’ve been the same company since 2010.

However, the ticket site isn't the strongest indicator of fees, so I decided to split the data further by promoter, to see whether specific promoters had higher fees than others.

Based on that data, you can see that the one show I went to that was promoted by AT&T had fees of nearly 37%, and that shows promoted by Live Nation (through their evolution and purchase by Ticketmaster) had fees around 26%. Shows promoted by independent venues have somewhat higher fees than others, hovering around 25% for 1015 Folsom and Mezzanine, while shows promoted by organizations whose only purpose is promotion tend to have slightly lower fees, such as Select Entertainment with 18%, Popscene with 16.67%, and KC Turner Presents with 15.57%.

I realized I might want to refine this, so I recalculated this data, limiting it to promoters from which I’ve bought at least two tickets.

It's a much more even spread in this case, ranging from 11% to 25% in fees. However, you can see that the same patterns exist: for the shows I've bought tickets to, the independent venues average 22-25% in fees, while dedicated independent promoters add 16% or less, with corporate promoters like Another Planet, JAM, and Goldenvoice filling the middle of the range at 18% to 22%.
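
My actual analysis ran in Splunk, but as a hedged sketch, the same refinement looks like this in pandas (the CSV and column names are stand-ins):

    import pandas as pd

    tickets = pd.read_csv("tickets.csv")  # columns: promoter, face_value, fees
    tickets["fee_pct"] = tickets["fees"] / tickets["face_value"] * 100

    # Average fee percentage per promoter, limited to promoters with 2+ tickets.
    by_promoter = tickets.groupby("promoter")["fee_pct"].agg(["mean", "count"])
    print(by_promoter[by_promoter["count"] >= 2].sort_values("mean", ascending=False))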

I also attempted to determine how I’m discovering concerts. This data is entirely reliant on my memory, with no other data to back it up, but it’s pretty fascinating to track.

It's clear that Songkick has become a vital service in my concert-going planning, helping me discover 46 shows, with friends and venue email newsletters keeping me in the know for 19 and 14 shows respectively. Social media contributes as well: a Facebook community (raptors) and Instagram account for 10 and 2 discoveries respectively.

Concert data from Songkick

Because Songkick is so vital to my concert discovery, I wanted to amplify the information I get from the service. In addition to tracking artists on the site, I wanted to proactively gather information about artists coming to the SF Bay Area and compare that with my listening habits. To do this, I wrote a Songkick alert action in Python to run in Splunk.

Songkick does an excellent job for the artists that I’m already tracking, but there are some artists that I might have just recently discovered but am not yet tracking. To reduce the likelihood of missing fast-approaching concerts for these newly-discovered artists, I set up an alert to look for concerts for artists that I’ve discovered this year and have listened to at least 5 times.

To make sure I’m also catching other artists I care about, I use another alert to call the Songkick API for every artist that is above a calculated threshold. That threshold is based on the average listens for all artists that I’ve seen live, so this search helps me catch approaching concerts for my historical favorite artists.

To be honest, I also did this largely so that I could learn how to write an alert action in Splunk software. Alert actions are essentially bits of custom Python code that you can dispatch with the results of a search in Splunk. The two alert examples I gave are both saved searches that run every day and update an index. I built a dashboard to visualize the results.

I wanted to use log data to confirm which artists were being sent to Songkick with my API requests, even if no events were returned. To do this, I added a logging statement to my Python code for the alert action, and then visualized the log statements (with the help of a lookup to match the artist_mbid with the artist name) to show which artists had no upcoming concerts at all, or no SF concerts.
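
The logging itself can be as simple as one line per API request in the alert action script. This is an illustrative sketch rather than my exact code; in my experience, output that a custom alert action writes to stderr ends up in splunkd's logs, where it can be searched like any other data.

    import logging
    import sys

    # stderr from a custom alert action typically lands in splunkd.log.
    logging.basicConfig(stream=sys.stderr, level=logging.INFO)
    logger = logging.getLogger("songkick_alert")

    def log_artist_request(artist_mbid, event_count):
        # One log line per artist sent to the Songkick API, even when no events return.
        logger.info("songkick_request artist_mbid=%s events_returned=%d",
                    artist_mbid, event_count)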

For those artists without concerts in the San Francisco Bay Area, I wanted to know where they were going instead, so that I could identify possible travel locations for the future.

It seems like Paris is the place to be for several of these artists—there might be a festival that LAUER, Max Cooper, George Fitzgerald, and Gerald Toto are all playing at, or they just happen to all be visiting that city on their tours.

I’m planning to publish a more detailed blog post about the alert action code in the future on the Splunk blogs site, but until then I’ll be off looking up concert tickets to these upcoming shows….