Just Add Data: Using data to prioritize your documentation

This is a blog post adaptation of a talk I gave at Write the Docs Portland on May 21, 2019. The talk was livestreamed and recorded, and you can view the recording on YouTube: Just Add Data: Make it easier to prioritize your documentation – Sarah Moir

Prioritizing documentation is hard. How do you decide what to work on if there isn’t a deadline looming? How do you decide what not to work on when your list of work just keeps growing? How do you identify what new content you might want to add to your documentation?

By adding data to the process, it’s possible to prioritize your documentation tasks with confidence!

Prioritizing without data

Prioritizing a backlog without data can involve asking yourself some questions, like what will take the least amount of time? Or, what did someone most recently request? If I’m doing this, I might ask my product manager what to work on, or do whatever task seems easiest at the time. I might even focus on whichever task I can complete without talking to other people, because I’m tired. 

Based on the answers to those questions, I’ll end up with a prioritized backlog, but lack confidence that what I’ve chosen to work on will actually bring the most value to customers and the documentation. Choosing not to do work is particularly hard: it can be a challenge to keep ignoring an item in the backlog because it doesn’t fit with what I think I need to be working on, especially without some sort of “proof” that it’s okay to ignore. To make this process easier, I add data.

Why prioritize with data?

Using data to prioritize a documentation backlog can help give you more confidence in your decisions and help you justify why you’re not working on something. It can challenge your assumptions about what you should be working on, or validate them. Adding data can help improve your overall understanding of how customers are using your product and the documentation, leading to benefits beyond the backlog.

Data types for prioritization

What kinds of data am I talking about? All kinds of data! If you skim the following list, you’ll notice that this data goes beyond quantitative sources. When I talk about data, I’m including all kinds of information. Qualitative comments, usage metrics, metadata, website access logs, survey results, database records: all of these and more fit my definition of data. Here’s the full list:

  • User research reports
  • Support cases
  • Forum threads and questions
  • Product usage metrics
  • Search strings
  • Tags on bugs or issues
  • Education/training course content and questions
  • Customer satisfaction surveys
  • Documentation feedback
  • Site metrics
  • Text analysis metrics
  • Download/last accessed numbers
  • Topic type metrics
  • Topic metadata
  • Contribution data
  • Social media analytics

Some of these data types are more relevant to different types of organizations and documentation installations. For example, open source projects might have more useful issue tags, or organizations that use DITA will have easier access to topic type information.

This list demonstrates the different types of information that can help you prioritize documentation, but you don’t need large-scale collections or implementations to get valuable data worth incorporating into your prioritization process.

I’ll cover a couple of these data types in more detail here, but I talk about all of them in another post: Detailed data types you can use for documentation prioritization.

Product usage data

You can use usage data for products (also called telemetry) to find out where people are spending their time. What features or functionality are they using? Even if they’ve purchased or installed the product, are they actually using it?

Some examples of product usage data include:

  • Time in product
  • Intra-product clicks
  • Types of data ingested
  • Types of content created (e.g., dashboards, playlists)
  • Amount of content created (e.g., dashboards, playlists)

In addition to data about how people are interacting with the product, you can also gather product usage data without actual introspection into how people are using it. If you have information about how many people have downloaded a product or are logging in to a service, you can use data like the following:

  • Number of downloads and installs
  • License activations and types
  • Daily and monthly active users

I mostly talk about using data to help you prioritize the more ambiguous parts of a backlog that might not be tied to a release, but product usage data in particular can help you better prioritize release-focused documentation as well. If your product is in beta, you can use product usage data to understand where people are spending more of their time. From there, you can draw conclusions about what to spend more or less time on, or what level of detail to include in the documentation, to achieve your overall documentation goals for the release.

Site metrics

Site metrics like page views, session data, HTTP referer data, and link clicks can help you understand where people are coming to your docs from, how long they’re staying on the page, how many readers there are, and what they’re doing after they get to a topic. Here are some example site metrics:

  • Page views
  • Session data like time on page
  • Referer data
  • Link clicks
  • Button clicks
  • Bounce rate
  • Client IP

You can also use this data to understand better how people interact with your documentation, like whether they’re using a version switcher on your page or expanding/collapsing more information hidden on the page. 

You can also split this data by IP address to understand groups of topics that specific users are clustering around, to better understand how people use the documentation.
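As a rough illustration of splitting page views by IP address, here is a minimal Python sketch. The record format, the IP addresses, and the topic names are all made up for the example; real access logs would need parsing first, and an IP address is only a loose stand-in for a reader.

```python
from collections import defaultdict

# Hypothetical page-view records: (client_ip, topic) pairs.
page_views = [
    ("203.0.113.5", "install-guide"),
    ("203.0.113.5", "configure-inputs"),
    ("198.51.100.7", "install-guide"),
    ("198.51.100.7", "configure-inputs"),
    ("198.51.100.7", "release-notes"),
    ("192.0.2.44", "release-notes"),
]


def topics_by_ip(views):
    """Map each client IP to the set of topics it viewed."""
    grouped = defaultdict(set)
    for ip, topic in views:
        grouped[ip].add(topic)
    return dict(grouped)


def co_viewed_pairs(views):
    """Count how many distinct IPs viewed each pair of topics."""
    pair_counts = defaultdict(int)
    for topics in topics_by_ip(views).values():
        ordered = sorted(topics)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pair_counts[(first, second)] += 1
    return dict(pair_counts)


pairs = co_viewed_pairs(page_views)
for pair, count in sorted(pairs.items(), key=lambda item: -item[1]):
    print(pair, count)
```

Pairs of topics that many IPs visit together hint at groups of topics that readers use as a set, which might be worth linking or restructuring.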

Identify questions based on your backlog

The process of adding data to your documentation prioritization strategy is all about making do with what you have to answer what you want to know. What you want to know depends on your backlog.

Data analysis is focused on a goal. You don’t want to collect a lot of data and then just stare at it, or get stressed out by the amount of “insights” that you could be gathering but meanwhile you’re not really sure what to do with the information. If you consider questions that you want to answer in advance, you can focus your data collection and analysis in a more valuable way. 

Some example questions that you might identify based on your task list:

  • What are people looking for? Are they finding what they’re looking for?
  • Are people looking for information about <thing I’ve been told to document>?
  • What do people want more help with?
  • What people are we targeting that don’t see their use cases represented?

Tie questions to data types

After you’ve identified questions relevant to your task list, you can tie those questions to data types that can help you answer the questions.

For example, the question: What are people looking for and not finding?

To answer this, you can examine the places where people look for information, namely search keywords that they’re typing into search engines, common questions being posted on forums, or the topics of support cases filed by customers.

For example, I looked at some data and was able to identify specific search terms people are using on the documentation site that are routing customers to a company-managed forum site.  I can then use that data to identify cases where people are looking for documentation about something, but are not finding the answers in the documentation.

Another example question: What do people want more help with? 

This could be answered by looking at the topics of support cases again, but also the types of questions being asked in training courses, as well as unanswered questions on forums. 

As a final example: What market groups are we targeting that don’t see their use cases represented?

To answer this, you could look at data about sales leads, questions being asked by the field that contain specific use cases for various market verticals, as well as questions being asked in training courses.

Find questions from data

If you don’t have much of a task list to work with, or if you aren’t able to get access to data that can help you answer your questions, you can still make use of the data that is available to you and draw valuable insights from it.

You can identify interest in content that you maybe weren’t aware of, and make plans to write more to address that interest, or modify existing content to address that interest. Maybe there are a bunch of forum threads about how to do something, but nothing authoritative in the documentation. That information hasn’t made it to the docs writers in any way, but because you’re looking at the available data, you’re able to see that it’s important.

Even if you have no data specifically relevant to the documentation or customer questions, you can still find ways to identify documentation work to add to a task list. You could create datasets by performing text analysis on all or specific documentation topics, and identify complexity issues, or topics that don’t adhere to a style guide. You could use customer satisfaction surveys to identify places where documentation architecture or linking strategies could be improved.
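To make the text analysis idea concrete, here is a small Python sketch that uses average sentence length as a crude complexity signal. The topic names, the sample text, and the threshold of 20 words are assumptions for illustration, not a recommended standard, and the sentence splitting is deliberately naive.

```python
import re


def average_sentence_length(text):
    """Return the mean number of words per sentence (naive splitting)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(word_counts)


# Hypothetical topics and text, standing in for real documentation content.
topics = {
    "quick-start": "Install the app. Open it. Add your data.",
    "advanced-config": (
        "To configure the forwarder so that it sends only the events that "
        "match the filters that you define in the settings file, edit the "
        "file and restart the service."
    ),
}

MAX_AVERAGE_WORDS = 20  # assumed threshold; tune it for your own style guide

flagged = [
    name
    for name, text in topics.items()
    if average_sentence_length(text) > MAX_AVERAGE_WORDS
]
print(flagged)
```

A list of flagged topics like this is a starting point for a task list, not a verdict. Some topics need long sentences, and a human read is still the real test.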

Working with the data

Now you hopefully have a better understanding of different types of data available to you, and how you can identify valuable data sources based on your questions that you want to answer. But how much data do you need to collect? And how do you get the data? Most importantly, how do you analyze it to answer the questions you want to answer?

How much data?

How much data do you need to collect? You don’t need to collect data forever. You don’t need ALL the data. You just need enough data to point you in a direction and reduce uncertainty.

You can use a small sample of users, or a small sample of time, so long as it helps you answer your question and reduce uncertainty about what the answer could be. Collecting larger amounts of data doesn’t mean that you reduce uncertainty by an equally large degree. The amount of data you collect doesn’t correlate directly to what you’re able to learn from it. However, if the question you’re trying to answer with data concerns all the documentation users over a long period of time, you will be collecting more data than if you just want to know what a specific subset of readers found interesting on a Friday afternoon.

Try for representative samples that are relevant for the questions you’re trying to answer. If you can’t get representative data, try for a random sample. If you can’t get representative or random samples, acknowledge the bias that is inherent in the data you’re using. Add context to the data wherever possible, especially about who the data represents and why the data is still valuable if it isn’t representative.
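If you go the random sample route, a few lines of Python are enough. This sketch assumes you already have a list of records to draw from; the support case IDs here are invented, and the fixed seed is only there so the same sample can be drawn again later.

```python
import random


def sample_records(records, sample_size, seed=None):
    """Return a random sample without replacement.

    If fewer records exist than requested, return them all.
    """
    rng = random.Random(seed)
    if sample_size >= len(records):
        return list(records)
    return rng.sample(records, sample_size)


# Hypothetical dataset: 200 support case identifiers.
support_cases = [f"case-{number:03d}" for number in range(1, 201)]

sample = sample_records(support_cases, 20, seed=42)
print(len(sample), "cases to review")
```

A random sample is only random with respect to the list you start from. If that list already excludes some customers, note that bias alongside the results.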

You might find that collecting a small amount of data leaves you with more questions than answers, and that’s okay too. It’s an opportunity to continue exploring and learning more about your customers and your documentation tasks. But how do you even get any data at all?

How do you get the data?

You’ll either be collecting your own data, or asking others for the data you need.

If it’s data about the documentation site or its content, you might own that data yourself, and already have access to it. If it’s other types of data, like sales leads or user research data, it’s time to talk to the departments or people that manage those areas.

  • A business development department might have reporting on internal tools like sales leads or support cases.
  • Product managers can share direct customer data and product usage data if you don’t have direct access.
  • Project managers can share data related to internal development processes.

The teams managing different datasets will vary at your organization, and might even be you in many cases. They may be reluctant to share data. With that in mind, remember that when you collect data, you don’t need to get persistent access to all the data you want. Focus on getting some access to some data that is useful to answer your questions. After that, you can use that data to make your work more efficient and informed, and then hopefully communicate that value and get more access to data in the future if you want.

What to use for data analysis?

What do you use to analyze that data after you get it? How do you transform data into a report of useful information?

Some tools might already have analytics and reporting built in, like Google Analytics. That can certainly make it easier to analyze the data!

For other types of data that you need to analyze yourself, use the tools available to you. Think about what you already know how to use, or have access to:

  • Know how to use Excel? Perfect! Get started collecting and processing data in spreadsheets and with macros.
  • Know how to write scripts in R/Python to analyze data? Great! You can write scripts to collect, process, and visualize this data.
  • Is your organization using a tool like Splunk, Elasticsearch, Tableau, etc.? Good news! You’re ready for data analysis.

You don’t have to spend a long time learning a new tool to analyze data for these purposes. If you continue incorporating data analysis into your work, it might make sense, but it isn’t necessary to get started.

Tools aren’t magic

It’s also important to note that tools aren’t magic. Some degree of data analysis will involve manual collecting, categorizing, or cleaning of the data. If your organization doesn’t have strict topic types, you might need to perform manual topic-typing. If you want to analyze some information but the data isn’t in a machine-readable format, you might have to sit at your desk copying and pasting for hours.

Depending on your skills, the current state of the data that you want to analyze, and the tools available to you, the amount of time it takes to analyze data and get results can vary widely. I have spent 3 days manually processing data in Excel, and I’ve spent 2 hours creating searches in mostly-clean datasets in Splunk to get answers to various questions. Keep that in mind when you’re analyzing data.

How to perform data analysis

When you analyze data, what are you actually looking at? 

Top, rare, outlying values

Find out what values are most common, and which values are least common. Those can be established by counting the various instances of values.

Look for values that differ from the others by a large margin. One way to do this is to calculate the standard deviation and flag values that fall several standard deviations from the mean.
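Here is a minimal Python sketch of finding top, rare, and outlying values using only the standard library. The page-view numbers are invented, and the cutoff of two standard deviations is an assumption to adjust for your own data.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical page views per topic.
page_views = {
    "install": 120,
    "configure": 110,
    "upgrade": 95,
    "troubleshoot": 105,
    "cron-syntax": 900,
    "glossary": 100,
}


def top_and_rare(counts, n=1):
    """Return the n most common and n least common (name, count) pairs."""
    ranked = Counter(counts).most_common()
    return ranked[:n], ranked[-n:]


def outliers(counts, threshold=2.0):
    """Return names whose count is more than `threshold` standard
    deviations away from the mean."""
    values = list(counts.values())
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []
    return [
        name
        for name, value in counts.items()
        if abs(value - center) / spread > threshold
    ]


top, rare = top_and_rare(page_views)
print("top:", top)
print("rare:", rare)
print("outliers:", outliers(page_views))
```

With only a handful of values, one extreme value inflates the standard deviation itself, so the threshold is a judgment call rather than a rule.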

Patterns and clusters across data

You can also look for patterns and clusters in your data.

If you’re working with qualitative data, you might need to categorize, or code, the data so that you can sort it and look for patterns in the results. You can identify these patterns by counting instances of categories, or looking at clusters of behavior. An example of a cluster of behavior is if you look at documentation topic visits over time, and you identify a spike in visits at a particular time.

Split by different features

You also want to segment data by different features. Meaning, you can better understand the most common values if you split them by other types of information. For example, you can look at the most commonly visited topics in your documentation set over the last 3 months, or you can look at the most commonly visited topics in your documentation over the last 3 months, but on a week-to-week basis. That additional split can help you understand how those values are changing over time. If you identify a spike in a particular topic or category of topics, you can then interpret the data. Maybe a new product release led to a spike of interest in the release notes topic that wasn’t easily identified until you split the results by week. This is also a good opportunity to point out to a product team that people really do read your documentation!

That’s an example of splitting by time, but you can split by any other field available to you in your data. Sticking with the same data type, looking at the most common topics by product, by IP address, or by other factors can lead to valuable insights.
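The week-by-week split described above could be sketched in Python like this. The visit records and topic names are hypothetical, and grouping by ISO week is just one reasonable choice of time bucket.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical topic visits: (date of visit, topic) pairs.
visits = [
    (date(2019, 5, 6), "release-notes"),
    (date(2019, 5, 7), "install"),
    (date(2019, 5, 8), "install"),
    (date(2019, 5, 13), "release-notes"),
    (date(2019, 5, 14), "release-notes"),
    (date(2019, 5, 15), "release-notes"),
    (date(2019, 5, 16), "install"),
]


def visits_per_week(records):
    """Count visits per topic, grouped by (ISO year, ISO week)."""
    weekly = defaultdict(Counter)
    for day, topic in records:
        iso = day.isocalendar()
        weekly[(iso[0], iso[1])][topic] += 1
    return weekly


weekly = visits_per_week(visits)
for week in sorted(weekly):
    print(week, weekly[week].most_common())
```

Over the whole period the two topics look similar, but the weekly split shows the release notes topic jumping in the second week. That is the kind of spike that is easy to miss in a single total.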

Combine data types

You can combine different types of data to understand approximately how many people are using the product vs how many of them are using the documentation. Comparing sales leads, product usage data, and existing page views could help you approximate the number of potential and existing customers, alongside the number of distinct documentation readers.

Make sure that when you combine data across datasets, you keep track of units and time ranges, and make sure that you compare like data with like data. For example, be careful not to combine data that refers to potential customers with data that refers to existing customers, because the results could be misleading if you don’t keep context with the data.
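One way to keep track of units and time ranges is to store them with each number and refuse to compare values that don’t match. This Python sketch is only an illustration of that idea; the Metric type, the field names, and the figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A single number that carries its own context."""
    name: str
    value: int
    unit: str    # what is being counted, e.g. "distinct users"
    period: str  # the time range covered, e.g. "2019-04"


def ratio(numerator, denominator):
    """Divide two metrics, but only if they measure like with like."""
    if numerator.unit != denominator.unit:
        raise ValueError(
            f"unit mismatch: {numerator.unit!r} vs {denominator.unit!r}"
        )
    if numerator.period != denominator.period:
        raise ValueError(
            f"period mismatch: {numerator.period!r} vs {denominator.period!r}"
        )
    return numerator.value / denominator.value


# Hypothetical figures for one month.
doc_readers = Metric("doc readers", 500, "distinct users", "2019-04")
active_users = Metric("active product users", 2000, "distinct users", "2019-04")
sales_leads = Metric("sales leads", 300, "potential customers", "2019-04")

print(ratio(doc_readers, active_users))

try:
    ratio(doc_readers, sales_leads)
except ValueError as error:
    print("not comparable:", error)
```

A spreadsheet with dedicated unit and date-range columns achieves the same thing. The point is that the context travels with the number.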

Interpreting results

When you interpret the results of your data analysis, make sure that you are adding context to the data. Especially when dealing with outlier data, but even when reviewing data like rarely-viewed or frequently-viewed topics, keep in mind additional context that could explain results.

Add context from expertise

Use your expertise and knowledge of the documentation to add context. For example, topics concerning a specific functionality are likely to be more popular at a specific time if that functionality was recently changed.

Pursue alternate explanations

Whenever you’re interpreting data, you want to make sure that you’re gut-checking it against what you already know. So if a relatively mundane topic has wildly out-of-the-ordinary page views, there are likely alternate explanations for that interest. Maybe your topic ended up being a great resource about cron syntax in general, even for people that don’t use your product.

Draw realistic conclusions

Draw realistic conclusions based on the data available to you. You might not be able to get access to or combine specific datasets due to privacy concerns. If you carefully identify what problems you’re trying to solve, and select only the data sources that can help you solve those problems, you can reduce the potential that you’ll introduce bias into your data analysis, and improve the conclusions that you’re able to draw.

Don’t trust data blindly

Don’t trust the data blindly. When reviewing data that seems out of the ordinary or like outliers, examine the different reasons why the data could be like that. Who does the data represent? What does it represent? Make sure that you’re interpreting data in context, so that you’re able to understand exactly what it represents. It can be tempting to ignore data that doesn’t match your biases or expectations.

Above all, remember to use data to complement your research and writing, and validate or challenge assumptions about your audience.

Your turn to add data

  1. Identify the questions you’re trying to answer
  2. Use the data available to you
  3. Use the tools available to you
  4. Analyze and interpret the data
  5. Take action and prioritize accordingly

Additional resources

So you want to be a technical writer

If you’re interested in becoming a technical writer, or are new to the field and want to deepen your skills and awareness of the field, this blog post is for you.

What do technical writers actually do?

Technical writers can do a lot of different things! People in technical writing write how-to documentation, craft API reference documentation, create tutorials, even provide user-facing text strings to engineers.

Ultimately, technical writers:

  • Research to learn more about what they are documenting.
  • Perform testing to verify that their documentation is accurate and validate assumptions about the product.
  • Write words that help readers achieve specific learning objectives and that capture what the writer has learned in the research and testing processes.
  • Initiate reviews with engineers, product managers, user experience designers, quality assurance testers, and others to validate the accuracy, relevancy, and utility of the content.
  • Advocate for the customer or whoever uses the product or service being documented.

The people reading what technical writers have produced could be using software they’ve purchased from your company, evaluating a product or service they are considering purchasing, undergoing a required process controlled by your organization, writing code that interfaces with your services, configuring, installing, or modifying hardware produced by your company, or even reviewing the documentation for compliance and certification purposes. Your goal, if you choose to accept it, is to help them get the information they need and get back to work as soon as possible.

Identify what you want from your career

Some general career-assessment tips:

  • Identify what motivates you and what challenges you.
  • Identify what type of team environment you want. These are loose descriptions of types of team environments that are out there:
    • A large highly-collaborative team with lots of interaction
    • A distributed team that is available for questions and brainstorming as needed, but largely everyone is working on their own thing.
    • A small team that collaborates as needed.
    • A team of one, it’s just you, you are the team.

Is technical writing a good fit for you?

  • Do you enjoy explaining things to other people?
  • Do people frequently ask you to help explain something to them?
  • Do people frequently ask you to help them revise content for them?
  • Do you care or enjoy thinking about how to communicate information?
  • Do you identify when things are inconsistent or unclear and ask people to fix it? (Such as in a UI implementation, or when reviewing a pull request)
  • Do you enjoy problem-solving and communication?
  • Do you like synthesizing information from disparate sources, from people to product to code to internal documentation?
  • Do you enjoy writing?

My background and introduction to technical writing

I started in technical support. In college I worked in desktop support for the university, wandering around campus or in the IT shop, repairing printers, recovering data from dying hard drives, running virus scans, and updating software. After graduation I eventually found a temp job working phone support with the University of Michigan, managing to turn that position into a full-time permanent role and taking on two different queues of calls and emails. However, after a year I realized that it was super exhausting to me. I couldn’t handle being “on” all day, and I found myself enjoying writing the knowledge base articles that would record solutions for common customer calls. I had written fifty of them by the time I discovered a posting for an associate-level documentation specialist.

I managed to get that position, and transferred over to work with a fantastic mentor that taught me a ton about writing and communicating. After a few years in that position, writing everything from communication plans (and the accompanying communications) to technical documentation and a couple of video scripts, I chose to move to California. With that came another round of job hunting, and the realization that there are a lot of different job titles that technical writing can fall under: UI writer, UI copywriter, technical writer, documentation specialist, information developer… I set up job alerts, and ended up applying, interviewing, and accepting an offer for a technical writing position at Splunk. I’ve been at Splunk for several years now, and recently returned to the documentation team after spending nearly a year working in product management.

Where people commonly go to technical writing from

Technical writers can get their start anywhere! Some people become technical writers right out of college, but others transition to it after their career has already begun.

As a technical writer, your college degree doesn’t need to be in technical writing, or even a technical-specific or writing-specific field. I studied international studies, and I’ve worked with colleagues that have studied astronomy, music, or statistics. Others have computer science or technical communication degrees, but it’s not a requirement.

For people transitioning from other careers, here are some common starting careers:

  • Software developers
  • UX practitioners
  • Technical support

That’s obviously a short list, but again if you care about the user and communication in your current role, that background will help you immensely in a technical writing position.

Prepare for a technical writing interview

Prepare a portfolio of writing samples

Every hiring manager wants to see a collection of writing samples that demonstrate how you write. If you don’t work in technical writing yet, you might not have any. Instead, you can use:

  • Contributions you’ve made to open source project documentation. For example, commits to update a README: https://github.com/yahoo/gryffin/pull/1
  • How-to processes you’ve written. For example, instructions for performing a code review or a design review.
  • A blog post about a technical topic that you are familiar with. For example, a post about a newly-discovered functionality in CSS.
  • Basic task documentation about software that you use. For example, write up a sample task for how to create a greeting card in Hallmark Card Studio.

Your portfolio of writing samples demonstrates to hiring managers that you have writing skills, but also that you consider how you organize content, how you write for a specific audience, and the level of detail that you include based on that audience. The samples that you use don’t have to be hosted on a personal website and branded accordingly. The important thing is to have something to show to hiring managers.

Depending on the interviewer, you might perform a writing exercise in-person or as part of the screening process. If you don’t have examples of writing like this, that’s a good reason to track down some open source projects in need of some documentation assistance!

Learn about the organization and documentation

Going into the interview, make sure you are familiar with the organization and its documentation.

  • Read up about the organization or company that you are interviewing with. If you can, track down a mission statement for the organization.
  • Find the different types of documentation available online, if possible, and read through it to get a feel for what the team might be publishing.
  • If the organization provides a service or product that you’re able to start using right away, do that!

All of these steps help you better understand how the organization works and what the team you might be working on is producing, and they demonstrate to the interviewer that you are motivated to understand what the role and the organization are about. Not to mention, this makes it clear that you have some of the necessary skills a technical writer needs when it comes to information-gathering.

Questions you might want to ask

Find out some basic team characteristics:

  • How many other technical writers are at the organization?
  • What org are the technical writers part of?
  • Is there a central documentation team or are the writers scattered across the organization?
  • How distributed is the documentation team and/or the employees at the organization?

Learn about the documentation process and structure:

  • What does the information-development process look like for the documentation? Does it follow semi-Agile methods and get written and researched as part of the development team, or does information creation follow a more waterfall style, where writers are delivered a finished product and expected to document it? Or is it something else entirely?
  • Are there editors or a style guide?
  • Do the writers work directly with the teams developing the product or service?
  • What sort of content management system (CMS) is in use? Is it structured authoring? A static-site generator reliant on documentation files written in markdown stored next to the code? A wiki? Something else?

Find out how valuable documentation is to the organization:

  • Do engineers consider documentation vital to the success of the product or service?
  • Do product managers?
  • Do you get customer feedback about your documentation?
  • What is the goal of documentation for the organization?

Some resources for getting started with technical writing

Books to read

These books cover technical writing principles, as well as user design principles. None of these links are affiliate links, and the proceeds of the book I helped author go to charity.

  • The Product is Docs by Christopher Gales and the Splunk documentation team
    • Yes, I helped.
  • Every Page is Page One by Mark Baker
    • This book is a great introduction and framework for writing documentation for the web.
  • Developing Quality Technical Information by Michelle Carey, Moira McFadden Lanyi, Deirdre Longo, Eric Radzinski, Shannon Rouiller, and Elizabeth Wilde.
    • This book is a great resource and reference for detailed writing guidance, as well as information architecture.
  • Design of Everyday Things by Don Norman
    • The classic design book covers user-focused principles that are crucial to writing good documentation.

This is an intentionally short list featuring books I’ve found especially useful. You can also consider reading Scenario-Focused Engineering: A toolbox for innovation and customer-centricity, Nicely Said: Writing for the Web with Style and Purpose, Content Everywhere: Strategy and Structure for Future-Ready Content, Design for How People Learn, and Made to Stick: Why Some Ideas Survive and Others Die.

Articles and blogs about technical writing

I like following resources in RSS feeds to get introduced to good thinking about technical writing, but not all good content is new content! Some great articles that have helped me a lot:

Blogs to follow (intermittently updated)

Great articles about technical writing

Other web resources

Twitter is a great resource for building a network of people that care about documentation. If you use it, I recommend searching for people who commonly tweet with #writethedocs.

Write the Docs is a conference and community founded by Eric Holscher and maintained by a brilliant set of volunteers!

The Write the Docs Slack workspace is fairly active, and includes channels for job postings and career advice, as well as current discussions about trends and challenges in the technical writing world.

Some talks from the conference I recommend checking out are visible on YouTube:

There are playlists for 2018 (which I did not attend) and earlier years as well on YouTube, so dig around there and find some more resources too if watching videos is useful to you!