Technical Writer Application for 2020 Season of Docs

Photo by Kaitlyn Baker on Unsplash

Documentation is essential to the adoption of open source projects as well as to the success of their communities. Season of Docs brings together technical writers and open source projects to foster collaboration and improve documentation in the open source space.

https://opensource.googleblog.com/2020/06/season-of-docs-now-accepting-technical.html

Open source organization for proposal

Bokeh

Title of technical writing project proposal

Improving the Documentation Experience for Bokeh Developers

Description of technical writing project proposal

Current documentation state

Bokeh has done a tremendous job in documenting visualization use cases in the User Guide [1]. In the Reference [2], you can find all the API methods afforded by their models. The documentation has grown large and there is no easy way to find misspellings, repetition errors, or formatting issues in the text [3].

You can find dozens of code examples on how you might use Bokeh with your own data on GitHub [4]. You can find some of these examples inline in the documentation but not all of them are referenced [5]. Users may spend a considerable amount of time trying to figure out how a tool works without realizing there exists code they can reference. For example, you can use Themes to style a plot on Bokeh but these examples exist in the Reference when one would expect to find an example listed inline or referenced in the User Guide [6][7].

Lastly, a subset of the Bokeh documentation could benefit from the inclusion of metadata. Bokeh uses Sphinx [8], a tool that makes it easy to document software projects, to build its documentation. Sphinx does not automatically include any structured data on the HTML pages it generates. Metadata in this case means data that describes an HTML page, such as a short description of its content. When searching for “Bokeh docs” on a search engine, the results users get back do not describe the content of the page. When sharing links to the Bokeh documentation on social media sites or forums, there is no way to preview the content on the page before clicking on links.

Proposed documentation state

Automated checks for spelling, repetition, and writing style errors

Vale Linter [9] is available as a GitHub Action [10]. It checks for spelling, repetition, and styling issues on every pull request. This Action can be added to the existing build process Bokeh uses for pull requests on GitHub. Automated checks would find existing errors in the documentation to fix and would prevent future errors from creeping in. Vale Linter can also enforce a consistent writing style across all documentation. For example, it can suggest the term "JavaScript" over "Javascript" and prefer active voice over passive voice.
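To make these checks concrete, here is a toy Python sketch of two of them: repeated words and discouraged terms. It only illustrates the idea and is not how Vale is implemented or configured. The function names and the PREFERRED_TERMS mapping are hypothetical.

```python
import re

# Hypothetical mapping of discouraged spellings to preferred ones.
PREFERRED_TERMS = {'Javascript': 'JavaScript'}

# A word followed by whitespace and the same word again, e.g. "the the".
REPEATED_WORD = re.compile(r'\b(\w+)\s+\1\b', re.IGNORECASE)


def find_repeated_words(text):
    """Return each word that appears twice in a row."""
    return [match.group(1) for match in REPEATED_WORD.finditer(text)]


def find_discouraged_terms(text, preferred=PREFERRED_TERMS):
    """Return (found, suggestion) pairs for discouraged spellings in text."""
    hits = []
    for term, suggestion in preferred.items():
        # The match is case-sensitive so the preferred spelling is not flagged.
        if re.search(r'\b{}\b'.format(re.escape(term)), text):
            hits.append((term, suggestion))
    return hits
```

A real setup would express the same rules in Vale's own style files so they run on every pull request.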

Additional cross-referencing across docs

Different parts of the documentation should link back and forth for a more complete discussion. Users interested in learning more about a topic should be able to navigate to the Reference from the User Guide. Users interested in seeing an example of an API method should also be able to navigate to the User Guide from the Reference. All examples found in the GitHub repository should either be referenced or exist inline in the documentation.

Metadata across docs

Search engines extract and collate the metadata found on web pages to describe and classify them. Including metadata, such as descriptions, in the Bokeh documentation would give users more data when browsing search engine result pages. This metadata would also provide rich previews when sharing links to these pages. Some metadata would appear alongside these links, giving readers a preview of the content before clicking. Specifying HTML metadata, like a description, can be done by manually adding the "meta" directive on some pages. Later, Sphinx extensions can be developed to automate adding relevant metadata throughout the entire documentation.

Timeline

Pre-community bonding

  • Stay active as a contributor by tackling documentation issues
  • Start a friction log to keep track of areas of documentation needing improvements

Community bonding

  • Establish project requirements
  • Schedule a time to meet with mentors
  • Agree on method of providing progress and updates

Week 1

  • Set up and test Vale to check for existing spelling and repetition errors
  • Identify terms to ignore that cause spelling errors like http, Bokeh, JupyterLab, etc.
  • Add a new text file with list of terms to ignore when checking for spelling errors

Week 2 and Week 3

  • Identify suggested terms to use throughout documentation for consistency
  • Add a new style guide for suggested terms
  • Configure Vale to run on every pull request submitted to Bokeh

Week 4 and Week 5

  • Start working on improving cross-referencing across Bokeh documentation
  • Identify existing Bokeh examples not shown in-line in documentation
  • Link examples in the documentation to the source code location on GitHub

Week 6 and Week 7

  • Review topics covered in the User Guide
  • Identify topics to link to sections in the Reference

Week 8

  • Identify pages on https://bokeh.org/ and manually add metadata
  • Investigate existing Sphinx extensions that can be used to add metadata across docs

Week 9

  • Integrate existing Sphinx extension or develop a new Sphinx extension to automatically add metadata across docs

Week 10

  • Test Sphinx extension(s)

Week 11

  • Finish remaining tasks
  • Start working on Season of Docs project report

Week 12

  • Finish project report
  • Submit project report to Google

References

  1. User Guide - https://docs.bokeh.org/en/latest/docs/user_guide.html
  2. Reference - https://docs.bokeh.org/en/latest/docs/reference.html
  3. Documentation spelling and formatting - https://github.com/bokeh/bokeh/issues/8448
  4. Bokeh Examples - https://github.com/bokeh/bokeh/tree/master/examples
  5. Include example code of PolyEditTool and PolyDrawTool Docs - https://github.com/bokeh/bokeh/issues/9962
  6. Add mention of Themes to "Styling Visual Attributes" docs page - https://github.com/bokeh/bokeh/issues/9007
  7. Reference Guide should link to Users Guide where appropriate. - https://github.com/bokeh/bokeh/issues/9363
  8. Sphinx - https://www.sphinx-doc.org/en/master/
  9. Vale - https://github.com/errata-ai/vale
  10. Vale Linter - https://github.com/marketplace/actions/vale-linter

Building a search app with Django and Haystack

Goal

The goal of this tutorial is to build a search app using Django and Haystack. You will learn how to use Django commands to initialize a database with emoji data. You will also learn how to add search to a Django project using Haystack.

Upon completion, you will have built an app that allows you to search over a thousand emojis. This app also gives you the ability to copy any emoji to your clipboard with one click.

Before you start

Make sure you meet the following prerequisites before starting the tutorial steps:

This project depends on Pipenv. Pipenv allows you to download and install versions of packages in a virtual environment.

Another prerequisite is Elasticsearch. An Elasticsearch instance needs to run separate from the app.

Installing packages

The app depends on the following packages: Django, django-haystack, the elasticsearch Python client, and requests.

Open up a terminal prompt and create a directory called emoji-in-the-haystack:

mkdir emoji-in-the-haystack
cd emoji-in-the-haystack

Install the packages:

pipenv install django==3.0.7
pipenv install git+https://github.com/django-haystack/django-haystack.git#egg=django-haystack
pipenv install elasticsearch==5.5.3
pipenv install requests==2.24.0

You’ll see a bunch of colorful output and a couple of 🐍 emojis. In this directory, you should now see the files Pipfile and Pipfile.lock.

You’re ready to create a Django project.

Setting up a Django project and app

After installing the packages, the next step is to create a Django project.

Activate your virtual environment:

pipenv shell

You should now see your terminal prompt prefixed with (emoji-in-the-haystack).

Create a Django project called emoji_haystack:

django-admin startproject emoji_haystack .

The directory should now look like this:

├── Pipfile
├── Pipfile.lock
├── manage.py
└── emoji_haystack
   ├── __init__.py
   ├── asgi.py
   ├── settings.py
   ├── urls.py
   └── wsgi.py

Create a Django app called search:

python manage.py startapp search

The directory should now look like this:

├── Pipfile
├── Pipfile.lock
├── manage.py
├── emoji_haystack
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
└── search
   ├── __init__.py
   ├── admin.py
   ├── apps.py
   ├── migrations
   │   └── __init__.py
   ├── models.py
   ├── tests.py
   └── views.py

You need to enable the newly created app.

Update the INSTALLED_APPS setting in settings.py:

INSTALLED_APPS = [
   'django.contrib.admin',
   'django.contrib.auth',
   'django.contrib.contenttypes',
   'django.contrib.sessions',
   'django.contrib.messages',
   'django.contrib.staticfiles',

   'search.apps.SearchConfig',
]

To test that everything is working, run the app:

python manage.py runserver

Navigate to http://127.0.0.1:8000/ and confirm that the app is working.

Note: You can run python manage.py migrate to get rid of the Django warnings when running the app.

Emoji data

The next step is to create a Django model class to represent the emoji data.

Update models.py:

from django.db import models


class Emoji(models.Model):
    name = models.CharField(
        max_length=50,
    )
    code = models.CharField(
        max_length=50,
    )

You need to store the name for each emoji. For example, “grimacing face” is the name given to 😬. You also need to store the code for an emoji. These code points are unique for every emoji. The app stores each code in a form that the browser can render as an emoji.

After creating the model, run a migration to apply these changes to the database:

python manage.py makemigrations --name add_emoji_model search
python manage.py migrate

The next step is to create a new directory for the Django command. Django commands are special scripts registered in Django projects.

The command in this app retrieves emoji data and saves it to the database using the Emoji model class. This command must live in the new directory.

Create the new directory:

cd search
mkdir management
cd management
mkdir commands
cd commands

Inside this commands directory, create the initemojidata command:

touch initemojidata.py

The directory should now look like this:

├── Pipfile
├── Pipfile.lock
├── db.sqlite3
├── emoji_haystack
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
├── manage.py
└── search
   ├── __init__.py
   ├── admin.py
   ├── apps.py
   ├── management
   │   └── commands
   │       └── initemojidata.py
   ├── migrations
   │   ├── 0001_add_emoji_model.py
   │   └── __init__.py
   ├── models.py
   ├── tests.py
   └── views.py

Here is the code to retrieve and save emoji data:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
import json
import requests

from django.core.management.base import BaseCommand, CommandError

from search.models import Emoji


EMOJI_JSON_URL = 'https://raw.githubusercontent.com/iamcal/emoji-data/master/emoji.json'


class Command(BaseCommand):
    help = 'Initialize database with emoji data'

    def add_arguments(self, parser):
        parser.add_argument(
            '--dry-run',
            action='store_true',
            default=False)

    def execute(self, *args, **options):
        self.count = 0

        try:
            super().execute(*args, **options)
        except KeyboardInterrupt:
            self.stdout.write('')

        self.stdout.write(self.style.SUCCESS(
            'Emojis created: {}'.format(self.count)))

    def handle(self, *args, **options):
        self.dry_run = options['dry_run']

        emojis = self.get_emojis()

        for emoji in emojis:
            if not emoji.get('name'):
                continue

            code = self.handle_code(emoji)
            name = emoji['name'].lower()
            self.stdout.write(
                '{} - {}'.format(name, code))

            if not self.dry_run:
                emoji = Emoji(
                    name=name,
                    code=code)

                emoji.save()

            self.count += 1

    def get_emojis(self):
        response = requests.get(
            url=EMOJI_JSON_URL)

        emojis = json.loads(response.content)

        return emojis

    def handle_code(self, emoji):
        """
        U+1F1EC, U+1F1FE - > &#x1F1EC&#x1F1FE
        """
        unified = emoji.get('non_qualified') or emoji.get('unified')
        unified = unified.split('-')

        codes = []
        for code in unified:
            _code = '&#x' + code
            codes.append(_code)

        return ''.join(codes)

The syntax for Django commands may take some time to get used to. Django commands require a Command class definition that subclasses BaseCommand. This class requires a handle() method. Your logic goes in here.

I use the execute() method to define some variables to count and output the number of items updated when a command finishes running.

On line 35, the get_emojis() method defined on the class gets called using the self property. The method makes a request to the URL defined on line 9. This endpoint is a JSON file hosted on GitHub.

It may not include the newest emojis but it’s the best option for this app. The Emojipedia API is no longer available for public use. Typically you need to handle errors when making API requests but it’s fine to leave out here.

The command retrieves the emoji data and begins to process each data item on line 37. It ignores data items with no name field. On line 41, the command calls the handle_code(). This method transforms the emoji unicode data into a string that gets stored in the database. The transformation of this unicode data makes it possible to render emojis in HTML. More on this later.
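To see the transformation in isolation, here is a small standalone sketch that mirrors handle_code(). The function name to_html_references is made up for this example. It uses html.unescape from the standard library as a rough stand-in for what the browser does with the stored string.

```python
import html


def to_html_references(unified):
    """Turn 'XXXX-YYYY' code points into '&#xXXXX&#xYYYY'."""
    return ''.join('&#x' + code for code in unified.split('-'))


# A flag emoji is made of two code points joined by a hyphen in the data.
flag = to_html_references('1F1EC-1F1FE')

# Decoding the references yields the actual emoji characters.
decoded = html.unescape(flag)
```

Here flag holds the string that gets saved in the code field, and decoded holds the two regional indicator characters that display as a single flag.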

You can run this command with an optional dry_run argument. Providing this argument means you can test your Django command logic without saving anything to the database. If this argument is not passed in when running the command, the command creates an Emoji object with name and code set and saves it to the database.
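The parser passed to add_arguments() is built on Python's argparse, which is why the --dry-run flag shows up in options under the key dry_run. This standalone sketch uses plain argparse, outside of Django, to show that behavior:

```python
import argparse

parser = argparse.ArgumentParser()
# store_true makes the option a flag: present means True, absent means False.
parser.add_argument('--dry-run', action='store_true', default=False)

# argparse converts the hyphen in the flag name to an underscore.
with_flag = vars(parser.parse_args(['--dry-run']))
without_flag = vars(parser.parse_args([]))
```

with_flag contains dry_run set to True and without_flag contains dry_run set to False, matching what handle() reads from options['dry_run'].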

Django commands are run from the root of the project.

Run the Django command (--dry-run option):

python manage.py initemojidata --dry-run

Run the Django command (no regrets option):

python manage.py initemojidata

The emoji data is now stored in the database.

Haystack setup

Haystack makes it easy to add custom search to Django apps. You write your search code once and can go back and forth between search backends as you please. You can choose to use different search backends like Elasticsearch, Solr, and others. This tutorial uses Elasticsearch.

Integrating Haystack consists of creating a search index model and updating a couple of Django settings.

The search index model corresponds to the database model defined earlier. Haystack requires this file to know what data to place in the search index.

Inside the search app directory, create a search_indexes.py file:

cd search
touch search_indexes.py

Here’s what the code for that looks like:

import datetime

from haystack import indexes
from search.models import Emoji


class EmojiIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)

    def get_model(self):
        return Emoji

When you make a search query, Haystack searches the text field. This field corresponds to the name field defined in the Emoji model.

Next, include the URLs provided by Haystack in urls.py. Django implicitly calls a custom Haystack view that handles search requests and returns responses. This response uses an HTML template that you need to create and configure. More on this later.

from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('admin/', admin.site.urls),
    path('search/', include('haystack.urls')),
]

You need to enable the Haystack app.

Update the INSTALLED_APPS setting in settings.py:

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',

    'search.apps.SearchConfig',

    'haystack',
]

Add a connection to Elasticsearch in settings.py:

# Haystack configuration
# https://haystacksearch.org

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch5_backend.Elasticsearch5SearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'haystack',
    },
}

Haystack setup continued

The following steps are cumbersome but they are essential in getting Haystack to work.

In settings.py, update the TEMPLATES setting:

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [os.path.join(BASE_DIR, 'templates')],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

From the root of the project, create a templates directory:

mkdir templates
cd templates

Creating a single project-level templates directory is a recognized Django pattern.

In the templates directory, create a search directory and a file called search.html:

mkdir search
cd search
touch search.html

In the search directory, create an indexes directory:

mkdir indexes
cd indexes

In the indexes directory, create a search directory and a file called emoji_text.txt:

mkdir search
cd search
touch emoji_text.txt

Here’s what emoji_text.txt should look like:

{{ object.name }}

Haystack uses this data template to build the document used by the search engine.

The final directory structure should look like this:

├── Pipfile
├── Pipfile.lock
├── db.sqlite3
├── emoji_haystack
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
├── manage.py
├── search
│   ├── __init__.py
│   ├── admin.py
│   ├── apps.py
│   ├── management
│   │   └── commands
│   │       └── initemojidata.py
│   ├── migrations
│   │   ├── 0001_add_emoji_model.py
│   │   └── __init__.py
│   ├── models.py
│   ├── search_indexes.py
│   ├── tests.py
│   └── views.py
└── templates
   └── search
      ├── indexes
      │   └── search
      │       └── emoji_text.txt
      └── search.html

Search template

Now it’s time to update search.html. This template contains a text field to type in a search query, a button that fires a search request and some template variables. Use the template example found here.

Note: Remove {% extends 'base.html' %} at the top of the file.

The main differences in the template for this tutorial are the following two lines:

{% for result in page.object_list %}
   <p>{{ result.object.code|safe }}</p>
   <p>{{ result.object.name }}</p>
{% empty %}

object_list is a list of search results. For each search result, display the emoji and its name. result.object provides direct access to the Emoji model and its database fields.

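Displaying the emoji requires the safe Django filter. The filter marks the stored code as safe to output without further HTML escaping, so the browser receives the character reference intact and renders it as an emoji.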
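To see why escaping would break the output, this sketch uses html.escape from the Python standard library as a stand-in for Django's automatic escaping. It does not run the Django template engine, and the variable names are only for illustration.

```python
import html

# The stored code for "grimacing face".
code = '&#x1F62C'

# Escaping rewrites the ampersand, so the browser would show the literal
# text "&#x1F62C" instead of the emoji.
escaped = html.escape(code)

# Left unescaped, the reference decodes to the emoji character.
rendered = html.unescape(code)
```

The escaped value starts with &amp;, which is what a template would emit without the safe filter.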

Running Elasticsearch

Navigate to the location of your Elasticsearch installation and start an instance. For example, say you downloaded Elasticsearch in your Downloads folder:

cd Downloads
cd elasticsearch-5.5.3
cd bin
./elasticsearch

Haystack ships with a set of Django commands that handle indexing the emoji data stored in the database. This tutorial uses the rebuild_index command. This command rebuilds the search index by first clearing it and then updating it. Have a look at the source code for more info.

From the root of the project, run the command:

python manage.py rebuild_index

Run the app:

python manage.py runserver

Navigate to http://127.0.0.1:8000/search and confirm that the app is working.

If you query for “cat,” you get back a list of results. If you query for “flag,” you get back results for flag emojis.

If you scroll to the bottom, you’ll see a Previous button and Next button. Haystack returns at most 20 results per page. This out of the box feature is awesome. The layout needs a little bit of work though.
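If 20 results per page does not suit your layout, Haystack reads the page size from a Django setting. A minimal sketch for settings.py, where the value 40 is an arbitrary choice for illustration:

```python
# settings.py
# Number of search results Haystack shows per page (the default is 20).
HAYSTACK_SEARCH_RESULTS_PER_PAGE = 40
```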

Bootstrap + clipboard.js

You can use Bootstrap to clean up the design. Another useful feature is copying an emoji to your clipboard by clicking on it, which clipboard.js makes easy.

Load Bootstrap and clipboard.js from CDN in search.html:

<script src="https://cdn.jsdelivr.net/npm/clipboard@2/dist/clipboard.min.js"></script>

<!-- Bootstrap CSS -->
<link rel="stylesheet"
href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css"
integrity="sha384-MCw98/SFnGE8fJT3GXwEOngsV7Zt27NXFoaoApmYm81iuXoPkFOJwJ8ERdknLPMO"
crossorigin="anonymous">

{% block content %}

A couple of Bootstrap <div> elements and some styling updates go a long way in improving the look of the app.

Including the data-clipboard-text attribute on the emoji button lets you copy emojis to your clipboard:

        {% if query %}
            <h3>Results</h3>

            <div class="container">
            <div class="row">
            {% for result in page.object_list %}
                <div class="col-sm">
                    <button type="button" class="btn" data-clipboard-text="{{ result.object.code|safe }}" style="font-size:90px;">{{ result.object.code|safe }}</button>
                    <p style="text-align: center">{{ result.object.name }}</p>
                </div>
            {% empty %}
                <p>No results found.</p>
            {% endfor %}
            </div>
            </div>

The last thing to do is to initialize clipboard.js in search.html:

{% endblock %}

<script>
    var clipboard = new ClipboardJS('.btn');

    clipboard.on('success', function(e) {
        console.log(e);
    });

    clipboard.on('error', function(e) {
        console.log(e);
    });
</script>

Run the app with these new changes:

python manage.py runserver

Navigate to http://127.0.0.1:8000/search and confirm the changes. This looks much better. The emojis are more prominent and the click-to-copy feature is the 🍒 on top.

What you’ve learned

Rejoice and show your friends how to find the emoji in the haystack. If you’re up for the challenge, see if you can make the following app improvements:

  • Load a subset of emojis on the homepage before a user searches

  • Add a navigation bar to filter by emoji category

  • Add support for newer emojis

Webhook signatures for fun and profit

Webhooks - less painful than playing hooky by skipping work.

Image source: Encyclopedia SpongeBobia

What's a webhook?

Application programming interfaces (APIs) consist of client requests and server responses. Webhooks are the reverse of APIs! A third-party service (e.g. server) will send data to one or more configured listeners (e.g. clients). You can set up a listener to consume webhook events by following these steps:

  1. create a new URL in your web application to listen for events (e.g. mycoolapp.com/webhooks)
  2. create a secret token with your third-party service (e.g. GitHub repository settings)
  3. give your application access to this secret token (e.g. environment variables)
  4. deploy the application to listen for requests
  5. verify the webhook signature found in each request
  6. if the signature passes this verification step, process the event data
  7. if it doesn't pass, raise an error

Webhooks allow us to get information in real-time. Let's say we want to find out if a task has finished. Instead of polling an API and asking for the state of a task, webhooks automatically notify us when a task is done. All we have to do is verify the webhook signature.

Companies like Stripe and Twilio provide developers with software development kits (SDKs). These SDKs typically verify signatures for you. If not, have no fear! We can manually verify these signatures using Python.

Note: the terms "third-party" and "authorized users" will be used interchangeably from here on out.

Trust...

Let's assume our application was partly compromised. Our webhook URL is now public and out in the open. How do we differentiate authorized users from bad actors? Our application and the third-party service need some way to authenticate messages. One way to achieve this is to use a hash-based message authentication code (HMAC).

First, an authorized user sends a signature with every request to our application. Next, our application computes the expected signature by combining HMAC with our secret token. It compares both signatures and allows requests from this user if the signatures match. Bad actors would have a hard time trying to fool us without this secret token.

Now that we've covered secret tokens, let's take a look at the code to manually verify signatures.

...but verify


We define a request "object" on line 20. We use this object to represent a request that would normally be sent by an authorized user. This request has a signature, which is a bytes string. Let's assume the signature in the request is valid. The goal of our application is to calculate this signature using HMAC and our secret token.

The shared secret is hardcoded on line 7 for demonstration purposes. Remember, the secret should be stored as an environment variable on your server!

Next, we use the hmac and the hashlib Python modules to create a hashing object on line 9.

The method signature for the new() method is: hmac.new(key, msg=None, digestmod=''):
  • key is set to the secret token encoded in bytes
  • msg is set to the request body encoded in bytes
  • digestmod is set to the SHA-1 hashing algorithm

We get the expected signature on line 14 by encoding the digest of our hashing object using Base64. You might be able to skip this step. You should confirm if the data you receive is encoded using Base64.

On line 16 we compare the signature found in the request with the signature we expect. You typically use the == operator when comparing values in Python. Do not do this here! Heed the following warning found in the Python documentation:

Warning: When comparing the output of digest() to an externally-supplied digest during a verification routine, it is recommended to use the compare_digest() function instead of the == operator to reduce the vulnerability to timing attacks.

On line 25 we combine all of this together and verify the request. We display a thumbs up emoji for authorized users and a red light emoji for bad actors!
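As a recap, here is a minimal sketch of the steps described above, using only the Python standard library. It is an illustration under assumptions, so the line numbers mentioned in the walkthrough do not map onto it. The secret value, the request dictionary, and the function names are all hypothetical.

```python
import base64
import hashlib
import hmac

# Hardcoded for demonstration only; store it as an environment variable.
SECRET_TOKEN = 'my-hypothetical-secret'


def compute_signature(secret, body):
    """Return the Base64-encoded HMAC-SHA1 digest of the request body."""
    hashing_object = hmac.new(
        key=secret.encode('utf-8'),
        msg=body,
        digestmod=hashlib.sha1,
    )
    return base64.b64encode(hashing_object.digest())


def verify(secret, body, signature):
    """Compare signatures in constant time to resist timing attacks."""
    expected = compute_signature(secret, body)
    return hmac.compare_digest(expected, signature)


# A stand-in for a request sent by an authorized user.
body = b'{"task": "finished"}'
request = {'body': body, 'signature': compute_signature(SECRET_TOKEN, body)}

# Thumbs up for authorized users, a warning light for bad actors.
if verify(SECRET_TOKEN, request['body'], request['signature']):
    status = '👍'
else:
    status = '🚦'
```

Whether the signature is Base64-encoded, hex-encoded, or raw depends on the third-party service, so check its documentation before copying the encoding step.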

Wrapping up

I took a cybersecurity course my last semester in college. I'd be lying if I told you I enjoyed writing C code and setting up Ubuntu virtual machines on my Windows laptop. That being said, it's awesome seeing the theories I learned in school put to practice.

Check out these links with more information on HMAC and webhook security:

My initial thoughts on Posthaven

I've been a Posthaven user for less than a week. Here's what I've gathered about the platform:

  • Having limited themes is a good thing - I can focus more on creating content and not spend 10 hours choosing a theme;
  • The editor is rough around the edges - I wish there were support for Markdown, and editing links after inserting them is broken;
  • SEO support is lacking - I'm not worried about ranking on Google but I would like the links I share on Slack to look nice;
  • Clicking "Save as Draft" is fun.

It's good enough for me. I can afford the $5 a month and the fee is a good forcing function to get me to write.

It also looks like Posthaven is still being maintained. Their Twitter account is active and one can request features. If you're reading this, you should go vote!