
Chapter Introduction: System Documentation

Documentation: no one wants to create it, but everyone wants to have it. If we have done our due diligence by creating docstrings, we are most of the way there. We can turn our docstrings into formal documentation via a framework called Sphinx. Better yet, we can automatically generate this documentation each time we push to our remote git repository. In addition, we'll cover some topics and tools to round out your project documentation.

Revisiting Docstrings

So far we have written fairly straightforward docstrings in the reStructuredText (reST) format. However, as you might have guessed, we could employ other styles, such as the Google and NumPy conventions.
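As a rough illustration of what other styles look like, the sketch below writes the same docstring in the Google and NumPy conventions. The function names and the churn-rate calculation are hypothetical and serve only to carry the docstrings; they are not taken from the project code.

```python
def churn_rate_google(churned, total):
    """Compute the share of customers who churned (Google style).

    Args:
        churned: Number of customers who churned.
        total: Total number of customers.

    Returns:
        The churn rate as a fraction between 0 and 1.
    """
    return churned / total


def churn_rate_numpy(churned, total):
    """Compute the share of customers who churned (NumPy style).

    Parameters
    ----------
    churned : int
        Number of customers who churned.
    total : int
        Total number of customers.

    Returns
    -------
    float
        The churn rate as a fraction between 0 and 1.
    """
    return churned / total
```

Both functions behave identically; only the docstring layout differs. If we pick one of these styles, Sphinx can still read it through the sphinx.ext.napoleon extension.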

Type Hinting

Type hinting is a nifty little way to supply more context about our functions. In the code below, we communicate that the argument db_secret_name is expected to be a string, and the function get_most_recent_db_insert returns a string.
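A minimal sketch of such a signature might look like the following. Only the function name, the argument name, and the two str hints come from the text; the body is a hypothetical stand-in that reads from an in-memory dictionary rather than a real database, and the secret name and timestamp are made up for illustration.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for a real database lookup, keyed by secret name.
_FAKE_INSERT_LOG = {
    "churn-model-mysql": datetime(2021, 1, 15, 8, 30, tzinfo=timezone.utc),
}


def get_most_recent_db_insert(db_secret_name: str) -> str:
    """Return the timestamp of the most recent database insert.

    :param db_secret_name: name of the secret holding the database credentials
    :returns: timestamp of the most recent insert as an ISO 8601 string
    """
    # A real implementation would fetch credentials and query the database.
    return _FAKE_INSERT_LOG[db_secret_name].isoformat()
```

Python does not enforce these hints at runtime, but editors and static checkers such as mypy can use them, and they remain available for inspection via get_most_recent_db_insert.__annotations__.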

Documenting Code with Sphinx

Sphinx is probably the most popular framework for documenting code. It keys off our docstrings, meaning we can create sleek-looking documentation with little effort. Better yet, we can automatically update our documentation each time we make a code change by building the process into our buildspec.yml. However, we first need to do some setup.

We first need to create a docs directory.

$ mkdir docs && cd "$_"

We can then run sphinx-quickstart to create our project, filling out the appropriate details.

$ sphinx-quickstart

Next, we shall configure our conf.py file.
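What goes into conf.py depends on the project, so the following is only a sketch under stated assumptions: that sphinx-quickstart was run with separate source and build directories (so conf.py lives in docs/source), that the code to document sits at the repository root, and that the project name shown is a placeholder.

```python
# docs/source/conf.py (sketch; paths and project name are assumptions)
import os
import sys

# Make the repository root importable so autodoc can find our modules.
# Sphinx runs conf.py from its own directory, so "../.." points two levels up.
sys.path.insert(0, os.path.abspath(os.path.join("..", "..")))

project = "churn-model"  # placeholder name

extensions = [
    "sphinx.ext.autodoc",   # pull documentation from docstrings
    "sphinx.ext.napoleon",  # also understand Google and NumPy style docstrings
]

html_theme = "alabaster"  # the default Sphinx theme
```

If conf.py sits directly in docs instead, the path passed to sys.path.insert would only need to go up one level.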

We also must configure our index.rst file, which will actually declare what files we want to document.

We shall now create the actual documentation files!

$ make html

Let's use the following aws-cli command to copy our generated site to the S3 bucket that hosts our website.

$ aws s3 cp docs/ s3://churn-model-ds-docs/docs --recursive

We can use the following Pulumi script to create a static website on S3 to host our documentation.

Using the above as inspiration, we can amend our buildspec.yml so that our CI/CD pipeline automatically updates our documentation! (We need to make sure our full docs directory is part of our .gitignore.) We simply need to run the following sequence of commands.

$ cd docs
$ make html
$ cd ..
$ aws s3 cp docs/ s3://churn-model-ds-docs/docs --recursive

We can now visit our Route 53 domain to view our documentation! If we configured /docs/build/html/index.html as the index document, then our main documentation should appear at our home path.

Adding API Documentation with Swagger

We can use flasgger to automatically generate a documentation endpoint for our API. We need a particular style of docstring, which can be found in the example below and in the library's documentation. We simply need to pip install the library, import it, and then call swagger = Swagger(app) in our application. We can then visit http://localhost:5000/apidocs/ to see our documentation. If you're using Flask Talisman, you might have to make adjustments to the content security policy.

The ReadMe

The ReadMe is a core component of every git repo. It's documentation for all developers who will work on the repo, including yourself. Knowing how to develop a strong ReadMe can admittedly be a little tricky. And, well, no one really likes to write documentation, right? We can make the process a bit less painful by creating a template we can repeatedly follow. The below is the ReadMe for our churn project. The general structure can be cribbed for other data science projects.

Building Lucidchart Diagrams

Lucidchart is a powerful tool for creating diagrams of software systems. There is a robust free option, and it's remarkably intuitive. Below is a Lucidchart diagram for our production application. Lucidchart is easy enough to use that I believe you could readily replicate what I have constructed.

[Figure: Lucidchart diagram of the production application]

Applied Full Stack Data Science