A journey from programming languages to modern IT infrastructure to be a great DevOps Engineer.

How I'm writing my book using Open Source tools

09 Apr 2023 - Marcelo Pinheiro

Writing a book has always been a dream of mine. As a programmer, I have always thought about how easy it should be to teach the basics of computer programming, using a different approach from the traditional one found in high schools and universities. I’m the kind of developer who learned far more by writing code than by picking up heavy theoretical books in my university library back in the day. My dream is to teach new developers in a friendlier way, with more practice followed by the theoretical fundamentals of computer science, keeping a good balance between both.

Between 2015 and 2016, I had the opportunity to work as a reviewer for two books published by Packt.

From both, I learned a lot, and not only by reviewing text, validating command inputs and outputs, suggesting new topics, or reworking some paragraphs. I dove deep into both technologies (Vagrant and Docker) to master the content and validate every bullet point during the chapter reviews. During this time, my desire to write a book grew, and I tried to start one, but given my professional situation I didn’t have enough time to start from scratch, in my own way and to my own taste. In those days I traveled a lot to help teams on a few projects and the routine was intense, so I had to stop writing for a while.

Then 2023 came, and last month I realized that the time had come to write my first book as a side hustle. I’m currently working as a Senior DevOps Engineer Consultant serving a few clients and, after some calendar organization, I finally have enough free time to do periodic brain dumps for every chapter, glue them together later, and finish my book. What started as a proof of concept became a concrete workflow for writing content, and I will share how I’m writing this book using a combination of Pandoc, Ruby, Node.js, and Python. So let me explain how the development of my book works at this point.

Environment Setup

First, we need to install Pandoc. Written in Haskell, this tool is basically a universal document converter. You can write Markdown or LaTeX files and generate Microsoft Word documents, PDF files, and even EPUB books. For now, I decided to write my book in Markdown, sometimes mixing in LaTeX blocks when I want something more robust (like writing mathematical functions, for example).

Since I’m a macOS user, I use Homebrew as my package manager to install software, so I will assume that you are familiar with this tool. For Linux users, most of the packages are available in the Ubuntu repositories, so you can use apt-get instead. Unfortunately, for Windows users things will be a little more complicated for now, because I don’t have a Windows computer to describe all the steps needed to reproduce the exact same setup.

OK, let’s start installing the following packages:

$ brew install \
    pandoc \
    librsvg \
    mactex \
    translate-shell \
    ghc \
    cabal-install

It will install the following tools:

Pandoc
librsvg (provides rsvg-convert, which Pandoc uses to convert SVG images when generating PDF files)
mactex (the TeX Live distribution for macOS, which provides LaTeX)
translate-shell (a command-line translator that uses Google Translate by default)
ghc (the Haskell compiler)
cabal-install (the Haskell package manager)

Next, you must have Node.js installed on your computer. As a developer, I commonly use nvm to install and manage Node.js versions, because I deal with different versions across my customers’ projects. The nvm installation is quite simple, as you can see here. At the moment I’m using v14.17.0, but you can use a more recent version. Let’s suppose you want to install the latest one:

$ nvm install v18.15.0

OK, it will result in the following output:

$ nvm install v18.15.0
Downloading and installing node v18.15.0...
Downloading https://nodejs.org/dist/v18.15.0/node-v18.15.0-darwin-x64.tar.xz...
######################################################################################################
Computing checksum with sha256sum
Checksums matched!
Now using node v18.15.0 (npm v9.5.0)

Let’s switch the current shell to this version (to make it the default for new shells as well, run nvm alias default v18.15.0):

$ nvm use v18.15.0
Now using node v18.15.0 (npm v9.5.0)

OK, it’s time to install gramma, an npm package that checks the grammar of files from the command line. For me it’s excellent, because I often introduce typos when I type too fast. Another awesome feature of gramma is the ability to check grammar in different languages, so you can check your files using en-US or en-GB for English, and you can check German or Dutch as well. Install gramma:

$ npm install --global gramma

added 119 packages in 14s

10 packages are looking for funding
  run `npm fund` for details
npm notice
npm notice New minor version of npm available! 9.5.0 -> 9.6.4
npm notice Changelog: https://github.com/npm/cli/releases/tag/v9.6.4
npm notice Run npm install -g npm@9.6.4 to update!
npm notice

Because I’m using several source code snippets in my book, I spent some time looking for a tool able to highlight code in a GitHub-like manner. At the moment, I’m using a Python package called pandoc-include to do this job. I tried using this custom Lua script, but it did not satisfy my needs when converting to both EPUB and PDF files; the Python package is better for now. Again, I’m assuming you are not familiar with pyenv, the equivalent of nvm for Python versions. Install pyenv using this link. After that, install the latest stable Python version:

$ pyenv install 3.11.2
python-build: use openssl@1.1 from homebrew
python-build: use readline from homebrew
Downloading Python-3.11.2.tar.xz...
-> https://www.python.org/ftp/python/3.11.2/Python-3.11.2.tar.xz
Installing Python-3.11.2...
python-build: use tcl-tk from homebrew
python-build: use readline from homebrew
python-build: use zlib from xcode sdk
Installed Python-3.11.2 to $HOME/.pyenv/versions/3.11.2

You will see output similar to this. To make this Python version the default, type:

$ pyenv global 3.11.2

OK, time to install pandoc-include:

$ pip3 install pandoc-include

Well, almost there. Now it’s time to visit Ruby and, just as I did with Node.js and Python, I use rvm to handle multiple Ruby versions. Let’s install it here. After completing the required steps (especially the GPG keys), run:

$ rvm install 3.2.2
Searching for binary rubies, this might take some time.
No binary rubies available for: osx/10.15/x86_64/ruby-3.2.2.
Continuing with compilation. Please read 'rvm help mount' to get more information on binary rubies.
Checking requirements for osx.
Certificates bundle '/usr/local/etc/openssl@1.1/cert.pem' is already up to date.
Requirements installation successful.
Installing Ruby from source to: $HOME/.rvm/rubies/ruby-3.2.2, this may take a
while depending on your cpu(s)...
ruby-3.2.2 - #downloading ruby-3.2.2, this may take a while depending on your connection...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 19.5M  100 19.5M    0     0  10.7M      0  0:00:01  0:00:01 --:--:-- 10.7M
No checksum for downloaded archive, recording checksum in user configuration.
ruby-3.2.2 - #extracting ruby-3.2.2 to $HOME/.rvm/src/ruby-3.2.2.....
ruby-3.2.2 - #configuring....................................................................
ruby-3.2.2 - #post-configuration.
ruby-3.2.2 - #compiling......................................................................
ruby-3.2.2 - #installing...............
ruby-3.2.2 - #making binaries executable...
Installed rubygems 3.4.10 is newer than 3.0.9 provided with installed ruby, skipping installation,
use --force to force installation.
ruby-3.2.2 - #gemset created $HOME/.rvm/gems/ruby-3.2.2@global
ruby-3.2.2 - #importing gemset $HOME/.rvm/gemsets/global.gems......................
ruby-3.2.2 - #generating global wrappers........
ruby-3.2.2 - #gemset created $HOME/.rvm/gems/ruby-3.2.2
ruby-3.2.2 - #importing gemsetfile $HOME/.rvm/gemsets/default.gems evaluated to empty gem list
ruby-3.2.2 - #generating default wrappers........
ruby-3.2.2 - #adjusting #shebangs for (gem irb erb ri rdoc testrb rake).
Install of ruby-3.2.2 - #complete
Ruby was built without documentation, to build it run: rvm docs generate-ri

It will take some time to compile some stuff depending on the platform you use, but it’s ok. In my case, I’m writing this post using my old MacBook Pro mid-2012, so I need to compile from scratch because this laptop is 11 years old. After finishing the installation, just type:

$ rvm use 3.2.2
Using $HOME/.rvm/gems/ruby-3.2.2

OK, now we have a running Ruby to use with our Rakefile as I will cover soon.

Finally, to finish the environment setup, I use a Pandoc template called Eisvogel, which makes it much easier for me to format the document and customize it as I want. The default LaTeX fonts are great for scientific purposes, but in my opinion they don’t sit well in a book, so let’s install the template now:

mkdir -p ~/.pandoc/templates/eisvogel && cd ~/.pandoc/templates
curl -L https://github.com/Wandmalfarbe/pandoc-latex-template/releases/download/v2.3.0/Eisvogel-2.3.0.tar.gz \
  --output eisvogel.tar.gz
tar zxvf eisvogel.tar.gz -C eisvogel

Voilà, now we have a functional template and all the environment tools required (at least for me) to start writing.
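Before moving on, it can help to confirm that everything is actually reachable from your shell. The following is a minimal sketch, not part of the book’s Rakefile: the helper names executable_path and missing_tools are hypothetical, and the list of executable names is my assumption about what the packages above install (for example, rsvg-convert from librsvg and trans from translate-shell).

```ruby
# Returns the full path of `tool` if an executable with that name exists
# in one of `path_dirs`, or nil otherwise.
def executable_path(tool, path_dirs)
  path_dirs
    .map { |dir| File.join(dir, tool) }
    .find { |candidate| File.file?(candidate) && File.executable?(candidate) }
end

# Returns the tools that could not be found in any of `path_dirs`
# (by default, the directories listed in the PATH environment variable).
def missing_tools(tools, path_dirs = ENV.fetch('PATH', '').split(File::PATH_SEPARATOR))
  tools.reject { |tool| executable_path(tool, path_dirs) }
end

# Assumed executable names for the packages installed in this section.
required_tools = %w[pandoc rsvg-convert trans gramma pandoc-include ruby]

missing = missing_tools(required_tools)
if missing.empty?
  puts 'All required tools were found.'
else
  puts "Missing tools: #{missing.join(', ')}"
end
```

If something shows up as missing, open a new terminal first: version managers such as nvm, pyenv, and rvm only adjust PATH after the shell configuration is reloaded.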

Using Rakefile

I like writing code in different languages, but for me Ruby is the easiest one for getting the job done. You can create an MVP using Ruby on Rails or write small backends with Cuba, just to name a few tools. For my book, I decided to use Rake to wrap the Pandoc calls in tasks, which makes my life easier and lets me reuse code. Here’s a snapshot:

FILE_NAME = "book-example"

BOOK_DIR = "book"
OUT_DIR = "out"
SOURCE_LANGUAGE_CODE = :en
SOURCE_COUNTRY_CODE = :US

LANGUAGES = [
  :en,
  :pt_br
]

FILE_EXTENSIONS = [
  :pdf,
  :epub,
]

PANDOC_DEFAULT_LATEX_TEMPLATE = '~/.pandoc/templates/eisvogel/eisvogel.latex'
PANDOC_MAIN_CALL_ARGS = %{
  cd #{BOOK_DIR} && \
    pandoc \
      --template #{PANDOC_DEFAULT_LATEX_TEMPLATE} \
      --filter pandoc-include \
      --standalone \
      --table-of-contents
}

CONTENT_FILES = [
  'metadata.yml',
]

namespace :book do
  def sanitize(cmd)
    cmd.gsub(/\s+/, " ").gsub(/\\/, "").strip
  end

  def rake_workaround!(task)
    ARGV.each {|a| task a.to_sym do ; end }
  end

  def chapter_exists?(chapter_file_path)
    Dir[chapter_file_path].count == 1
  end

  def get_argv_values()
    file_extension = ARGV[1].to_s.downcase
    language = ARGV[2].to_s.downcase.gsub(/-/, '_')
    return file_extension, language
  end

  def get_content_files_args(language)
    # Sort explicitly so chapters are always passed to Pandoc in name order.
    chapter_files = Dir["#{BOOK_DIR}/#{language}/**.md"].sort.collect {|c| c.gsub(/#{BOOK_DIR}\//, '') }

    # Use + instead of concat so the CONTENT_FILES constant is not mutated.
    (CONTENT_FILES + chapter_files).join(' ')
  end

  def validate!()
    raise 'No arguments found' if ARGV.empty?

    file_extension, language = get_argv_values()

    raise 'Invalid file extension' unless FILE_EXTENSIONS.include?(file_extension.to_sym)
    raise 'Invalid language' unless LANGUAGES.include?(language.to_sym)
  end

  def get_output_path(language, file_name, file_extension)
    "#{OUT_DIR}/#{language}/#{file_name}.#{file_extension}"
  end

  def translate_to_gramma(language)
    idiom, country = language.to_s.split('_')
    country = SOURCE_COUNTRY_CODE.to_s if country.nil?
    # Build a code such as "en-US" or "pt-BR" for gramma's --language option.
    "#{idiom}-#{country.upcase}"
  end

  task :create_out_dir do
    folders = LANGUAGES.collect{|l| "#{BOOK_DIR}/#{OUT_DIR}/#{l}"}.join(" ")

    sh "mkdir -p #{folders}"
  end

  desc 'Build Book given format and language'
  task :build => [:create_out_dir] do |task|
    validate!()
    rake_workaround!(task)

    file_extension, language = get_argv_values()
    output_path = get_output_path(language, FILE_NAME, file_extension)
    content_files_args = get_content_files_args(language)

    output_args = %{
      --to #{file_extension} \
      --output #{output_path} \
    }

    cmd = %{
      #{PANDOC_MAIN_CALL_ARGS} \
        #{output_args} \
          #{content_files_args}
    }

    sh sanitize(cmd)
  end

  desc 'Open Book given format and language'
  task :look do |task|
    validate!()
    rake_workaround!(task)

    file_extension, language = get_argv_values()
    file_path = "#{BOOK_DIR}/#{OUT_DIR}/#{language}/#{FILE_NAME}.#{file_extension}"

    sh "open #{file_path}"
  end

  desc 'Translate Book given an language'
  task :translate do |task|
    raise 'No arguments found' if ARGV.empty?
    raise 'Invalid arguments passed' if ARGV.size > 2

    language = ARGV[1].to_s.downcase.gsub(/-/, '_').to_sym
    raise 'Unsupported language' unless LANGUAGES.include?(language)

    rake_workaround!(task)

    source_folder_path = "#{BOOK_DIR}/#{SOURCE_LANGUAGE_CODE}"
    destination_folder_path = "#{BOOK_DIR}/#{language}"

    sh "rm -rf #{destination_folder_path}"
    sh "mkdir -p #{destination_folder_path}"

    markdown_files_path = "#{source_folder_path}/*.md"
    markdowns = Dir[markdown_files_path]
    trans_language_code = /_|-/.match?(language) ? language.to_s.upcase.gsub(/_/, '-') : language

    markdowns.each do |src_md_file|
      file_name = File.basename(src_md_file)
      translate_command_call = %{
        trans #{SOURCE_LANGUAGE_CODE}:#{trans_language_code} \
          --brief \
          --input #{source_folder_path}/#{file_name} \
          --output #{destination_folder_path}/#{file_name}
      }

      sh sanitize(translate_command_call)
    end
  end

  desc 'Check Book grammar given a language chapter'
  task :grammar do |task|
    raise 'No arguments found' if ARGV.empty?
    raise 'Invalid arguments passed' if ARGV.size > 3

    language = ARGV[1].to_s.downcase.gsub(/-/, '_').to_sym
    raise 'Unsupported language' unless LANGUAGES.include?(language)

    rake_workaround!(task)

    chapter = ARGV[2].to_s
    chapter_file_path = "#{BOOK_DIR}/#{language}/#{chapter}*.md"
    raise 'No chapter found' unless chapter_exists?(chapter_file_path)

    grammar_check_command = "gramma check --language #{translate_to_gramma(language)} --markdown #{chapter_file_path}"
    sh grammar_check_command
  end
end

OK, I know that’s a lot of content to digest. Let’s dissect what this Rakefile does.

Deep Diving the Rakefile

I created several constants to make it easier to build the book in PDF and EPUB versions. Spend some time reading this Rakefile, then take a look at the following methods:

sanitize

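This method collapses runs of whitespace (including line breaks) into single spaces and removes backslash characters from the string passed as an argument, so a command written across several lines becomes a single-line shell command.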
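To see the effect in isolation, here is a small sketch. It is a variant of the idea rather than a copy of the Rakefile method: the hypothetical flatten_command helper strips the backslashes first and collapses whitespace afterwards, so that no double spaces are left behind when the input contains literal backslashes.

```ruby
# Turns a multi-line, backslash-continued command into a single line.
# Variant of the Rakefile's sanitize: backslashes are removed first,
# then every run of whitespace is collapsed into one space.
def flatten_command(cmd)
  cmd.gsub(/\\/, '').gsub(/\s+/, ' ').strip
end

# A quoted heredoc keeps the backslashes literally, like text pasted from a shell.
command = <<~'CMD'
  pandoc \
    --standalone \
    --to pdf
CMD

puts flatten_command(command)
# Prints: pandoc --standalone --to pdf
```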

rake_workaround!

This method tells Rake that the extra words passed on the command line are not Rake tasks, just arguments for the task being run. I prefer to pass arguments explicitly, as I would to a Bash script, instead of using Rake’s own syntax, rake task[arg_1,arg_2]; it’s more user-friendly. However, Rake treats every word on the command line as a task name, so to avoid the error Don’t know how to build task, we need to define an empty task for each argument. This method does that dirty job for us.
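The trick can be reproduced outside a Rakefile. The sketch below assumes the rake gem that ships with Ruby is available, and it uses Rake::Task.define_task directly instead of the task DSL method; the extra_args array is a stand-in for the words that would normally arrive through ARGV.

```ruby
require 'rake'

# Pretend the user ran: rake book:build pdf en
extra_args = %w[pdf en]

# Define an empty task for each extra word, so Rake finds a task with that
# name and silently does nothing when it reaches it.
extra_args.each do |name|
  Rake::Task.define_task(name.to_sym) { }
end

extra_args.each do |name|
  puts "#{name}: defined as a task? #{Rake::Task.task_defined?(name)}"
end
```

The trade-off is that a mistyped argument no longer produces a Rake error, which is why the tasks validate their arguments themselves.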

chapter_exists?

It checks whether the chapter number passed as an argument matches exactly one Markdown file in the given language folder (for example, the book/en folder).
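A quick way to convince yourself of the “exactly one match” rule is to try it against a throwaway directory. This sketch reuses the same one-line check; the temporary folder and the chapter file names are made up for the demonstration.

```ruby
require 'tmpdir'

# Same check as in the Rakefile: the glob must match exactly one file.
def chapter_exists?(chapter_file_path)
  Dir[chapter_file_path].count == 1
end

Dir.mktmpdir do |book_dir|
  en_dir = File.join(book_dir, 'en')
  Dir.mkdir(en_dir)
  %w[01_introduction.md 02_ruby.md].each do |name|
    File.write(File.join(en_dir, name), "# #{name}\n")
  end

  puts chapter_exists?(File.join(en_dir, '01*.md')) # one match
  puts chapter_exists?(File.join(en_dir, '09*.md')) # no match
  puts chapter_exists?(File.join(en_dir, '0*.md'))  # two matches, so ambiguous
end
```

Only the first call prints true; a prefix that matches several chapters is rejected just like a missing chapter.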

get_argv_values

It retrieves the ARGV values from the command line and normalizes them: in this case, the file extension (pdf or epub) and the language (en or pt_br).

get_content_files_args

This method builds the list of files passed to Pandoc: metadata.yml first, followed by all Markdown files from the given language folder, ordered by name.
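Chapter order matters because Pandoc concatenates its input files in the order they are given. Here is a self-contained sketch of that idea; the content_files helper and its fixed_files parameter are hypothetical names standing in for the Rakefile’s method and its CONTENT_FILES constant.

```ruby
require 'tmpdir'
require 'fileutils'

# Builds the file list for Pandoc: fixed files first, then the chapters of
# one language sorted by name, with paths relative to the book directory.
def content_files(book_dir, language, fixed_files = ['metadata.yml'])
  chapters = Dir[File.join(book_dir, language.to_s, '*.md')]
             .sort
             .map { |path| path.delete_prefix("#{book_dir}/") }
  fixed_files + chapters
end

Dir.mktmpdir do |book_dir|
  FileUtils.mkdir_p(File.join(book_dir, 'en'))
  # Create the chapters out of order on purpose.
  %w[03_python.md 01_introduction.md 02_ruby.md].each do |name|
    File.write(File.join(book_dir, 'en', name), "# #{name}\n")
  end

  puts content_files(book_dir, :en).join(' ')
  # Prints: metadata.yml en/01_introduction.md en/02_ruby.md en/03_python.md
end
```

Numeric prefixes such as 01_ and 02_ are what keep the chapters in reading order, so remember to zero-pad them once the book passes nine chapters.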

validate!

This method simply validates the arguments passed on the command line when invoking Rake tasks; in this case, we validate the file extension and the language.
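The parsing and validation steps can also be combined into one function that receives the arguments explicitly, which makes them easy to test without touching ARGV. This is only a sketch: parse_build_args and its keyword parameters are hypothetical names, and the allowed values mirror the FILE_EXTENSIONS and LANGUAGES constants of the Rakefile.

```ruby
# Normalizes raw command-line words into [format, language] symbols and
# raises ArgumentError with a specific message when something is off.
def parse_build_args(args, formats: %i[pdf epub], languages: %i[en pt_br])
  raise ArgumentError, 'Expected: <format> <language>' unless args.size == 2

  format = args[0].to_s.downcase.to_sym
  language = args[1].to_s.downcase.tr('-', '_').to_sym

  raise ArgumentError, "Invalid file extension: #{format}" unless formats.include?(format)
  raise ArgumentError, "Invalid language: #{language}" unless languages.include?(language)

  [format, language]
end

p parse_build_args(%w[PDF pt-BR])
# Prints: [:pdf, :pt_br]

begin
  parse_build_args(%w[docx en])
rescue ArgumentError => e
  puts e.message
  # Prints: Invalid file extension: docx
end
```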

get_output_path

This method returns the complete path of the artifact to be generated, given a language, file name, and file extension.

translate_to_gramma

This method does not call gramma itself; it converts a language symbol such as :pt_br into the language code passed to gramma’s --language option.
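As a standalone illustration of that conversion, here is a sketch that maps the symbols used in this post to codes in the en-US / pt-BR style. The gramma_language_code name and the default_country parameter are hypothetical; the default plays the role of SOURCE_COUNTRY_CODE for languages declared without a country part.

```ruby
# Converts a language symbol such as :pt_br into a "pt-BR" style code.
# Languages without a country part fall back to `default_country`.
def gramma_language_code(language, default_country: 'US')
  idiom, country = language.to_s.split('_')
  "#{idiom}-#{(country || default_country).upcase}"
end

puts gramma_language_code(:en)    # Prints: en-US
puts gramma_language_code(:pt_br) # Prints: pt-BR
```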

These are the tasks we now have. You can list them by invoking the following command:

$ rake -T

It will output the following tasks available:

rake book:build      # Build Book given format and language
rake book:grammar    # Check Book grammar given a language chapter
rake book:look       # Open Book given format and language
rake book:translate  # Translate Book given an language

OK. Now, how can we see this in practice?

Generating a book

It’s simpler than you might imagine. Just type in your terminal:

$ rake book:build pdf en

It will generate a PDF containing all chapters available in the book/en directory. This command produces the following output (I added backslashes for readability):

mkdir -p book/out/en book/out/pt_br
cd book && pandoc --template ~/.pandoc/templates/eisvogel/eisvogel.latex \
  --filter pandoc-include \
  --standalone --table-of-contents \
  --to pdf --output out/en/book-example.pdf \
  metadata.yml en/01_introduction.md en/02_ruby.md en/03_python.md
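The command above is assembled from strings, but the same invocation can be modeled as an argument list, which avoids quoting problems if a file name ever contains spaces. The sketch below is an alternative to what the Rakefile does, not a copy of it: pandoc_args and its keyword parameters are hypothetical, while the flags and the output path layout are the ones shown in the output above.

```ruby
# Builds the Pandoc invocation as an array of arguments, mirroring the
# flags printed by `rake book:build`. Nothing is executed here.
def pandoc_args(format:, language:, content_files:, template:, file_name: 'book-example')
  [
    'pandoc',
    '--template', template,
    '--filter', 'pandoc-include',
    '--standalone',
    '--table-of-contents',
    '--to', format.to_s,
    '--output', "out/#{language}/#{file_name}.#{format}",
    *content_files
  ]
end

args = pandoc_args(
  format: :pdf,
  language: :en,
  template: 'eisvogel.latex', # placeholder; use the real template path
  content_files: %w[metadata.yml en/01_introduction.md en/02_ruby.md]
)

puts args.join(' ')
```

An array like this can be handed to Kernel#system as separate arguments, for example system(*args, chdir: 'book'). In that form no shell is involved, so a path starting with ~ has to be expanded with File.expand_path first.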

To generate the EPUB equivalent:

$ rake book:build epub en

To take a look at the built file, you can type:

$ rake book:look epub en
$ rake book:look pdf en

Translating to another language

This is where translate-shell comes in. You can turn your Markdown files into translated files using this tool with almost no effort. Of course, you need to review the translated content to catch typos and badly translated sentences because, although the translation engine is great, a pair of human eyes will always detect some errors in the translation.

$ rake book:translate pt_br

It will create the folder book/pt_br containing all the translated chapters. It’s important to say that translate-shell literally translates everything it can, so if you have additional configuration files, they will be translated as well; you need to take care of restoring the original configuration.
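One way to notice when the translator has touched something it should not have is to compare the fenced code blocks of the source chapter with those of the translated one. This is a sketch of my own, not something the Rakefile does: fenced_blocks and code_preserved? are hypothetical helpers, and they assume chapters use triple-backtick fences that start at the beginning of a line.

```ruby
# Returns the contents of every fenced code block in a Markdown string.
# The fence is matched as three backticks at the start of a line.
def fenced_blocks(markdown)
  markdown.scan(/^`{3}[^\n]*\n(.*?)^`{3}/m).flatten
end

# True when both documents contain identical code blocks, in the same order.
def code_preserved?(source_md, translated_md)
  fenced_blocks(source_md) == fenced_blocks(translated_md)
end

fence = '`' * 3
source     = "Intro\n\n#{fence}ruby\nputs 'hello'\n#{fence}\n"
translated = "Introdução\n\n#{fence}ruby\nputs 'hello'\n#{fence}\n"
mangled    = "Introdução\n\n#{fence}ruby\ncoloca 'olá'\n#{fence}\n"

puts code_preserved?(source, translated) # Prints: true
puts code_preserved?(source, mangled)    # Prints: false
```

When the check fails, copying the affected blocks back from the source language folder is usually faster than fixing them by hand.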

Checking grammar

Now gramma helps us review the translated files to spot typos and fix them as we see fit. Gramma lets you store custom words, so if you want to add English words to gramma’s dictionary, you can save them and skip those checks in the future. To check the grammar of a file, just type in the console:

$ rake book:grammar pt_br <chapter_number>

It will review the chapter whose number you passed as an argument. Some prompts will be displayed so you can accept a suggested replacement or keep the original wording; fix things as you see fit.

Conclusion

OK, I know that is a lot of information to process. You can browse the source code hosted on GitHub here: Book Example.

If you have any questions, don’t hesitate to contact me by email. I would appreciate it a lot.