I think that using text-based formats for as much as possible is vastly superior to using binary formats, and there are many reasons why.
Firstly, by ‘text based formats’ I mean that one writes a document and that file can be seen and edited in any one of a number of text editors on any platform, without a special piece of software.
There are a number of types of documents that you can write in plain old text - text oriented documents like this article, or a more scholarly essay, music, configuration files, data, graphs; all of which can be edited on tablets, phones and computers, on Windows, Mac and Linux boxes.
Another way to think about this is in terms of a 100 year infrastructure - what formats do you want your documents to be in, in 100 years' time? A significant number of documents written in the recent past will be in formats that will not exist on that timescale. Text will, though.
What you can do with your text documents
At this point we need to look at ‘the unix way’, by which I mean the philosophy on which all unix-based operating systems work: (1) text based configurations and (2) lots of small, task-specific applications and tools that each do one thing well. Many of these tools take text as their input and just as often produce text as their output. Many manipulate that text and transform it, or produce something else as a result of it.
What I mean by this is that I can take a text document such as this one and feed it into a tool like markdown or pandoc to produce HTML output. Or, using pandoc, I can take this same text and produce a PDF document from it.
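To make that conversion step concrete, here is a minimal sketch in Python that simply wraps the pandoc command line. The file names article.md and article.html and the helper names pandoc_command and convert are made up for illustration, and it assumes pandoc is installed and on the PATH (producing a PDF additionally needs a PDF engine such as a LaTeX installation).

```python
import shutil
import subprocess


def pandoc_command(source: str, target: str) -> list[str]:
    """Build the pandoc invocation that converts source into target.

    pandoc guesses the output format from the target's file extension,
    so "article.html" gives HTML and "article.pdf" gives a PDF.
    """
    return ["pandoc", source, "-o", target]


def convert(source: str, target: str) -> bool:
    """Run pandoc if it is available; return False if it is not installed."""
    if shutil.which("pandoc") is None:
        return False
    subprocess.run(pandoc_command(source, target), check=True)
    return True


# Hypothetical file names, shown only to illustrate the command being built.
print(" ".join(pandoc_command("article.md", "article.html")))
print(" ".join(pandoc_command("article.md", "article.pdf")))
```

The same plain text source feeds both commands; only the target changes.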
Or with a minor tweak I can take this document and filter it through a tool that will substitute all instances of ‘I’ with ‘I, Rohan’ and then produce a PDF. Tools such as sed make this quite straightforward and can achieve much in a single line. See this introduction to sed for a taster.
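As a rough illustration of that substitution, here is the same idea sketched in Python rather than sed. The function name personalise and the sample sentence are my own inventions for the example, and this is only one way of doing it.

```python
import re


def personalise(text: str, name: str = "Rohan") -> str:
    """Replace every standalone word 'I' with 'I, <name>'."""
    # \b matches a word boundary, so the 'I' inside "It" or "HTML" is left alone.
    # A function is used as the replacement so that any backslashes in the
    # name are inserted literally rather than read as regex escapes.
    return re.sub(r"\bI\b", lambda match: f"I, {name}", text)


sample = "I think text is best. It is what I use."
print(personalise(sample))
# prints: I, Rohan think text is best. It is what I, Rohan use.
```

One caveat with this simple pattern: a contraction such as "I'm" also matches, because the apostrophe counts as a word boundary, so a real filter would need a slightly more careful expression.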
If I have two copies of a configuration file in text I can simply ask the diff tool to compare them and it will give me a line by line breakdown of all additions, deletions and changes.
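To show the kind of breakdown I mean, here is a small sketch using Python's standard difflib module in place of the diff command itself. The two configuration snippets and the labels old.conf and new.conf are invented for the example.

```python
import difflib


def compare_configs(old_text: str, new_text: str) -> list[str]:
    """Return a unified diff of two text blobs, one diff line per list item."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # lineterm="" because splitlines() has already removed the newlines.
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="old.conf",
            tofile="new.conf",
            lineterm="",
        )
    )


# Two made-up versions of a configuration file.
old = "port = 8080\ndebug = true\n"
new = "port = 9090\ndebug = true\nlog = info\n"

for line in compare_configs(old, new):
    print(line)
```

Lines starting with - were removed, lines starting with + were added, and a changed line shows up as a removal followed by an addition.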
I can utilize the git suite of tools to provide versioning to this document and track all changes to a series of documents in a repository. In fact, this website is actually a git repository that has been filtered through a tool to produce the html that you now see when reading this.
It seems that there is a growing recognition of the value of text-based workflows, as they allow one to use a variety of tools, avoid both vendor and platform lock-in, can be quickly synchronised across devices and can be transformed as needed.
Some examples of the components of a workflow are:
- OrgMode for Emacs, or for Vim - a text based todo, calendar and notes system.
Scholarship and plain text
When I first started writing in an academic context I was using a Linux system1 and the available tools for document editing were limited. I chose OpenOffice, the reasonable alternative to Microsoft Word, and got on with writing.
Early on I realised that it would be useful to have some kind of bibliography tool to enter in the details of the references that I was coming across and so I searched around and found Bibdesk which I’ve been using ever since. Bibdesk enabled me to enter in the details, select the citation style and copy and paste the output into my documents. Pretty good - anything to make it faster and more consistent is a win.
But as a curious person I saw that Bibdesk was a frontend to a BibTeX database. And what is that? So I entered the world of TeX. Maybe it was a little driven by procrastination, but I was won over by the case for WYSIWYM and so moved my writing environment to LyX, which is a frontend to the TeX environment. Effectively, I was writing in plain text and letting the computer do all the thinking:
TeX is a typesetting language. Instead of visually formatting your text, you enter your manuscript text intertwined with TeX commands in a plain text file. You then run TeX to produce formatted output, such as a PDF file. Thus, in contrast to standard word processors, your document is a separate file that does not pretend to be a representation of the final typeset output, and so can be easily edited and manipulated.2
Not having to wrestle with formatting documents is extremely useful. You just get on and write: text is text, paragraphs work, headings are headings, lists are lists - you tell the computer what this unit of text is and it does the work to render it so. And 99% of the time you get what you are after without any fuss or wrangling. And it looks fantastic. The PDF output of the LaTeX process is spot-on, print-quality typesetting.