Unix 101 : Showing non-printing characters in text files (e.g. DOS files)

A non-printing character is a character that is not printed (or displayed) directly but is interpreted instead. Examples include the line feed and the tab. How these characters are interpreted can differ from one system to the next. For example, the end-of-line marker differs between Unix and DOS: Unix uses a single line feed (LF), whereas DOS uses a carriage return followed by a line feed (CR LF).
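As a minimal sketch of that difference, the two line-ending styles can be dumped byte by byte. The od utility and the sample string "hello" are assumptions chosen for illustration; they are not part of the original example.

```shell
# Unix-style line ending: a single line feed, which od -c prints as \n
printf 'hello\n' | od -c

# DOS-style line ending: a carriage return followed by a line feed,
# which od -c prints as \r followed by \n
printf 'hello\r\n' | od -c
```

The second dump is one byte longer than the first: the extra byte is the carriage return.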

If you need an easy way to check whether a text file is DOS or Unix formatted (they differ in their end-of-line character(s), for example), or if you wish to display the normally non-printing characters of a text file, you can use the -vET command line switches of the cat utility (the switches shown here are those of GNU cat).

As explained in the man page :

  • -v : uses the ^ and M- notation for control characters and for bytes with the high bit set (line feed and tab excepted)
  • -E : displays a $ at the end of each line
  • -T : displays tab characters as ^I

For example :

% cat -vET test.txt
a test message for$
showing the^Iuse of   cat -vET$
^I$
that is all$

Compare this with the output of plain cat, without any command line option :

% cat test.txt
a test message for
showing the     use of   cat -vET

that is all
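To reproduce the two outputs above, a sample file can be built with printf. This is only a sketch: the exact bytes of the original test.txt are an assumption inferred from the cat -vET output, and the -E and -T switches assume GNU cat (BSD and macOS cat offer -e and -t instead).

```shell
# Recreate a file matching the example above (contents inferred, not the
# author's original file). \t is a tab; each line ends with a line feed.
# Note: this overwrites any existing test.txt in the current directory.
printf 'a test message for\nshowing the\tuse of   cat -vET\n\t\nthat is all\n' > test.txt

# GNU cat: tabs appear as ^I and each end of line is marked with $
cat -vET test.txt

# Plain cat: tabs and line ends are interpreted by the terminal, not shown
cat test.txt
```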

You can easily tell which blank space is a space and which is actually a tab. If you see ^M characters (carriage returns) at the end of the lines, then this is a DOS text file, which you might need to convert to Unix-style line endings with a utility such as dos2unix.
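The ^M check can be sketched as follows. The file names dos.txt and unix.txt are hypothetical, and the sed command is only a possible fallback for systems where dos2unix is not installed; it is not a description of what dos2unix does internally.

```shell
# Hypothetical DOS-formatted sample: each line ends with CR LF (\r\n).
printf 'first line\r\nsecond line\r\n' > dos.txt

# With GNU cat, each carriage return shows up as ^M just before the $ marker.
cat -vE dos.txt

# Count the lines ending with a carriage return; a non-zero count
# suggests DOS line endings.
cr=$(printf '\r')
grep -c "${cr}\$" dos.txt

# Fallback conversion: strip the carriage return that ends each line.
sed "s/${cr}\$//" dos.txt > unix.txt

# The ^M markers are gone from the converted copy.
cat -vE unix.txt
```

Writing the result to a separate file keeps the original untouched, so the two can be compared before anything is overwritten.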