
File names, and what you need to know

TL;DR

If you work in any capacity with Microsoft Windows (especially if you're working in IT), you should always set Windows not to hide the extensions of filenames.

Microsoft and Microsoft Windows have taught us some core facts about files over the years:

  1. All files have extensions to their names: *.jpg, *.gif, *.xls, *.docx, etc.
  2. The extension of a filename tells you what the file is all about:
    • *.jpg is an image file,
    • *.gif is a short, funny movie [1],
    • *.xls is an old-format Microsoft Excel file,
    • *.docx is a new-ish-format Microsoft Word file.
  3. Because of facts 1 and 2 above, we don't need to see the filename extension when looking at lists of files, as we can trust Microsoft Windows' judgement as to what they are.

All of these core facts are wrong.

In the old DOS world, the names of files took the form <name>.<ext>, where <name> was limited to 8 characters and <ext> to 3 (the so-called "8.3" format).

This was just the DOS world, though. This format and the extension requirements were relaxed in Windows 95 (more than 20 years ago). As it happens, UNIX never forced file name formats like this, and nor did Apple's Macintosh operating systems.

While extensions can be useful, it's not true to say that files require them, or even that they always have them. The convention in the Linux world, for example, is for executable programs not to have filename extensions.

If a filename has an extension, however, the only inference you can draw from that is that the filename has an extension.

Consider the following:

  • A file was created in 2005 with Microsoft Word and was saved in the native format of that version, often identified as "Word 97-2003 Document (*.doc)". Let's say its name is AnnualAccountsReport2005.doc.
  • In 2015, you upgrade the file to the latest Microsoft Word format by renaming it, changing the file's extension from .doc to .docx, resulting in it having the new name AnnualAccountsReport2005.docx.

If we go by the rule that the extension tells us what's in the file, then this makes sense: we've changed the filename's extension, and because the extension tells us what's in the file, it must now be in the more modern Microsoft Word file format. Try it. What you'll see is that Microsoft Word reports an error when opening the renamed file, because the new extension doesn't match the old format. The file still uses that old format, as all you did was change its name.
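The mismatch can be seen without opening Word at all, by reading the first few bytes of the file. What follows is a minimal sketch, not a complete format detector: the function names are made up for illustration, and it only distinguishes the two containers involved here. A match on the legacy signature says "old-style Office container" (also used by old Excel and PowerPoint files), and a match on the ZIP signature says "ZIP archive" (which a .docx is, but so are many other things).

```python
# Leading bytes ("magic numbers") of the two container formats involved.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy compound file, used by Word 97-2003
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP archive, the container used by .docx


def sniff_word_container(header: bytes) -> str:
    """Classify a file by its first bytes, ignoring its name entirely."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"      # old-style container: renaming to .docx changes nothing
    if header.startswith(ZIP_MAGIC):
        return "zip"       # new-style container
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of the file on disk to classify it."""
    with open(path, "rb") as f:
        return sniff_word_container(f.read(8))
```

Running sniff_file on AnnualAccountsReport2005.docx after the rename should still give "ole2", assuming the file really was saved in the Word 97-2003 format: the name changed, the bytes did not.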

Now, because the first core fact is not true and the second core fact is not true, the third core fact isn't true either: hiding the extension of a filename is – it turns out – not a wise thing to do.

Remember the love bug? I do.

Also known as the "ILOVEYOU" computer virus, it owed part of its success to those core facts that I have just shown to be false. Most recipients of the e-mailed virus would have seen an attached file with the name LOVE-LETTER-FOR-YOU.txt, but its full name was LOVE-LETTER-FOR-YOU.txt.vbs. The .vbs was hidden because that's what Microsoft Windows does by default. Users double-clicked on it thinking it was a text file that would open in something like Notepad or WordPad, but the file was instead run as a script, which then caused damage to the user's computer.
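The trick is easy to reproduce on paper. The sketch below is an approximation under stated assumptions: it treats "hiding the extension" as simply dropping the final suffix, and the RUNNABLE set is a hypothetical, non-exhaustive list chosen for illustration, not anything taken from Windows itself.

```python
from pathlib import PureWindowsPath

# Hypothetical, non-exhaustive set of extensions that get run rather than viewed.
RUNNABLE = {".vbs", ".js", ".exe", ".bat", ".cmd", ".scr"}


def name_with_extension_hidden(filename: str) -> str:
    """Approximate what a file list shows when the final extension is hidden."""
    return PureWindowsPath(filename).stem


def looks_disguised(filename: str) -> bool:
    """True if a runnable final extension sits behind another, harmless-looking one."""
    suffixes = [s.lower() for s in PureWindowsPath(filename).suffixes]
    return len(suffixes) >= 2 and suffixes[-1] in RUNNABLE


print(name_with_extension_hidden("LOVE-LETTER-FOR-YOU.txt.vbs"))  # LOVE-LETTER-FOR-YOU.txt
print(looks_disguised("LOVE-LETTER-FOR-YOU.txt.vbs"))             # True
```

With extensions shown, no such check is needed: the .vbs is right there in the name.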

Unfortunately, modern Linux desktop environments seem to work – unnecessarily – on this same model that the filename extension informs the system as to the file's contents, which then informs the system what tool to use to open it.

Over the years I have encountered many software development issues that have been solved by prompting the developer to look inside the file rather than rely on the filename's extension.
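"Looking inside the file" can be as simple as comparing its first bytes with what its extension claims. The following is a rough sketch of that idea, assuming a small hand-picked table of well-known signatures; the table and the function name are illustrative only, and a real tool (such as the UNIX file command) knows far more formats.

```python
import os

# Small, hand-picked table: extension -> byte sequences such a file normally starts with.
SIGNATURES = {
    ".jpg": (b"\xff\xd8\xff",),
    ".gif": (b"GIF87a", b"GIF89a"),
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".pdf": (b"%PDF-",),
}


def extension_matches_content(filename: str, header: bytes):
    """Return True or False for extensions in the table, None when we cannot tell."""
    ext = os.path.splitext(filename)[1].lower()
    expected = SIGNATURES.get(ext)
    if expected is None:
        return None  # no extension, or one this sketch knows nothing about
    return header.startswith(expected)


# A GIF that somebody renamed to .jpg: the name says one thing, the bytes another.
print(extension_matches_content("holiday.jpg", b"GIF89a\x01\x00\x01\x00"))  # False
```

Note the third outcome: for a file with no extension at all, the only honest answer is "can't tell from the name", which is rather the point.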

In fact, it's my firm opinion that all IT professionals should change their windowing environment's settings to force the showing of filename extensions.

Footnotes:

[1] No it's not, but I've no idea how this misunderstanding came about!