Blog Entry - 23rd December 2008 - Programming - JavaScript

JavaScript and The Local File System - Part 2 - Paths


(All examples are Windows only. I don't have a Mac or Linux.)

Prologue

Before I get started, I have been interested to see that the various non-browser RIA technologies are really starting to show some power.

Shine Draw demonstrates how Flash (Flex/Air) and Silverlight are neck and neck in terms of their functionality, with Flash showing some strengths (faster start-up, smoothness, anti-aliasing, fonts) and Silverlight showing others (a potentially better developer experience, a better animation model, faster in a number of respects).

Then JavaFX has recently formally entered the fray, showing that it too will pack a punch (albeit still noticeably a little more clunky/slow to start up).

Beyond that, there are efforts to let you use HTML/JavaScript as a stand-alone application, such as Titanium and XNeat Application Builder (the latter of which works very well), in addition to Air, of course.

Given all of this, and given the potential power of Flash/Silverlight/JavaFX, particularly as they start to solve those final nagging problems that prevent seamless cross-browser, cross-operating system use, it does make one wonder a lot about how these technologies and the browser-based native technology (HTML/JavaScript) for RIAs will co-exist in the future, and what ground they will take from the existing RIAs.

Introduction

In this part 2 of my series "JavaScript and the Local File System", I am going to discuss a little about paths and provide some code that I use for manipulation of paths.

Anyone reading this is more than likely to be much more knowledgeable on the subject than I, so these notes are really a reminder for me.

I have to say at the outset that I have only ever used Windows (hey, I'm a PC), and don't have access to an Apple Mac or Linux (although I could do a virtual Linux install).

I don't have any bias against any system, as, frankly, they all do what I need anyway.

However, all of my demonstrations have only been tested in Windows. Sorry!

Anyway...

File System Concepts

These are the basic constituents of a file system, as I understand it:-

File System

This can mean either:-

  • The sum total of all different forms of file storage available to a computer.
  • The structure of folders and files on a given drive or volume.

Drive (aka Volume)

Refers to either:-

  • a physical storage medium in its entirety such as a disk, or flash memory.
  • if that physical storage has been logically separated (partitioned), each such partition.

Windows uses the word Drive, whereas Unix-based systems (Linux, Mac) use the word Volume.

A drive or volume is normally one of the following, or a partition on one of the following:-

  • Fixed : a fixed disk located inside your computer
  • Removable : removable storage, such as a memory stick, external hard drive, floppy disk(!)
  • Network : a fixed disk accessible on a remote computer through your network
  • CD-ROM (DVD, Blu-Ray) : an optical disk
  • Ram Disk : part of your computer's memory logically treated as a drive

Folder (aka Directory)

A folder within a drive, which may have parent folders, in which files are stored.

Windows uses the word "Folder" (and sometimes "Directory"); Unix-based systems use the word "Directory".

Root Folder

For Windows, each Drive will have a "root folder" which "contains" all other folders and files.

For Unix systems, there is one combined file system, which covers all volumes attached from time to time. The root folder is always /.

File

A collection of bytes making up a text or binary file.

What is a path?

Fundamental to using your computer system's file system is understanding the concept of a PATH.

Generally a path is a string that uniquely locates something in your computer's file system (such as a disk, file or folder).

Some examples:-

Windows

A file named File.txt in Sub-Folder on the C: drive.

C:\Sub-Folder\File.txt

A file named File.txt in Sub-Folder in Shared Folder on the remote server called Remote Server Name.

\\Remote Server Name\Shared Folder\Sub Folder\File.txt

Mac Classic

Instead of using letters, I think Mac uses names throughout:-

Disk Name:Sub-Folder:File.txt   -- (Local Drive)
Network Disk Name:Shared Folder:Sub-Folder:File.txt -- (Network Drive)

Mac Unix Style

/Sub-Folder/File.txt -- (Boot Disk)
/Volumes/Disk Name/Sub-Folder/File.txt -- (Another Local Hard Disk)
/Network/Server Disk Name/Shared Folder/Sub-Folder/File.txt -- (Network Drive)

Linux Unix Style

/Sub-Folder/File.txt -- (Boot Drive)
/mnt/Disk Name/Sub-Folder/File.txt -- (Another local Hard Disk or Partition)
/net/Server Disk Name/Shared Folder/Sub-Folder/File.txt -- (Network Drive)

The file: scheme

When files are referred to in an internet setting, using a URL, the path is normally preceded by file:///. I will not be considering this type of path specification.

Absolute and Relative Paths

Path strings can be absolute or relative.

  • Absolute means that it is a complete specification of the source of the file, starting with some root drive or volume point.
  • Relative means that the path is relative to some other path.

For instance in Windows:-

C:\Sub-Folder\Sub-Sub-Folder\File.txt  -- Absolute

Sub-Sub-Folder\File.txt  -- Relative to something else

In most systems relative paths are resolved against some "current" folder/directory, which the system or the running program keeps track of.

In all of my considerations here, I am only going to consider absolute paths.

There is a whole lot of detail to do with relative paths which I am going to ignore.

Names

Files and Folders

The rules about the names you use in your paths (drive, folder, and file names) are complex and can be found here: filenames.

In general, it seems appropriate to observe the following rules in all systems, for consistency:-

  • Don't use the characters |\?*<":>+[]/
  • Don't use the characters 0x00 - 0x1F (US ASCII) - the control codes.
  • Keep your paths (drive + folders + file name) to 259 characters maximum. This means that if you have a deep folder structure, try to avoid file names that are very long.
  • Treat file names as case insensitive.

In fact Unix is more liberal than this, so I view this as the Windows lowest common denominator.
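The rules above can be sketched as a few small checks. This is only an illustration of the list, not code from classAbsolutePath; the function names (isPortableName, isPortablePathLength, isSameName) are made up for this example.

```javascript
// Hypothetical helpers illustrating the naming rules listed above.
var FORBIDDEN_CHARS = /[|\\?*<":>+\[\]\/]/;  // the characters | \ ? * < " : > + [ ] /
var CONTROL_CODES = /[\x00-\x1F]/;           // US ASCII control codes
var MAX_PATH_LENGTH = 259;                   // drive + folders + file name

// A single drive, folder or file name (not a whole path).
function isPortableName(name) {
    return name.length > 0 &&
        !FORBIDDEN_CHARS.test(name) &&
        !CONTROL_CODES.test(name);
}

// A complete path, checked for length only.
function isPortablePathLength(path) {
    return path.length <= MAX_PATH_LENGTH;
}

// Treat names as case insensitive when comparing them.
function isSameName(a, b) {
    return a.toLowerCase() === b.toLowerCase();
}
```

Note that the forbidden characters include the separators and the colon, so these checks apply to individual names, never to a whole path.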

Files

Files are identified by:-

  • a name, followed by
  • an optional three or four letter extension, to identify its type (e.g. .doc, .txt, .html)

This extension is optionally hidden by the system.

This convention is observed by the main operating systems, I believe.
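As a rough sketch of the name-plus-extension convention, the following splits a file name at its last dot. The function name splitFileName is hypothetical, and I am assuming two things the convention itself does not settle: a name that starts with its only dot (such as .htaccess) is treated as having no extension, and the length of the extension is not checked.

```javascript
// Hypothetical helper: split "My File.txt" into a base name and an extension.
function splitFileName(fileName) {
    var dot = fileName.lastIndexOf(".");
    // No dot at all, or only a leading dot: treat the whole name as the base name.
    if (dot <= 0) {
        return { baseName: fileName, extension: "" };
    }
    return {
        baseName: fileName.substring(0, dot),
        extension: fileName.substring(dot)  // includes the dot, e.g. ".txt"
    };
}
```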

Windows Paths

Now a little more detail about Windows paths and file names.

Drives

A drive is normally identified by a letter, followed by a colon.

C:

The reason for the colon is to avoid ambiguity, as a colon is not a valid character in a file or folder name.

If I had:-

C

This might be a relative path referring to the folder named C in whatever current folder the system uses for relative paths.

E.g. if the current folder were A:\Julian then a path C might refer to A:\Julian\C.

Root Folder

The root folder is identified by the drive letter and then a separator which is normally \ but can also be /.

C:\

Sub-Folders

Sub-folders are identified by a name.

C:\My Sub-Folder

Sub-folders of sub-folders are again separated by a separator which is normally \ but can also be /.

C:\My Sub-Folder\My Sub-Sub-Folder\My Sub-Sub-Sub-Folder

So in effect the separator, \ or /, represents the concept of "what is inside".

Files

File locations are identified by the drive letter, folder path, and file name.

C:\My Sub-Folder\My File.txt

Of course, if your file does not have a three letter extension, then it might be confused with a folder.

C:\My Sub-Folder\My File

The computer will treat it as a folder if you put another separator after.

C:\My Sub-Folder\My File Actually A Folder\
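To make the pieces concrete, here is a minimal sketch that breaks a Windows drive path into its drive and its list of names. The function name splitWindowsPath is hypothetical, and it assumes an absolute path of the C:\... form; it accepts either separator and ignores a trailing one. It cannot tell you whether the last name is a file or a folder.

```javascript
// Hypothetical helper: "C:\My Sub-Folder\My File.txt" -> drive "C:" plus a list of names.
function splitWindowsPath(path) {
    // Drive letter, colon, then a separator (\ or /), then the rest.
    var match = /^([A-Za-z]):[\\\/](.*)$/.exec(path);
    if (!match) {
        throw new Error("Not an absolute Windows drive path: " + path);
    }
    var pieces = match[2].split(/[\\\/]+/);
    // Drop empty pieces, which appear for the root folder and for a trailing separator.
    var names = [];
    for (var i = 0; i < pieces.length; i++) {
        if (pieces[i] !== "") {
            names.push(pieces[i]);
        }
    }
    return { drive: match[1].toUpperCase() + ":", names: names };
}
```

An empty list of names means the path is the root folder of the drive.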

Network Drives

A shared drive which is on a computer elsewhere on the network, or even a specific folder on that shared drive, is primarily identified by a UNC path, which starts with a double back-slash \\:-

\\server\share\a file in shared folder.txt

This can be mapped onto a drive letter in the computer's file system, and treated as if it were a drive on your computer, accessible from the list of drives, rather than from the list of network places, e.g.:-

J:\a file in shared folder.txt
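A small sketch of how a UNC path breaks down, and how the same file would look once the share is mapped to a drive letter. Both function names (splitUncPath, mapToDrive) are hypothetical, and I am assuming the backslash form only.

```javascript
// Hypothetical helper: split \\server\share\rest into its three parts.
function splitUncPath(path) {
    // Two backslashes, server name, one backslash, share name, then an optional rest.
    var match = /^\\\\([^\\]+)\\([^\\]+)(?:\\(.*))?$/.exec(path);
    if (!match) {
        throw new Error("Not a UNC path: " + path);
    }
    return {
        server: match[1],
        share: match[2],
        rest: match[3] || ""  // path inside the share; "" for the share itself
    };
}

// Hypothetical helper: show the path as if the share were mapped to a drive letter.
function mapToDrive(uncPath, driveLetter) {
    return driveLetter + ":\\" + splitUncPath(uncPath).rest;
}
```

With the share above mapped to J, mapToDrive gives the J:\ form shown in the example.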

Unix / Linux

Unix does not have the concept of drive letters; instead, drives (volumes) are referred to by assigned name.

In addition, instead of each volume having its own absolute path, the path system works thus:-

  • All absolute paths start with /. This refers to the root folder of the volume which is the boot volume.
  • A file in a sub folder could be /sub-folder/file.txt

If you have other volumes attached to your system, such as a partitioned hard disk, or memory stick, or a shared network drive, then their file systems are accessed through a directory in the root file system.

For instance, there are (in Linux) two special directories, mnt (for additional local drives) and net (for network drives), accessed thus:-

  • /mnt/volume-name/....file system of that volume
  • /net/volume-name/....file system of that volume

Similarly for Apple's UNIX-based OSX.

Which directory the new volume's file system is attached to can vary, so for any path it is not possible, from inspection of that path alone, to determine what is the volume name.

Mac

Mac had a historical convention, and now follows a Unix convention as well, as OSX is Unix underneath.

You are advised to avoid both ":" and / in your file names.

Classic

The classic convention used : as the separator, and each volume had its own absolute path by name (similar to drive letters for Windows).

drive-name:sub-folder:file.txt
network-volume-name:sub-folder:file.txt

OSX Unix

Underneath it all is a full Unix-style path (and the above are mapped onto this):-

/sub-folder/file.txt  -- on the root drive
/Volumes/volume-name/sub-folder/file.txt -- another mounted volume
/Network/volume-name/sub-folder/file.txt -- a networked volume

You will note that instead of mnt and net we have Volumes and Network.
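Since the volume name cannot be read reliably from a Unix-style path, the best you can do is guess from the conventional directories. A minimal sketch of such a guess follows; the function name guessVolumeName is hypothetical, and the list of directories is just the four mentioned here (mnt, net, Volumes, Network). A volume mounted anywhere else will be missed.

```javascript
// Conventional mount directories only; a real system may mount volumes elsewhere.
var MOUNT_DIRECTORIES = ["mnt", "net", "Volumes", "Network"];

// Hypothetical helper: guess the volume name, or return null for "probably the boot volume".
function guessVolumeName(path) {
    if (path.charAt(0) !== "/") {
        throw new Error("Not an absolute Unix-style path: " + path);
    }
    var names = path.split("/");  // names[0] is "" (the text before the leading /)
    for (var i = 0; i < MOUNT_DIRECTORIES.length; i++) {
        if (names[1] === MOUNT_DIRECTORIES[i] && names.length > 2 && names[2] !== "") {
            return names[2];
        }
    }
    return null;  // only a guess: nothing in the path says otherwise
}
```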

A useful quote:-

In the Mac OS tradition, each volume is thought of as an independent entity; it shows up on the desktop as a separate icon, which contains everything on that volume.

The unix tradition, on the other hand, does not think of volumes (also known as "filesystems" in unix-speak) as independent entities in this way. Whichever volume the computer booted from is the main (or "root") filesystem, and all other mounted volumes show up like folders ("directories" in unix-speak) somewhere inside the root filesystem (or even inside each other). Usually, they're mounted in a standard location (unix traditionally uses a directory named "mnt" for this; in Mac OS X it's named "Volumes" instead), but not always.

Source

Apple's own guidance on this area can be difficult to follow.

A generalised path manager

What I need is some code to help me manage these various path options in a simple way, a bit similar to the Java File object.

Why do I need it? Well mainly because I want something to help me inspect path strings, and extract their components. I have an ambition to write a simple file synchroniser.

The following is what I came up with:-

Class Name

classAbsolutePath

Version

1.0 - December 2008

Author

Julian Turner, Derby, UK - December 2008

Source Code

classAbsolutePath.js

classAbsolutePath.js (outline view)

classAbsolutePath.js (dynamic outline view)

Test Page

Test Page

Constructor

classAbsolutePath(path : String [, type : int, separator : String])

path : String

The full path, e.g. C:\One\Two.txt or /One/Two.txt or \\server\share\One\Two.txt

type : int [OPTIONAL]

One of three static type properties:-

classAbsolutePath.TYPE_WIN
classAbsolutePath.TYPE_WIN_UNC
classAbsolutePath.TYPE_UNIX

If you omit, then the class will make a decent attempt to guess.

separator : String [OPTIONAL]

The \ or / character.

If you omit, then the class will assume \ for windows and / for unix-like paths.

Generally, it is pretty safe to omit.
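To give a feel for what guessing the type and separator might involve, here is a minimal sketch. It is not the code inside classAbsolutePath: the function names (guessType, defaultSeparator) are hypothetical, and the three constants are stand-ins, since the actual values of the static type properties are not shown in this post.

```javascript
// Stand-in values; the real static properties of the class may differ.
var TYPE_WIN = 1, TYPE_WIN_UNC = 2, TYPE_UNIX = 3;

// Hypothetical helper: guess the path type from how the path starts.
function guessType(path) {
    if (/^\\\\[^\\]/.test(path)) { return TYPE_WIN_UNC; }  // \\server\share\...
    if (/^[A-Za-z]:[\\\/]/.test(path)) { return TYPE_WIN; } // C:\... or C:/...
    if (path.charAt(0) === "/") { return TYPE_UNIX; }       // /...
    throw new TypeError("Cannot determine the type of path: " + path);
}

// Hypothetical helper: \ for both Windows types, / for Unix-like paths.
function defaultSeparator(type) {
    return type === TYPE_UNIX ? "/" : "\\";
}
```

A relative path such as One\Two.txt matches none of the three patterns, so the sketch throws rather than guessing.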

Errors

The class throws the following types of errors for all or almost all methods:-

TypeError

It cannot determine what type of path it is dealing with.

ParseError

It cannot parse a path or name.

SeparatorError

You try to specify an unrecognised separator.

ParseError is the most likely.

Static Methods

isValidPath(path : String, type : int) : Boolean

Is the supplied path valid for the given type?

isValidName(name : String, type : int) : Boolean

Is the supplied name valid for the given type?

Instance Methods

appendNewTarget(name : String)

Assumes that the current path is a path to a folder, and appends this name with a separator.

getTargetName() : String

The name of the file or folder pointed to, or "ROOT_FOLDER" for the root file or folder.

getTargetBaseName() : String

The name of the file or folder before any extension, or "ROOT_FOLDER" for the root file or folder.

getTargetExtension() : String

The extension (e.g. ".doc"), or an empty string if there is none.

getParentFolder() : classAbsolutePath

The path of the parent folder, or null.

getParentFolderName() : String

The target name of the parent folder or "ROOT_FOLDER" or "" if you call this on the root folder.

getParentFolders() : Array.<classAbsolutePath>

The paths of all parent folders. 0 = root folder.

getParentFolderNames() : Array.<String>

The names of all parent folders. 0 = "ROOT_FOLDER".

isRootFolder() : Boolean

Is the path the root folder?

getRootFolder() : classAbsolutePath

Returns the path of the root folder for this path.

toString(appendSeparator : Boolean) : String

Return the path.

Use appendSeparator if the path is to a named folder, and you want to add a separator to it.

The root folder always has a separator.
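The idea behind a method like getParentFolders() can be sketched with plain strings. This is an illustration only, not the class's implementation: the function name parentFolderPaths is hypothetical, and it assumes an absolute drive-letter or Unix-style path (no UNC), with no trailing separator, using the separator you pass in.

```javascript
// Hypothetical helper: list the paths of all parent folders, index 0 = root folder.
function parentFolderPaths(path, separator) {
    // Root is "C:\" style for a drive path, otherwise just the separator ("/").
    var driveMatch = /^[A-Za-z]:/.exec(path);
    var root = (driveMatch ? driveMatch[0] : "") + separator;
    var rest = path.substring(root.length);
    if (rest === "") {
        return [];  // the root folder itself has no parents
    }
    var names = rest.split(separator);
    var parents = [root];
    var current = root;
    // Every name except the last (the target itself) adds one parent folder.
    for (var i = 0; i < names.length - 1; i++) {
        current += (i > 0 ? separator : "") + names[i];
        parents.push(current);
    }
    return parents;
}
```

As in the method descriptions above, the root folder keeps its separator while every other folder is returned without one.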

Points to Note

  • It only handles WINDOWS, WINDOWS_UNC, and UNIX-style paths. It does not parse the ":" classic mac path.
  • It only accepts absolute (fully specified) paths
  • The root folder is always returned with a terminating separator.
  • It does not determine whether the path points to a directory or file. That is up to you, the implementer, to know already. A given path, such as C:\one\two.3 is always ambiguous as to whether two.3 is a file or a folder.
  • The code is a long way from being quality production code, but it works for me.
  • It will strip out any terminating separator at the end of the path, i.e. C:\one\two\ -> C:\one\two, unless this signifies the root folder for the volume.
  • For Mac and Unix systems, it cannot work out mount names, but it can try. This is because mounted drives can appear anywhere in the file structure, although there are some conventional places to look.
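The trailing-separator rule in the list above can be sketched as follows. The function name stripTrailingSeparator is hypothetical, and this version assumes drive-letter and Unix-style paths only; it makes no special allowance for the root of a UNC share.

```javascript
// Hypothetical helper: remove a terminating separator unless the path is a root folder.
function stripTrailingSeparator(path) {
    // Root folders: "C:\", "C:/" or "/" keep their separator.
    var isRoot = /^([A-Za-z]:)?[\\\/]$/.test(path);
    if (isRoot) {
        return path;
    }
    return path.replace(/[\\\/]+$/, "");
}
```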

Close

See you next time, when I will start looking at the Scripting.FileSystemObject.

