Hard link
In computing, a hard link is a directory entry that associates a name with a file on a file system. All directory-based file systems must have at least one hard link giving the original name for each file. The term “hard link” is usually only used in file systems that allow more than one hard link for the same file.
Creating an additional hard link has the effect of giving one file multiple names (e.g. different names in different directories) all of which independently connect to the same data on the disk, none of which depends on any of the others.[1] This causes an alias effect: e.g. if the file is opened by any one of its names, and changes are made to its content, then these changes will also be visible when the file is opened by an alternative name. By contrast, a soft link or “shortcut” to a file is not a direct link to the data itself, but rather is a short file that contains the text of a file name, or a location that gives direct access to yet another file name within some directory. The name contained in or referred to by the soft link may either be a hard link or another soft link. This also creates aliasing, but in a different way.
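The difference between the two kinds of aliasing can be sketched in a few lines of Python. This is only an illustration, assuming a POSIX-like system where os.link and os.symlink are both permitted; the file names original.txt, hard.txt and soft.txt are hypothetical.

```python
import os
import tempfile


def demonstrate_aliasing(directory):
    """Compare a hard link and a soft link to the same file."""
    original = os.path.join(directory, "original.txt")  # hypothetical names
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")

    with open(original, "w") as f:
        f.write("first version\n")

    os.link(original, hard)     # second name for the same data
    os.symlink(original, soft)  # small file holding the *name* of original

    # A change made through one hard link is visible through the other.
    with open(hard, "a") as f:
        f.write("appended via hard link\n")
    with open(original) as f:
        seen_via_original = f.read()

    # Removing one name leaves the other hard link fully usable,
    # while the soft link now refers to a name that no longer exists.
    os.unlink(original)
    with open(hard) as f:
        seen_via_hard = f.read()
    soft_resolves = os.path.exists(soft)  # follows the soft link

    return seen_via_original, seen_via_hard, soft_resolves


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demonstrate_aliasing(tmp))
```

After the original name is removed, the hard link still returns the full content, whereas the soft link no longer resolves.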
Every directory is itself a file, only special because it contains a list of file names maintained by the file system. Since directories themselves are files, multiple hard links to directories are possible, which could create a circular directory structure, rather than a branching structure like a tree. For that reason, the creation of hard links to directories is sometimes forbidden.
Multiple hard links – that is, multiple directory entries to the same file – are supported by POSIX-compliant and partially POSIX-compliant operating systems, such as Linux, Android, macOS, and also Windows NT4[2] and later Windows NT operating systems.
Support also depends on the type of file system being used. For instance, the NTFS file system supports multiple hard links, while FAT and ReFS do not.
Usage
On POSIX-compliant and partially POSIX-compliant operating systems, such as all Unix-like systems, additional hard links to existing files are created with the link() system call, or the ln and link command-line utilities. The stat command can reveal how many hard links point to a given file. The link count is also included in the output of ls -l.
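As a rough programmatic counterpart to ln and stat, the sketch below uses Python's os.link and os.stat, which wrap the underlying system calls on POSIX systems. The file names data.txt, link0.txt and link1.txt are hypothetical, and the example assumes a file system that supports hard links.

```python
import os
import tempfile


def link_counts(directory):
    """Create and remove hard links, recording st_nlink after each step."""
    path = os.path.join(directory, "data.txt")  # hypothetical name
    with open(path, "w") as f:
        f.write("payload\n")

    counts = [os.stat(path).st_nlink]  # a new file has one link
    for i in range(2):
        os.link(path, os.path.join(directory, f"link{i}.txt"))
        counts.append(os.stat(path).st_nlink)

    os.unlink(os.path.join(directory, "link0.txt"))
    counts.append(os.stat(path).st_nlink)
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(link_counts(tmp))  # [1, 2, 3, 2]
```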
On Microsoft Windows, only NTFS implements hard links.[3] Hard links have been supported since Windows NT 3.1, although the CreateHardLink() API function, which creates a hard link by giving a new filename to a Master File Table entry (analogous to an inode), has been available only since Windows 2000. The usual DeleteFile() can be used to remove them. To create a hard link from the command line, one can use the mklink /H command on Windows NT 6.0 and later systems (such as Windows Vista), and fsutil.exe hardlink create on earlier systems (Windows XP, Windows Server 2003).[4] Starting with Windows Vista, hard links are used by the Windows Component Store (WinSxS) to keep track of different versions of DLLs stored on the hard disk drive. Unix-like emulation or compatibility software running on Windows, such as Cygwin and Subsystem for UNIX-based Applications, allows the use of POSIX interfaces under Windows.
OpenVMS supports hard links on the ODS-5 file system.[5] Unlike Unix, VMS can create hard links to directories.
The process of unlinking dissociates a name from the data on the volume without destroying the associated data. The data is still accessible, as long as at least one link that points to it still exists. When the last link is removed, the space is considered free.[6]
A process called undeleting allows the recreation of links to data that are no longer associated with a name. However, this process is not available on all systems and is often not reliable. When a file is deleted, its space is added to a free space map for re-use. If a portion of that space is claimed by new data, undeletion will be unsuccessful because some or all of the previous data will have been overwritten; the attempt may also result in cross-linking with the new data, leading to filesystem corruption. Additionally, deleted files on solid state drives may be erased at any time by the storage device for reclamation as free space.
Link counter
Most file systems that support hard links use reference counting. An integer value is stored with each physical data section. This integer represents the total number of hard links that have been created to point to the data. When a new link is created, this value is increased by one. When a link is removed, the value is decreased by one. If the link count becomes zero, the operating system usually automatically deallocates the data space of the file if no process has the file opened for access, but it may choose not to do so immediately, either for performance or to enable the undelete command.
The maintenance of this value guarantees that there will be no dangling hard links pointing nowhere (which can and does happen with symbolic links) and that a file and its associated inode will be preserved as long as a single hard link (directory reference) points to it or any process keeps the file open. This relieves the programmer or user of the burden of this accounting. It is also a simple method for the file system to track the use of a given area of storage, as zero values indicate free space and nonzero values indicate used space.
On POSIX-compliant operating systems, such as many Unix-variants, the reference count for a file or directory is returned by the stat() or fstat() system calls in the st_nlink field of struct stat.
Example
In the figure to the right, two hard links, named "LINK A.TXT" and "LINK B.TXT", point to the same physical data.
If the file "LINK A.TXT" is opened in an editor, modified and saved, then those changes will be visible if the file "LINK B.TXT" is then opened for viewing, since both filenames point to the same data. (The word "opened" matters because, on POSIX systems, an associated file descriptor remains valid after opening, even when the original file is moved.) The same is true if the file is opened as "LINK B.TXT", or under any other name associated with the data.
Some editors, such as emacs, break the hard link concept, however. When opening a file "LINK B.TXT" for editing, emacs first renames "LINK B.TXT" to "LINK B.TXT~", loads "LINK B.TXT~" into the editor, and saves the modified contents to a newly created "LINK B.TXT". Using this approach, the two hard links are now "LINK A.TXT" and "LINK B.TXT~" (the backup file); "LINK B.TXT" now has just one link and no longer shares the same data as "LINK A.TXT". (This behavior can be changed using the emacs variable backup-by-copying.)
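The effect of the two saving strategies can be reproduced with a short Python sketch. It is not emacs's actual implementation; save_in_place and save_by_renaming are hypothetical helpers that mimic overwriting a file versus the rename-then-rewrite approach described above.

```python
import os
import tempfile


def save_in_place(path, text):
    """Overwrite the existing file; every hard link sees the new content."""
    with open(path, "w") as f:  # truncates and reuses the existing file
        f.write(text)


def save_by_renaming(path, text):
    """Rename the old file to a backup name, then write a brand-new file."""
    os.replace(path, path + "~")  # the old data is now named 'path~'
    with open(path, "w") as f:    # a new, unrelated file takes the old name
        f.write(text)


def demo(directory):
    a = os.path.join(directory, "LINK A.TXT")
    b = os.path.join(directory, "LINK B.TXT")
    with open(a, "w") as f:
        f.write("v1\n")
    os.link(a, b)

    save_in_place(b, "v2\n")
    shared_after_in_place = os.path.samefile(a, b)

    save_by_renaming(b, "v3\n")
    shared_after_rename = os.path.samefile(a, b)
    backup_shares_with_a = os.path.samefile(a, b + "~")

    with open(a) as f:
        a_text = f.read()
    return shared_after_in_place, shared_after_rename, backup_shares_with_a, a_text


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(tmp))  # (True, False, True, 'v2\n')
```

After the rename-based save, "LINK A.TXT" still holds the content from the in-place save and shares its data only with the backup file.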
Any number of hard links to the physical data may be created. To access the data, a user only needs to specify the name of any existing link; the operating system will resolve the location of the actual data.
If one of the links is removed with the POSIX unlink function (for example, with the UNIX rm command), then the data are still accessible through any other link that remains. If all of the links are removed and no process has the file open, then the space occupied by the data is freed, allowing it to be reused in the future. These semantics allow open files to be deleted without affecting the process that uses them. This technique is commonly used to ensure that temporary files are deleted automatically on program termination, including the case of abnormal termination.
Limitations of hard links
To prevent loops in the filesystem, and to keep the interpretation of .. (parent directory) consistent, many modern operating systems do not allow hard links to directories. UNIX System V allowed them, but only the superuser had permission to make such links.[7] Mac OS X v10.5 (Leopard) and newer use hard links on directories for the Time Machine backup mechanism only.[8] Symbolic links and NTFS junction points are generally used instead for this purpose.
Hard links can be created to files only on the same volume. If a link to a file on a different volume is needed, it may be created with a symbolic link.
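A program that wants a hard link where possible and a symbolic link otherwise might handle the same-volume restriction roughly as below. This is a sketch under POSIX assumptions, where a cross-volume link() fails with the error code EXDEV; link_or_symlink is a hypothetical helper, not a standard function.

```python
import errno
import os
import tempfile


def link_or_symlink(source, link_name):
    """Create a hard link, falling back to a symbolic link across volumes."""
    try:
        os.link(source, link_name)
        return "hard"
    except OSError as exc:
        if exc.errno != errno.EXDEV:  # only handle the cross-volume case
            raise
    # A symbolic link stores a path, so it may point to another volume.
    os.symlink(os.path.abspath(source), link_name)
    return "symbolic"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.txt")  # hypothetical names
        with open(src, "w") as f:
            f.write("data\n")
        print(link_or_symlink(src, os.path.join(tmp, "alias.txt")))
```

Within a single directory the hard link succeeds; the fallback branch is only reached when the two paths lie on different volumes.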
The maximum number of hard links to a single file is limited by the size of the reference counter. On Unix-like systems the counter is usually machine-word-sized (32- or 64-bit: 4,294,967,295 or 18,446,744,073,709,551,615 links, respectively), though in some filesystems the number of hard links is limited more strictly by the on-disk format. As of Linux 3.11, the ext4 filesystem limits the number of hard links on a file to 65,000.[9] On Windows, NTFS limits a file to 1024 hard links.[10]
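On Unix-like systems, a program can ask for the limit that applies to a particular path through the POSIX pathconf interface, exposed in Python as os.pathconf. The sketch below assumes such a system; max_links is a hypothetical helper, and the value reported by the C library may be a generic default rather than the exact on-disk limit of the file system.

```python
import os


def max_links(path="."):
    """Return the link-count limit reported for the given path."""
    # "PC_LINK_MAX" is the POSIX name for the maximum file link count.
    return os.pathconf(path, "PC_LINK_MAX")


if __name__ == "__main__":
    print(max_links("."))
```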
Hard links were criticized as a "high-maintenance design" by Neil Brown in Linux Weekly News, since they complicate the design of programs that handle directory trees, including archivers and disk usage tools, such as du, which must take care to de-duplicate files that are linked multiple times in a hierarchy. Brown also calls attention to the fact that Plan 9 from Bell Labs, the intended successor to Unix, does not include the concept of a hard link.[11]
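The de-duplication that tools such as du must perform can be sketched by remembering each (device, inode) pair already counted. This is a simplified illustration, not how du is implemented: disk_usage is a hypothetical function, and it sums apparent file sizes (st_size) rather than allocated blocks.

```python
import os
import tempfile


def disk_usage(root, dedupe=True):
    """Sum file sizes under root, optionally counting each file only once."""
    seen = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)  # identifies the file, not the name
            if dedupe and key in seen:
                continue                  # another hard link to a counted file
            seen.add(key)
            total += st.st_size
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub"))
        first = os.path.join(tmp, "a.bin")  # hypothetical names
        with open(first, "wb") as f:
            f.write(b"x" * 1000)
        os.link(first, os.path.join(tmp, "sub", "b.bin"))
        print(disk_usage(tmp), disk_usage(tmp, dedupe=False))  # 1000 2000
```

Without the set of seen pairs, the same 1000 bytes are counted once for every name that refers to them.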
See also
- Fat link
- Symbolic link or soft link, which, unlike a hard link, only provides the text of an “actual” file name, not the file data itself.
- NTFS symbolic link – the NTFS implementation.
- NTFS junction point – similar to an NTFS directory symbolic link, but not the same.
- alias (Mac OS) – a method for linking files introduced in classic Mac OS System 7, and still available in macOS, which is in some ways similar to a symbolic link. Note that true symbolic links are also available in macOS.
- shadow (OS/2) – the OS/2 implementation
- ln (Unix) – The ln command, which is used to create new links on Unix-like systems.
- freedup – The freedup command frees up disk space by replacing duplicate data stores with automatically generated hard links.
Notes
- Pitcher, Lew. "Q & A: The difference between hard and soft links".
- "Link Shell Extension".
- "How hard links work".
- "NTFS Hard Links, Directory Junctions, and Windows Shortcuts". flexhex.com.
- "OpenVMS System Manager's Manual, Vol. I" (PDF). VSI. August 2019. Retrieved 2021-01-23.
- "AllDup - Duplicate File Finder Software (Freeware)".
- Bach, Maurice J. (1986). The Design of the UNIX Operating System. Prentice Hall. p. 128.
- Pond, James (August 31, 2013). "How Time Machine Works its Magic". File System Event Store, Hard Links. Retrieved May 19, 2019.
- "Linux kernel source tree, fs/ext4/ext4.h, line 229".
- "MSDN - CreateHardLink function". Retrieved 14 January 2016.
- Neil Brown (23 November 2010). "Ghosts of Unix past, part 4: High-maintenance designs". Linux Weekly News. Retrieved 20 April 2014.