As I mentioned earlier, the MFT is a database. The records in this database contain attributes that describe elements of the file system such as files, directories, filenames, and so forth. These attributes are defined by the $AttrDef metadata record. Each attribute is given a code number that identifies it in the MFT records.
Following is a brief description of each attribute. If you would like to see this list in a production file system, the attributes are defined in the $AttrDef metadata record. You can't see this record from the operating system shell, but you can scan the NTFS volume with a hex editor and search for any of the attribute names. Remember that they are prefixed with a dollar sign ($).
The first part of every MFT record consists of header information that is not stored as an attribute. This information includes the MFT record number, an increment counter for tracking changes to the record, and the MFT record number of the directory that contains the file or folder represented by the record.
Contains the standard file attributes (Read-only, Hidden, System, and Archive) along with a set of timestamps. In addition, for NTFS 3.0 and later, $Standard_Information contains a pointer to a security descriptor in the $Secure metadata record. See the "Security Descriptor" section for more information.
Contains the name of the file or folder represented by the MFT record. If a record has a long name, a second $File_Name attribute is added containing the short, DOS-compatible name (unless short name generation has been disabled).
This attribute is no longer used. Starting with NTFS 3.0, the $Security_Descriptor attribute was replaced by entries in the $Secure metadata record. The $Standard_Information attribute in an MFT record contains an index number for the $Secure entry that represents the security descriptor for the MFT record.
This attribute stores what is commonly thought of as the contents of a file. An MFT record can have multiple $Data attributes. See the "Named Data Streams" section for more information.
$Index_Root, $Index_Allocation, and $Bitmap.
These attributes are used to index MFT attributes for quick access. They are primarily used by directory records to index filenames, but they are also used to index other attributes to support features such as link tracking and reparse points.
This attribute contains a pointer to a volume, folder record, or device. When the record containing this pointer is opened, the file system opens the target of the pointer instead. The $Reparse_Point attribute supports features such as mount points and Remote Storage Services.
This attribute is used by the Encrypting File System. See Chapter 17, "Managing File Encryption," for details.
$Ea and $Ea_Information.
These attributes were originally used to support the High-Performance File System (HPFS) used by the OS/2 subsystem. The OS/2 subsystem and HPFS are no longer supported by Windows 2000 or Windows Server 2003.
Before looking at the way NTFS 3.x deals with security descriptors, let's consider the handling of attributes in general.
NTFS attributes are classified by whether they reside completely in the MFT record (resident) or they sit somewhere else on the disk with a pointer in the MFT record (non-resident).
The $Standard_Information attribute in each MFT record contains four timestamps that show the following dates and times:
When you first create a file or folder, all four timestamps are set to the same value. The Record Creation timestamp remains at this original value.
When the $Data attribute of a file record or the $Index_Root attribute of a directory record are modified, the Data Modification timestamp is updated. When any other attribute in the MFT record is modified, the Attribute Modification timestamp is updated.
You would expect the Last Access timestamp to be updated each time the record is opened, but this would put too great a load on the file system. Imagine how many updates would be necessary each time you open Explorer! Instead, the Last Access timestamp is updated at most once in any one-hour period, regardless of the number of times a file is touched. By comparison, the last access time granularity for FAT32 is one day.
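A minimal sketch of this throttling rule, using invented structure and function names (the real file system tracks this internally; the sketch only illustrates the once-per-hour cutoff):

```python
UPDATE_INTERVAL = 60 * 60  # one hour, in seconds

def maybe_update_last_access(record, now):
    """Write the Last Access timestamp only if an hour has passed."""
    if now - record["last_access"] >= UPDATE_INTERVAL:
        record["last_access"] = now
        return True    # timestamp rewritten on disk
    return False       # access absorbed; no disk write

record = {"last_access": 0}
maybe_update_last_access(record, 10)     # 10 seconds later: no write
maybe_update_last_access(record, 4000)   # over an hour later: updated
```

No matter how many times the file is touched inside the hour, only the first access after the interval expires costs a disk write.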
You can improve file system performance by eliminating the update to the Last Access timestamp completely. This might create auditing concerns, so do not make this change without some consideration for security. The Registry entry is as follows:
Key: HKLM | System | CurrentControlSet | Control | FileSystem
Value: NtfsDisableLastAccessUpdate
Data: 1 (REG_DWORD)
Some attributes, like $Standard_Information and $File_Name, must always remain resident. The system constantly needs information in those attributes and cannot be bothered to scratch around on the hard drive to find them. Other attributes, such as $Data or $Index_Root, can easily be moved somewhere else on the disk when they get too big to store directly in the MFT record.
When an attribute becomes non-resident, there is the possibility of fragmentation. This presented a problem for earlier versions of NTFS, where large security descriptors would be made non-resident. This forced the file system to go to some other place on the hard drive to check the security descriptor before it could allow access to the file or folder, which slowed file system performance considerably.
Windows 2000 also introduced permission inheritance. This made it possible to change the security descriptor of a parent folder and have the change "flow" down the tree to all files and subfolders. If Microsoft had implemented permission inheritance with the old resident/nonresident security descriptor paradigm, file system performance would have been truly pathetic.
Instead, NTFS 3.0 introduced a new MFT metadata record called $Secure. This record is essentially a hidden database containing all the security descriptors for the file system plus a couple of indexes to help find entries quickly. Figure 15.8 shows the layout of this database.
Figure 15.8. Layout of the $Secure database.
The security descriptors themselves are stored in a data attribute called $SDS. Along with the security descriptor itself, the system stores these additional values in each $SDS entry:
A hash of the security descriptor to use as an index key.
A sequence number assigned to each security descriptor to act as an identifier.
The security descriptor size.
The offset of the security descriptor from the start of the $SDS data stream.
These last two entries are important because security descriptors vary in size, so they can't be found by a simple fixed-length record lookup; the size and offset identify each entry's exact position in the stream.
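Here is a toy model of how variable-sized entries can be located by offset and size; the field names are illustrative, not the actual on-disk layout of $SDS:

```python
# Sample descriptors; real ones are binary security descriptor blobs.
descriptors = [b"owner:admins", b"owner:users;acl:ro"]

stream = b""     # the $SDS-style data stream
entries = []     # one index entry per stored descriptor
for seq, sd in enumerate(descriptors, start=1):
    entries.append({
        "hash": hash(sd) & 0xFFFFFFFF,  # stand-in for NTFS's hash
        "seq": seq,                     # identifier used by MFT records
        "size": len(sd),
        "offset": len(stream),
    })
    stream += sd

def read_descriptor(seq):
    """Fetch a descriptor from the stream by its sequence number."""
    e = next(e for e in entries if e["seq"] == seq)
    return stream[e["offset"]:e["offset"] + e["size"]]
```

Without the per-entry offset and size, there would be no way to walk a stream of variable-length records directly to the one you need.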
Security Descriptor Links
The $Standard_Information attribute for each MFT record contains the sequence number of its assigned security descriptor. This makes it simple to implement inheritance. When a new file is created, NTFS links the MFT record to the same security descriptor as the other files in the folder by putting the same $SDS sequence number in the $Standard_Information attribute.
If you modify the security descriptor of a file or folder by adding entries to its access control list (ACL), or you select the Don't Inherit option in the ACL Editor to break the chain of inheritance, a new security descriptor is created in $SDS and the $Standard_Information attribute in the MFT record is updated to contain the sequence number assigned to the new security descriptor.
In practice, NTFS uses separate security descriptors for folders and files. So, if you do not make any changes to the security permissions for files or folders in an NTFS volume, there will be just two entries in $SDS: one for all the folders in the volume and one for all the files under those folders.
Security Descriptor Lookups
The $Secure record maintains two indexes named $SDH and $SII. These indexes are shown in Figure 15.8 and are described as follows:
The $SDH index helps to limit the number of security descriptors by recycling those that are already in $SDS. If you set the contents of a security descriptor for a file or folder to match those of another file or folder, including all the inherited permissions, the system uses the sequence number of the existing $SDS entry.
To make this trick work, the system needs a way to quickly scan for identical security descriptors. That is where the hash index in $SDH comes into play. The statistical likelihood of two security descriptors having an identical hash is vanishingly small, so the file system calculates the hash for a new security descriptor and then scans for a match in $SDH. If it finds one, it uses the sequence number in the index to update the $Standard_Information attribute in the MFT record.
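The hash-then-verify reuse scheme can be sketched as follows (a simplified model, with CRC-32 standing in for NTFS's own hash and invented container names):

```python
import zlib

sds = {}        # seq -> descriptor bytes (the $SDS-style store)
sdh = {}        # hash -> seq (the $SDH-style index)
next_seq = 1

def store_descriptor(descriptor: bytes) -> int:
    """Return the sequence number of an existing identical descriptor,
    or store the descriptor and index its hash."""
    global next_seq
    h = zlib.crc32(descriptor)                 # stand-in for NTFS's hash
    if h in sdh and sds[sdh[h]] == descriptor: # verify, don't trust the hash
        return sdh[h]                          # recycle the existing entry
    seq = next_seq
    next_seq += 1
    sds[seq] = descriptor
    sdh[h] = seq
    return seq
```

Two files given byte-identical security descriptors end up pointing at the same entry, which is how the volume keeps its descriptor set compact.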
What is the end result? An NTFS volume has a compact set of security descriptors that is fully cross-indexed to use for controlling access permissions to files and folders. This improves performance and simplifies inheritance.
Security Descriptor Highlights
You don't need to remember the details of the various components of the $Secure database. Here is a quick checklist of the operational requirements based on how the $Secure database works:
If you add ACE entries to the ACL of a file or folder while retaining inheritance, a new $SDS entry is created with a security descriptor that has an ACL reflecting the inherited and explicitly applied access control entries (ACEs).
If you move a file to another location on the same volume, the MFT record location is not changed and neither is the $SDS index number. Therefore, the file retains its old security settings plus inherits the new settings from its new folder.
If you copy a file to another location on the same volume, a new MFT record is created. This new record gets the same $SDS index number as its parent folder.
If you move a file to a different volume, once again, a new MFT record is created. The record gets the $SDS index number from its parent folder.
If you use xcopy /o to copy a file to a new location while retaining its security permissions, the result depends on whether the file had explicitly assigned entries in its security descriptor:
If the file had its own security descriptor in $SDS, the sequence number to that entry is copied to the new MFT record.
If the file had explicitly assigned ACEs along with inherited ACEs, the system scans to see if another $SDS entry has the same combination. If not, it creates a new $SDS entry and puts the sequence number in the new MFT record.
If the file had no explicitly assigned ACEs, the system uses the $SDS sequence number for the security descriptor used by the other files in the folder. (Remember that files use different security descriptor entries than folders.)
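The move and copy rules in this checklist can be condensed into a small decision function (the function name, parameters, and sample sequence numbers are invented for illustration):

```python
PARENT_SEQ = 7   # descriptor shared by files in the destination folder

def descriptor_after(operation, same_volume, file_seq, parent_seq=PARENT_SEQ):
    """Which $SDS sequence number does the file end up with?"""
    if operation == "move" and same_volume:
        return file_seq      # MFT record unchanged: keeps its descriptor
    return parent_seq        # copy, or cross-volume move: the new MFT
                             # record inherits from the destination folder
```

The dividing line is whether a new MFT record gets created: a same-volume move reuses the old record, so the old descriptor link survives.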
A file record is used to store data for access by the operating system or by user applications. A file record in the MFT also stores information about the file itself, such as the file's name, when it was created, how big it is, and whether it has any special attributes such as Read-Only, Hidden, or Compressed.
Figure 15.9 shows the layout of a simple file record. It has a header and three attributes: $Standard_Information, $File_Name, and $Data.
Figure 15.9. Attributes of a simple file record.
Some of the more important elements of the file header include the following:
MFT record number.
A sequential number assigned to the record that helps identify it. Indexes such as directories use this information to correlate a filename to an MFT record.
Record type flag.
Indicates whether this is a file record or a directory record. A value of 0 means the record has been flagged for deletion and can be overwritten; a value of 1 indicates a file and 2 indicates a directory.
Actual size and Allocated size.
The actual size is the true number of bytes in the file. The allocated size is the number of bytes in the clusters assigned to the file. The larger the clusters, the larger the potential difference between these two numbers.
Update Sequence Number, or USN.
This acts as a version number. Each time the file is modified, this number is incremented by one. If a file is deleted and the MFT record subsequently reused, this number is set to a starting value of 2. This indicates to the system that no references are being made to deleted files.
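The relationship between actual and allocated size described above is just a ceiling calculation:

```python
def allocated_size(actual_bytes, cluster_bytes):
    """Round the actual size up to a whole number of clusters."""
    clusters = -(-actual_bytes // cluster_bytes)   # ceiling division
    return clusters * cluster_bytes
```

A 1,000-byte file occupies 1,024 bytes on a volume with 512-byte clusters but 4,096 bytes on a volume with 4K clusters, which is why large clusters waste more slack space on small files.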
This MFT attribute contains a set of timestamps, a pointer to the security descriptor for the MFT record, and a flag that is commonly referred to as the file attributes. Here are the important file attribute flag settings:
Standard DOS file attributes: Read-Only, Hidden, System, and Archive
Not Indexed (Content Indexer flag)
Keep in mind as you work with files and folders that these flags are kept in a separate attribute from the actual contents of the file, which are stored in the $Data attribute. For instance, if you copy a file to a new location, you build a new MFT record with a new $Standard_Information attribute and therefore new file attributes. But if you move a file, the MFT record remains the same and so does the setting of the attribute flags in $Standard_Information.
Every MFT record has at least one $File_Name attribute. The filename in this attribute can be up to 255 characters long, a limitation imposed by the single-byte Length field for the name entry.
A file can have more than one filename. For instance, if a name does not meet DOS 8.3 naming standards, the file system generates a short name and places it in a second $File_Name attribute. This supports DOS/Windows 3.x clients on the network plus any DOS applications at the console. See the following "Hard Links" sidebar for another example of multiple filenames.
If you set any of the $Standard_Information attribute flags on a folder rather than on a file, you tell the file system to apply the flag to any new files created in that folder. Explorer generally offers you the option of applying the flag to existing files, as well.
In Windows Server 2003 and XP, Microsoft changed the way Explorer displays the attribute setting in a folder from the traditional binary model (on or off) to a tri-state model (on, off, or content status indicator).
The status indicator for an attribute tells you if any files in the folder have that particular attribute set. For example, if you have a read-only file in a folder, the Read-Only attribute checkbox for the folder would be filled with color. Don't confuse this information setting with a checkmark, which indicates that the flag has been set on the folder and all new records will get that attribute.
I'm a big fan of an obscure 1970's comedy ensemble called Firesign Theater. If you like twisted, cerebral humor, their albums are now available on CD. One of my favorites is titled "How Can You Be in Two Places at Once When You're Not Anywhere at All?"
I bring this up because it relates to a long-dormant feature of NTFS called hard links. This feature permits you to create additional filename attributes and index them in different directories. In essence, you "create" multiple copies of the same file in different folders.
Prior to Windows Server 2003, the only way to create hard links was to use programming APIs. Windows Server 2003 introduces a new utility called Fsutil that makes it very easy to create hard links. The syntax is
fsutil hardlink create <target_file> <source_file>
where target_file represents the new name for the file you're creating and source_file is the name of the file that you're linking to. This file must be on the same volume as the target file.
When you use this command to create a hard link, NTFS adds the new filename attribute to the directory specified by the <target_file> entry. It also increments a hard link counter in the file header. When you delete an instance of the hard linked file, you only delete the filename attribute and the index entry in the directory. You cannot delete the file until all hard link instances have been deleted.
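A sketch of the link-counting behavior, using an invented class rather than real MFT structures:

```python
class FileRecord:
    """Toy model of an MFT file record with multiple $File_Name links."""
    def __init__(self, name):
        self.names = {name}      # one $File_Name attribute per hard link
        self.deleted = False

    def add_link(self, name):
        self.names.add(name)     # fsutil hardlink create adds a name

    def delete_name(self, name):
        self.names.discard(name) # deleting one instance removes only
        if not self.names:       # the name; the record is freed when
            self.deleted = True  # the last link is gone
```

Deleting one name leaves the file's data intact; only removing the final name actually releases the record.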
Short Name Generation
DOS compatibility is still important and will remain so for the next few years. (I'm sure that by 2020, the last lawyer who insists on using WordPerfect 5.1 will have retired.) DOS support requires 8.3 filenames. All the file systems in Windows Server 2003 will generate short, DOS-compliant filenames automatically when a long name is assigned. You can see the short names from the command line using dir /x.
Creating a short, 8.3 filename to represent a much longer name is not as straightforward as you might think. The system cannot simply take the first eight characters of the long name and the first three characters of the extension and call it a job well done. What if several files start with the same eight letters? Or how would you differentiate between .HTM files and .shtml files?
Windows Server 2003 uses two algorithms to generate short filenames. Both preserve the first letters of the long name for alphabetical sorting. The algorithm changes when five or more files in the same directory start with the same letters. Here's the algorithm for fewer than five files:
Delete all Unicode characters that do not map to standard ANSI characters.
Remove spaces, internal periods, and other illegal DOS characters. The name Long.File.Name.Test becomes LongFileName.Test.
Keep the first three characters after the last period as an extension. LongFileName.Test becomes LongFileName.Tes.
Drop all characters after the first six. LongFileName.Tes becomes LongFi.Tes.
Append a tilde (~) followed by a sequence numeral to the filename to prevent duplicate filenames. LongFi.Tes becomes LongFi~1.Tes.
Finally, convert the name to uppercase. The final short form of Long.File.Name.Test is LONGFI~1.TES.
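Assuming plain ASCII input and a reduced set of illegal characters, the six steps can be sketched like this:

```python
def short_name(long_name, seq=1):
    """Generate an 8.3 name per the steps above (illustrative: ignores
    Unicode folding and the full DOS illegal-character set)."""
    base, dot, ext = long_name.rpartition('.')
    if not dot:                      # no extension at all
        base, ext = long_name, ''
    illegal = set(' .+,;=[]')        # spaces, internal periods, etc.
    base = ''.join(c for c in base if c not in illegal)
    ext = ext[:3]                    # first three characters of extension
    base = base[:6] + f"~{seq}"      # first six characters plus ~n
    name = f"{base}.{ext}" if ext else base
    return name.upper()
```

Running this on Long.File.Name.Test reproduces the worked example: the internal periods drop out, the base truncates to LongFi, and the result is LONGFI~1.TES.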
The fifth and subsequent file with a long name that starts with the same six letters as other files in the folder is treated somewhat differently:
Drop Unicode characters, spaces, and extra periods (same process).
Keep the first three characters after the last period as an extension (same process).
Drop all characters after the first two instead of six. At this stage, Long.File.Name.Test5 becomes Lo.Tes.
Append four hexadecimal digits derived via an algorithm applied to the remaining characters in the long filename. Long.File.Name.Test5 yields D623 and Long.File.Name.Test6 yields E623. At this stage, the short name is LoD623.Tes.
Append a tilde (~) followed by a sequence numeral to the new filename just in case the algorithm comes up with duplicate names. LoD623.Tes becomes LoD623~1.Tes.
Finally, convert the name to uppercase. The final short form of Long.File.Name.Test5 becomes LOD623~1.TES.
Long filenames have been around for years now, so I'm sure you are well aware of the standard pitfalls. Keep the following important items in mind:
Moving long-name files between machines.
The short name algorithm used by the NT family of Windows works differently from the one used by the 9x family, and both differ from the two long namespaces in NetWare: Os2.nam for version 3.x and Longname.nam for 4.1x and above.
Excessively long names slow performance.
It's a good idea to keep names shorter than 32 characters for optimal performance. If a directory gets heavily fragmented, the short and long name attributes can get separated, causing a dramatic increase in file lookup times.
When you copy files from one place to another in Windows Server 2003, ordinarily the short filename is generated when the file is created in its new destination. If there is already another file in that location that has the same first six letters, the short name of the newly copied file will change. If you want to preserve the original short names when you copy, use the /n switch on the COPY or XCOPY command.
In Windows Server 2003, the backup API will also preserve the short names so that if you do a tape restore to a new location and there are already files that have the same first six letters, you must decide whether you want to overwrite those existing files.
Use caution in batch files.
The CMD.EXE command interpreter in Windows Server 2003 does not act the same as COMMAND.COM. For example, using CMD, you do not need to enclose long names with quotes when changing directories. You can enter cd c:\dir one and go right to the directory. This is not standard for all commands, however. If you enter del dir one*, you'll delete every file starting with dir and every file starting with one.
Special handling for file extensions.
File extensions also affect the operation of wildcards. Consider Long.File.Name.Test1.htm and Long.File.Name.Test2.shtml as examples. If you go to the command prompt and do a directory listing for *.htm, you get both files in the list instead of just the .HTM file. This seems like a fairly innocuous bug, but what if you enter del *.htm, thinking to get rid of only old .HTM files? You delete the .shtml files as well.
DOS applications delete long names.
If a DOS application changes a short name, the long name is deleted. This has the potential for upsetting Windows users.
A file record stores user data inside an attribute called $Data. The $Data attribute resembles a nestling. While it's small, a few hundred bytes or so, it lives in the MFT record. When it gets too big for the nest, the file system locates some free space out on the disk and pushes the data out there. It becomes non-resident.
Data in a non-resident $Data attribute is stored in a contiguous set of clusters called a run. The portion of the $Data attribute remaining in the MFT contains a pointer to the location of this run. The pointer gives the following information:
The number of the cluster that starts the run. This is called the Logical Cluster Number, or LCN. The LCN is measured from the start of the volume, so the 1500th cluster from the beginning of the volume would have an LCN of 1500.
The length of the run, in clusters. A 1024-byte file on a disk with 512-byte clusters would have a run length of 2.
If you add any data to the file, the file system simply appends the bytes onto the existing run and keeps expanding into new clusters as necessary.
You can eliminate the performance problems caused by short names by disabling 8.3 filename creation. This also reduces the disk space used to store index buffers. Do this only if you have no DOS/Win3.x clients. Use the following Registry entry:
Key: HKLM | SYSTEM | CurrentControlSet | Control | FileSystem
Value: NtfsDisable8dot3NameCreation
Data: 1 (REG_DWORD)
As the file grows, the file system sometimes encounters a cluster that is already occupied by another file or folder. When this happens, the file system selects the next available empty cluster and starts another run. A second pointer is added to the $Data attribute in the MFT to identify the location of the second run. At this point, the $Data attribute is said to be fragmented. Figure 15.10 shows a diagram of a fragmented $Data attribute.
Figure 15.10. Fragmented $Data attribute.
Unlike the pointer at the first run, which used a Logical Cluster Number to identify the start of the run, the pointer for the second and subsequent runs identifies the starting cluster in relation to the run before it. This is called a Virtual Cluster Number, or VCN. For instance, if Run 1 starts at cluster 100 and Run 2 starts at cluster 350, the VCN in the pointer to the second run would be 250.
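Resolving these relative run pointers back to absolute cluster numbers is a simple accumulation (a sketch, not the actual NTFS run-list encoding):

```python
def absolute_starts(pointers):
    """pointers: [(start, length), ...] where the first start is an
    absolute LCN and each later start is an offset from the run
    before it. Returns the absolute starting cluster of each run."""
    starts, pos = [], 0
    for i, (start, length) in enumerate(pointers):
        pos = start if i == 0 else pos + start   # accumulate offsets
        starts.append(pos)
    return starts
```

Feeding in the example from the text, a first run at LCN 100 and a second pointer of 250, recovers absolute starts of 100 and 350.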
As the file continues to grow and grow, it might encounter occupied clusters, forcing the file system to fragment the file even further. As the file becomes more and more fragmented, it requires more and more pointers in the MFT record.
At some point, the file becomes so fragmented that the pointers themselves will not fit in the 1K MFT record. When this happens, the file system creates another MFT record to hold the pointers. It leaves behind another type of pointer in an attribute called $Attribute_List. This pointer identifies the MFT record number that holds the additional pointers. This only happens when a file is severely fragmented.
MFT and Fragmentation
The MFT is a file, just like any other file in NTFS, and as such it can become fragmented if it is forced to grow into portions of the disk that already have clusters claimed by files.
MFT fragmentation can seriously degrade performance, so NTFS tries to prevent it if possible. The strategy used by NTFS to protect the MFT is the same strategy used by nations to protect their territory. NTFS sets up a defensive zone in front of the MFT and avoids this zone when assigning new data clusters on the drive.
By default, this MFT buffer zone takes up 12.5 percent of the volume size. On an 18GB volume, the MFT buffer would take up about 2.25GB. That is enough space for the MFT to hold more than 2 million files and directories without getting fragmented.
If the disk starts to get full, the file system behaves like a pirate nation and starts encroaching on the MFT buffer. At some point, if this encroachment continues, you are likely to get excessive MFT fragmentation. You should never let an NTFS volume get to less than 15 percent free space, with 25 percent free space being the optimal lower limit to allow room for defragmentation.
You can increase the MFT buffer size if you want, but the default should work fine as long as you don't overload your volumes. The following Registry entry controls the buffer size:
Key: HKLM | System | CurrentControlSet | Control | FileSystem
Value: NtfsMftZoneReservation
Data: 1-4 (REG_DWORD)
A setting of 1 represents the default 12.5 percent buffer allocation. Entering 2 carves out 25 percent. A 3 takes 50 percent, and a 4 takes 75 percent. The system works hard to keep the MFT buffer immaculate, so if you designate a bigger buffer, you should reduce your maximum volume loadings accordingly.
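The four settings map to reservation percentages as follows (the helper function is illustrative, not a Windows API):

```python
# NtfsMftZoneReservation setting -> percent of the volume reserved
ZONE_PERCENT = {1: 12.5, 2: 25.0, 3: 50.0, 4: 75.0}

def mft_zone_bytes(volume_bytes, setting=1):
    """Approximate size of the MFT buffer zone for a given setting."""
    return int(volume_bytes * ZONE_PERCENT[setting] / 100)
```

At the default setting, an 18GB volume reserves 2.25GB, matching the figure quoted earlier in this section.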
Named Data Streams
As we've seen, data saved to an NTFS file is stored in an attribute called $Data. As it turns out, an MFT record can have any number of $Data attributes. The default $Data attribute is like the star in an old-style spaghetti western. It has no name. When an application issues an API call to read from or write to a file, NTFS delivers the contents of this unnamed $Data attribute unless told otherwise.
In previous versions of NTFS, the MFT buffer zone could not be penetrated until all other space was used up. This put a serious crimp on defragmentation utilities because the buffer area might be the only free space left on the drive.
Under NTFS 3.1, Microsoft took a hint from the Bush administration's treatment of the Arctic National Wildlife Refuge. It degraded the MFT buffer zone somewhat, making it less of an absolute barrier and more of a helpful suggestion. This permits DEFRAG to use space inside the buffer zone for holding temporary clusters as it jockeys them around.
Any additional $Data attributes in a file must be identified by name. For this reason, they are commonly called named data streams. Here is an example of how to create named data streams:
Build a file named Superman.txt by echoing a few characters from the command prompt into a file as follows:
C:\>echo It's a bird. > Superman.txt
This creates a Master File Table record for a file named Superman.txt with an unnamed data attribute that contains the characters It's a bird.
Add a second data attribute by echoing text to a named stream in the same file as follows:
C:\>echo It's a plane. > superman.txt:stream1
Now, add a third data attribute with a different name:
C:\>echo It's SUPERMAN. > superman.txt:stream2
It can be something of a trick to view the contents of a named data stream. The application you use for viewing must be able to address the stream by name. Very few applications support this feature. In this simple example, let's use the MORE command to expose the named data streams:
C:\>more < superman.txt
It's a bird.
C:\>more < superman.txt:stream1
It's a plane.
C:\>more < superman.txt:stream2
It's SUPERMAN.
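Conceptually, the MFT record for Superman.txt now looks like a small dictionary of $Data attributes, with reads defaulting to the unnamed one (a toy model; real stream access on Windows uses the file:stream syntax shown above):

```python
# Toy model of one MFT record holding multiple $Data attributes.
record = {
    "": "It's a bird.",          # unnamed (default) $Data attribute
    "stream1": "It's a plane.",
    "stream2": "It's SUPERMAN.",
}

def read(record, stream=""):
    """Ordinary API calls get the unnamed stream unless told otherwise."""
    return record[stream]
```

This is why an application that doesn't know a stream's name never sees it: the default read path only ever touches the unnamed attribute.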
Implementations of Named Data Streams
Named data streams have more uses than just parlor tricks. Microsoft uses them to support several features, such as Services for Macintosh (SFM). An SFM volume uses named data streams to support dual-fork Macintosh files.
Another feature that makes use of named data streams is Summary Information. You can see this feature by opening the Properties window for a file and selecting the Summary tab. Figure 15.11 shows an example.
Figure 15.11. Summary tab for a typical file.
When you store information about a file using Summary Information, the data is stored in named data streams using the GUID for the file as the stream name. Because named data streams are supported by earlier versions of NTFS, you can copy files from Windows Server 2003 and Windows 2000 servers to NT servers without losing the summary information. You cannot access Summary Information from NT, though, because the interface is not coded to look for it.
Named Data Streams and WebDAV
Another feature coming to prominence also makes use of named data streams. This feature is called Web-based Distributed Authoring and Versioning, or WebDAV.
WebDAV permits file manipulation using HTTP as the wire protocol. WebDAV is an open standard with lots of support from IETF (Internet Engineering Task Force), industry, and other movers and shakers. It will eventually replace FTP as the standard method for moving files around the Internet. WebDAV is discussed in detail in Chapter 16, "Managing Shared Resources."
The reason I bring up WebDAV at this point is because it uses named data streams in a way that may surprise you the first time you use the feature. To get an idea of how this works, set up a shared web folder on Windows Server 2003 that is running IIS. To do this, open the Properties window for a folder and select the Web Sharing tab. Click the Share This Folder radio button and accept the default options. This creates a virtual folder in the IIS metabase.
You must also configure IIS to publish WebDAV shares. It does not do this by default. Open the Internet Information Services console, right-click the server icon, and select SECURITY from the flyout menu. This launches a Security Lockdown wizard. Step through the wizard and in the Enable Request Handlers window, select Enable WebDAV Publishing under the ISAPI Handlers icon.
Instead of using a browser to connect to the web folder, use the new WebDAV redirector in Windows Server 2003 or XP by opening a command prompt and entering this command:
net use * http://<server_name>/<webshare_name>
You may be prompted for credentials. After the connection is established and the drive has been redirected, create a couple files in the network drive. If you were to take a look at the network traffic at this point, you would see that communications with this shared resource occur using HTTP rather than the Server Message Block (SMB) commands that would normally be used between Windows machines.
Open the Properties window for one of the files you created. You'll notice that there are only a few attributes you can change. They are Read-Only, Hidden, and Archive. You can also set the Encryption attribute, which will encrypt the temporary copy of the file on your machine and then copy the encrypted blob over the network to the web share. Read more about this functionality in Chapter 17, "Managing File Encryption."
If you set the Read-only or Hidden flags and click OK, you'll notice that the file shows these attributes in the Explorer window. Now go to the server and open the properties for the file. You'll notice that the attributes have not changed.
Here's the reason for this attribute duality. When you set an attribute via WebDAV, it is saved into a named data stream in the file. It does not touch the flags in the $Standard_Information attribute. This allows the system to manage WebDAV attributes over standard HTTP using methods such as PROPFIND and PROPPATCH. It also means that WebDAV attributes must be managed separately from NTFS attributes. Keep this behavior in mind, because if you copy a WebDAV file to a location that is formatted with anything other than NTFS, you'll lose the WebDAV attributes.
Every database needs an index to locate records and speed lookups. NTFS is no exception. The most familiar attribute index is a directory, which indexes $File_Name attributes. The MFT permits indexing any attribute, though. Following are other attributes that also have indexes in NTFS:
Security descriptors are stored in the $Secure metadata record and indexed by the $SDH and $SII indexes.
Globally Unique Identifiers (GUIDs).
If a file record is the target of an object linking and embedding (OLE) link, it is assigned a GUID. These GUIDs are stored in an $Object_ID attribute in the file's MFT record and they are indexed in the $ObjID metadata record.
A file record header contains quota information that is indexed in the $Quota metadata folder.
Folders with symbolic links to other folders, volumes, or devices contain a $Reparse_Point attribute. These attributes are indexed in the $Reparse metadata folder.
Directory Record Components
A directory record in the MFT is a special form of a file record. It has a header plus a $Standard_Information attribute and at least one $File_Name attribute. Instead of a $Data attribute, though, a directory record uses three additional attributes to store index information. Figure 15.12 shows how these attributes fit together in a typical directory:
This holds a copy of the indexed attribute. For example, in a Directory record, the $Index_Root attribute contains a copy of the $File_Name attributes from each file and folder in the directory. The $Index_Root attribute is always resident.
When the number of indexed entries grows to the point that the $Index_Root attribute cannot fit in its MFT record, the indexed entries are moved onto the disk into a set of 8K buffers. The $Index_Root attribute cannot be made non-resident, so the entries are put into a new attribute, $Index_Allocation, that contains the LCN of the start of the buffer run, the size of the buffer, and the length of the run.
This attribute assists in housekeeping by mapping out the free space in the index buffers.
Figure 15.12. Example Directory record structure.
When index entries are made non-resident, the $Index_Root attribute retains the first entry of each buffer to act as a sorting mechanism. These root entries form a b-tree, a structured format that speeds lookups. Figure 15.13 shows the b-tree structure for a shallow directory. A b-tree lookup doesn't take much work on the part of the file system. In the example, if the system is searching for a filename that sorts lexicographically before 120.txt, it goes down the left path. Otherwise, it goes to the right.
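That descent can be sketched in a few lines of Python. This is an illustration of the idea, not NTFS code; the filenames and buffer layout are invented to match the style of Figure 15.13:

```python
# Illustrative two-level lookup: the "root" holds the first entry of each
# index buffer, so one comparison pass picks the buffer, and only that
# buffer is then scanned. Not actual NTFS code.
import bisect

# First filename of each buffer (resident in $Index_Root)...
root_entries = ["100.txt", "120.txt", "140.txt"]
# ...and the non-resident index buffers themselves.
buffers = [
    ["100.txt", "105.txt", "110.txt"],
    ["120.txt", "125.txt", "130.txt"],
    ["140.txt", "145.txt", "150.txt"],
]

def lookup(name):
    # Pick the rightmost buffer whose first entry is <= name,
    # then scan only that buffer.
    i = max(bisect.bisect_right(root_entries, name) - 1, 0)
    return name in buffers[i]
```

A search for 125.txt touches the root and one buffer instead of every entry in the directory, which is the whole point of the b-tree arrangement.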
Figure 15.13. Directory record showing b-tree entries and several non-resident index buffers.
Short Names and B-Tree Sorting
Short filenames add complexity to the b-tree sorting scheme. Filename entries are placed in index buffers alphanumerically without distinguishing between short names and long names. For example, if you have three long names such as Twilight of the Gods.txt, Twilight Double-Header.txt, and Twilight Zone.txt in the same directory, the short names TWILIG~1, TWILIG~2, and TWILIG~3 will sort to the top of the index buffer above the long filenames.
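To see how the example names collide, here is a rough Python approximation of the 8.3 generation rule: take the first six filtered characters of the uppercased stem and append ~N to make the result unique. The real NTFS algorithm has additional cases, so treat this strictly as a sketch:

```python
# Rough approximation (not the exact NTFS algorithm) of 8.3 short-name
# generation. Assumes names of the form "stem.ext".
def short_name(long_name, existing):
    stem, _, ext = long_name.rpartition(".")
    base = "".join(c for c in stem.upper() if c.isalnum())[:6]
    suffix = "." + ext.upper()[:3] if ext else ""
    n = 1
    while f"{base}~{n}{suffix}" in existing:
        n += 1  # bump the numeric tail until the name is unique
    return f"{base}~{n}{suffix}"

existing = set()
shorts = []
for name in ("Twilight of the Gods.txt",
             "Twilight Double-Header.txt",
             "Twilight Zone.txt"):
    sn = short_name(name, existing)
    existing.add(sn)
    shorts.append(sn)
print(shorts)  # all three collapse onto the TWILIG~ prefix
```

All three long names reduce to the same six-character base, so the directory ends up holding TWILIG~1.TXT, TWILIG~2.TXT, and TWILIG~3.TXT alongside the long entries.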
When you open a folder in Explorer or do a DIR from the command line, the file system retrieves both the short and long names. If you have many, many long filenames that all start with the same few letters, you can seriously degrade performance by forcing the system to do a full scan of all the index buffers looking for corresponding short names.
If you must continue to use short filenames to support downlevel clients or DOS applications, work hard to come up with naming schemes that do not use the same first letters. If that is not possible, consider breaking up your directory tree into many smaller folders to reduce the size of the index buffers.
Directories can become fragmented just like files. If a run of index buffers encounters a cluster owned by another file or folder, the file system is forced to start another run. This creates a second pointer in the $Index_Allocation attribute.
A heavily fragmented directory might fill the MFT record with pointers, at which time the system moves the pointers to another MFT record and leaves behind a pointer in an $Attribute_List attribute.
Fragmented directories slow performance as much as or more than fragmented files because they force the file system to scrabble around on the drive collecting index buffers. Let's take a look at how NTFS handles defragmentation of files and directories.
Like Windows 2000, Windows Server 2003 and XP include a defragmentation utility using code licensed from Executive Software. This code makes use of API calls created by Microsoft. These API calls are designed to safely move clusters without taking a chance of file system corruption should there be a power interruption or system lockup.
The defragmentation engine consists of two executables, Dfrgfat.exe and Dfrgntfs.exe. The Dfrgfat engine works with both FAT and FAT32. The management interface is the Disk Defragmenter console, Dfrg.msc. Figure 15.14 shows a typical defragmentation analysis graph for a volume.
Figure 15.14. Defragmentation console showing typical fragmentation analysis graph.
For details on performing defragmentation, including how to use the new command-line defragger utility, see the "Defragmentation Operations" section later in this chapter.
Defragmentation in Windows Server 2003 has improved considerably compared to Windows 2000. Many of the nagging restrictions have been removed. Here is a quick list of the improvements:
The MFT can now be defragmented using the defrag API. Further, the MFT can be defragged while online. Earlier versions of Windows required running a commercial defragger at boot time to defrag the MFT. If you've ever sat through several nail-biting hours waiting for a boot-time defrag to finish so you could get a server back online, this is a welcome new feature.
The defrag API has been tweaked to permit access to corners and cracks of the file system that were inaccessible in prior versions. For example, previous versions of the API were unable to defrag heavily fragmented files that used Attribute Lists. They were also unable to defragment extensive bitmaps or reparse points. All of these elements can be defragged using the new API.
Compressed file defrag.
Defragmentation now works with compressed files, but you still cannot completely defrag a heavily fragmented compressed volume. In production, compressed volumes tend to get very fragmented even when you defrag regularly. The only workaround is to get a backup, wipe the volume, and restore from tape.
Improved encrypted file security.
Encrypted files are defragged without being opened. This eliminates a potential vulnerability where temp files created during defrag could expose sensitive data.
Less intrusive defrag.
The same API fix that protects encrypted files also makes it possible to run the defragmenter with just Read Attributes and Synchronize permissions. This makes defragmentation less intrusive, but you still need Administrative permissions to defragment.
Flexible cluster sizes.
The defrag engine now works with any cluster size. Previous versions were limited to a maximum of a 4K cluster. This means that you can increase cluster size on volumes holding large database files without worrying that you cannot defrag the files.
A new command-line interface called DEFRAG makes it possible to use batch files to kick off a defrag. Using the batch file and Task Scheduler, you can schedule periodic defrags to run after hours.
In addition to these features, a performance tuning service runs every three days to jockey frequently used files into more advantageous locations on the drive. This tuning service does not perform a full defragmentation but it does a nice job of tidying up on a regular basis.
Two defrag limitations remain: the paging file and the Registry hives cannot be defragmented while the system is online. You can prevent the paging file from becoming fragmented by defining the same value for its initial and maximum size, which keeps the file from growing and fragmenting. The simplest way to correct Registry fragmentation is to use the Pagedefrag utility from www.sysinternals.com.
Executive Software ships a commercial version of Diskeeper that has additional functionality and runs much faster than the engine included with Windows Server 2003. Other third-party defraggers include the following:
SpeedDisk uses a proprietary method for defragging that does not make use of the Microsoft APIs. This permits the product to defrag more thoroughly at each pass. Cautious administrators have expressed concern about this proprietary engine, but I have not heard of widespread problems. Make sure any defrag product you use has been certified for Windows Server 2003.
When it comes to disk capacity, you can never have too much and it can never be too fast. As of this writing, 15,000 rpm UltraSCSI 160 drives cost about 1.6 cents a megabyte. By the time you read this, drives with double that capacity will probably sell for about the same price.
Even at those low prices, storage isn't free, and the time it takes to install and configure new drives certainly comes at a price. This is especially true of drives on user machines, which often contain data that must be backed up prior to replacing and imaging the drive. Compression helps to resolve storage problems quickly and cheaply, but it has its limitations.
Using NTFS, you can compress an individual file, all the files in a folder, or all files on a volume. The compression algorithm balances speed and disk storage. The compression engine has been improved in Windows Server 2003 to permit compressing files of any size as long as the $Data attribute is non-resident. Earlier versions of NTFS were limited to compressing files of 16 clusters or more.
The maximum cluster size that can be handled by the compression API is 4K. If you do not plan on using compression and you have applications that store data in large files, you can format a volume with larger cluster sizes to improve performance.
The controls for the Disk Defragmenter are located at the key HKLM | Software | Microsoft | Dfrg.
The path to the Disk Defragmenter console, Dfrg.msc, is stored in HKLM | Software | Microsoft | Windows | CurrentVersion | Explorer | MyComputer | DefragPath.
The default defrag path launches the console using the command %systemroot%\System32\dfrg.msc %c:, which opens the console with the focus set for the C: drive.
A compressed file is identified by a flag in the $Standard_Information attribute, but it is actually controlled by a flag in the $Data attribute. This same flag is used for encryption and sparse files, so the three options are mutually exclusive.
When the compression flag is set, data is compressed and decompressed as it streams to and from the disk. This is also true for backups, so expect to widen your backup windows if you compress files on your servers.
If you set the compression flag on a folder, any new files created or copied to the folder are compressed. Existing files are not compressed unless you select that option when setting the flag on the folder.
See the "File Compression Operations" section later in this chapter for procedures to manage compressed files using Explorer and the command-line COMPACT utility.
File Compression and Performance
Compression exacts a significant performance penalty on file and print servers. Microsoft publishes numbers ranging from a 5 percent to 15 percent reduction in end-to-end data transfer times. My own experience points to much higher throughput degradation.
Exact numbers are difficult to quantify because busy production servers have hundreds of connected users doing who-knows-what with applications, personal databases, and data files from 1000 different vendors, and so on and so forth. Imposing compression on this mishmash generally makes you unpopular. Compressing personal files on a server makes better use of the feature than wholesale compression. Because users can compress their own files, take this into account when moving data.
You should never compress database files. The performance penalty of handling random file access into a compressed file is simply too high to be acceptable. Transfer the database files to a larger drive if you need more space.
File Compression Highlights
Working with compressed files in a production environment can result in some surprises. Here are some general operational guidelines:
Compressed volumes can become heavily fragmented. The defragmenter does a poor job of defragging compressed volumes. Consider this before you enable compression on a server. You may be buying yourself a long weekend of scrubbing a volume to and from tape to get the volume defragmented.
When copying or moving a file to another NTFS volume, a new file is created. The compression setting on the new file is inherited from the destination folder. This could result in decompressing a very large file into a tight volume. Use caution.
The same compression algorithm is used in NTFS 1.2, 3.0, and 3.1, so compressed files are accessible from NT4, Windows 2000, and Windows Server 2003. Make sure you have the most current service pack on the earlier versions to get the most current NTFS driver.
Windows Server 2003 cannot read DriveSpace volumes. You must decompress any DriveSpace volumes prior to upgrading a Windows 9x or ME desktop.
The compression flag stays set when a file is copied to tape. When you restore a compressed file to an NTFS volume, it is compressed as it is saved to disk regardless of the compression setting of the parent directory.
Be careful when viewing the disk statistics reported by Explorer for compressed files. The file size parameter displayed in the UI shows the uncompressed size. You should always enable the Show Compressed Files in Color property in Folder Options.
Database and imaging applications typically allocate large amounts of disk space that they don't necessarily fill right away. Windows Server 2003 supports an API that can build file structures called sparse files.
A sparse file specifies a certain size for itself but does not actually claim the disk space until the file begins to fill up. Because sparse files are handled at the application level, the disk savings come without the performance penalty of regular file compression.
You cannot create a sparse file simply by filling a text file with zeros. Nor do you necessarily get a sparse file when you build huge databases with lots of wasted space in the records. The database application must use the sparse file API. The only Windows Server 2003 application that uses sparse files is the Content Indexer, which stores its catalog information in sparse format. No special settings or Registry hacks are available for sparse file handling.
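The effect is easy to demonstrate on any file system that supports holes: seek past end-of-file and write one byte, and the file reports a large logical size while occupying almost no disk space. This Python sketch uses POSIX-style holes rather than the Windows sparse file API, but the allocation behavior it shows is the same idea:

```python
# Demonstrate a sparse-style file: large logical size, tiny real allocation.
# Uses seek-past-EOF holes, which most file systems honor; NTFS does the
# same for files flagged sparse through its own API.
import os
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.seek(100 * 1024 * 1024 - 1)  # skip ahead ~100MB without writing
    f.write(b"\0")                 # a single real byte at the very end

logical = os.path.getsize(path)            # reported size: 104,857,600 bytes
allocated = os.stat(path).st_blocks * 512  # actual usage, typically a few KB
                                           # (st_blocks is POSIX-only)
os.remove(path)
```

On a hole-aware file system, `allocated` comes back orders of magnitude smaller than `logical`, which is exactly the savings the sparse file API delivers to databases that preallocate space.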
You can convert a FAT or FAT32 partition to NTFS without losing data. That's the good news. The bad news is that you cannot change your mind and do the reverse. If you want to go back to FAT or FAT32, you must back up your data, reformat the partition, and then restore the data from tape.
Windows Server 2003 and XP contain many file system improvements that focus on NTFS conversion. That is because NTFS is available in XP Home Edition, making this the first time that NTFS has been available on a consumer product. Over the next few years, millions of Windows 9x and ME desktops will be upgraded to Windows XP. The conversion to NTFS needs to be a smooth one.
Conversion and Setup
One of the first improvements in NTFS conversion was to eliminate the need for conversion at all, at least in fresh installs of Windows Server 2003. The Setup program can now format an NTFS partition directly rather than going through an interim step of formatting with FAT/FAT32 and then converting. This means that the initial bulk file copy from CD puts the files directly into an NTFS file system, virtually eliminating the nasty MFT and system file fragmentation that normally occurs during Setup.
Still, not many shops do their server or desktop installations from CD. Most administrators prefer to install across the network to take advantage of scripted installations or Remote Installation Service (RIS).
If you install using the network, you must first format the system partition as FAT or FAT32 and then use WINNT to transfer the setup files to the local drive. Converting this partition to NTFS would normally cause fragmentation, but Windows Server 2003 improves the situation in two ways:
A new utility called OFORMAT (the Irish cousin of FORMAT) is available in the Deployment Tools. This utility is designed to place the FAT/FAT32 cluster boundaries where they can smoothly convert to NTFS.
New functionality was added to the CONVERT program to provide a "landing pad" that can hold the MFT during conversion. The MFT is copied to the converted volume only after all the other files have been converted. This all but eliminates MFT fragmentation.
There are also improvements in the conversion process itself. Conversion is much faster thanks to additional memory assigned to the task. Also, the existing FAT or FAT32 cluster size can be retained for cluster sizes up to 4K as long as the partition was formatted using Windows Server 2003 or XP. This is a big improvement over previous conversion programs, which insisted on using 512-byte cluster sizes regardless of the partition size.
The conversion can retain cluster sizes on volumes formatted by Windows 9x or NT only if the FAT/FAT32 cluster boundaries happen to fall at the required NTFS cluster boundaries. If this does not happen, conversion falls back to a 512-byte cluster size.
Conversion and Free Space
The conversion process preserves the integrity of the FAT right up until the last moment. All temporary writes are done to free space, so you need lots of elbow room on the volume to convert it. Use this rough computation as a guideline:
Multiply the number of files and directories on the volume by 1280.
Divide the volume size in bytes by 100. If the result falls below the lower limit of 1,048,576, use 1,048,576; if it exceeds the upper limit of 4,194,304, use 4,194,304. Add this to the result of Step 1.
Divide the volume size in bytes by 803 and add to the result of Step 2.
Add 196,096 to the result of Step 3.
For example, the computation for a 4GB (4,096,000,000-byte) volume with 100,000 files looks like this:
100,000 * 1280 = 128,000,000
4,096,000,000 / 100 = 40,960,000, which exceeds the upper limit, so use 4,194,304; 128,000,000 + 4,194,304 = 132,194,304
4,096,000,000 / 803 = 5,100,871; 132,194,304 + 5,100,871 = 137,295,175
137,295,175 + 196,096 = 137,491,271
This volume needs approximately 131MB of free space to do the NTFS conversion. That represents less than 4 percent of the total space. You can get by with slim margins of free space, but for best results, give the conversion a lot more room than that. Otherwise, you will fragment the volume and spend lots of time defragging. You should also specify a conversion file in another partition for building the MFT to eliminate MFT fragmentation.
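The four steps reduce to a small function. This is my own restatement of the published guideline, not Microsoft's CONVERT code:

```python
# Rough free-space requirement for a FAT-to-NTFS conversion,
# following the four-step guideline in the text.
def convert_free_space_estimate(file_count: int, volume_bytes: int) -> int:
    estimate = file_count * 1280                       # Step 1: per-file overhead
    estimate += min(max(volume_bytes // 100,           # Step 2: volume/100,
                        1_048_576), 4_194_304)         #   clamped to 1MB..4MB
    estimate += volume_bytes // 803                    # Step 3: volume/803
    estimate += 196_096                                # Step 4: fixed overhead
    return estimate

# The 4GB / 100,000-file example from the text:
needed = convert_free_space_estimate(100_000, 4_096_000_000)
print(f"{needed:,} bytes (~{needed / 2**20:.0f}MB)")
```

Running the example reproduces the worked figures above; plug in your own file count and volume size before converting a production volume.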
Conversion and File Security
Another weakness of previous conversion utilities was the way they left the file system completely open by putting the Everyone group on the security descriptor of every file and folder.
Windows 2000 improved the situation somewhat by making it easier to change the NTFS permissions at the top of the file system and letting inheritance take care of the rest, but you had to remember to do that extra step. Windows Server 2003 avoids the problem entirely by assigning the same default ACL to a newly converted partition that it assigns during the initial installation of the operating system. This consists of the following:
Full control permissions for the LocalSystem account, Administrator account, the Creator/Owner, and the Administrators local group
Generic Read permissions along with Write_Data/Append_Data special permissions for the local Users group
You can add other groups onto the permissions list after the volume has been converted.