8.3 Alternate Names With ARX

Summary

8.3 names are a relic of the 1980s, when the FAT12 and FAT16 file systems used in MS-DOS had no provision for arbitrarily long filenames. In those days, a name could have at most eight characters, followed by a single period and an extension of up to three characters. That was 25 years ago.

More than a decade ago, 255 character names were introduced in Windows 95 and NT 3.5. These operating systems included a slot in their file systems for an “alternate name” formatted according to 8.3 name conventions. This alternate name allowed a universe full of legacy DOS applications limited to 8.3 names to work with the same files as new applications.

Even today, some applications still use 8.3 names via the alternate name slot in the file system. With the alternate name provision, every file has a primary “long” name and an alternate “8.3” name. The server or filer assigns the alternate name when the primary name exceeds the limits of the 8.3 format. Applications can use the two names interchangeably. From the Microsoft Windows command prompt, the “/x” option to the standard “dir” command makes the alternate names visible.

The challenge when deploying the ARX is that it is unaware of the alternate names generated by servers and filers, because it tracks only the full name. This note describes how the ARX behaves in deployments that include 8.3-style alternate naming, and the debugging aids and tools available for identifying potential issues.

Alternate Name Access through the ARX

The ARX does not support access to alternate 8.3-style names because it does not track alternate names in its metadata. The metadata tracks only the original file names, and the ARX is unaware of any alternate names that exist within the physical file system. This has caused some confusion, with some believing that the ARX does not support 8.3 names at all. The ARX does support 8.3 naming conventions: you can create JIM.DOC, or even JIMSRE~1.DOC, through an ARX. What the ARX does not support is access through the alternate names that follow 8.3 naming conventions. You must access a file through its original name, which may or may not be a long name.

8.3 names are typically encountered in environments where legacy 16-bit applications cannot access long names. These applications cannot handle file and directory names longer than the 8.3 naming convention allows, so they use the alternate name for access. Applications or users that require access through alternate 8.3 names will not work through the ARX.

When we encounter a legacy 16-bit application, we typically find that all the file names it generates are short names. The application therefore rarely needs the alternate name slot for its own files, because it can access them by their native short names. What may cause issues after an ARX implementation is that directory names longer than the application can handle must be accessed via the alternate name, which the ARX does not support.

Imagine a legacy application accessing a path with a long directory name such as “S:\Documents and Settings”. Because the directory name is too long for the application, it instead uses the shortened alternate name “S:\DOCUME~1”, as seen in the “dir /x” output for the parent directory.

S:\>dir /x
  Volume in drive S has no label.
  Volume Serial Number is 880D-F48B

  Directory of S:\

10/02/2007  10:38 AM    1A513A~1     1a513ac2a97792da6bd78e
10/02/2007  11:02 AM    2B32C6~1     2b32c6c0258be092a987839d
10/02/2007  11:00 AM    4B921D~1     4b921d744416247c09c2b135
10/02/2007  10:38 AM    77AADF~1     77aadfbb56b3fe3750904b319e709c92
10/02/2007  11:01 AM    841B0B~1     841b0b287371738e6c5fb705cc
10/02/2007  10:41 AM    A61B05~1     a61b05d71b8c0b19fb477cea50006a
08/11/2004  06:15 PM    0            AUTOEXEC.BAT
07/17/2006  09:51 AM    BITMAP~1     Bitmap wallpapers
09/22/2009  09:32 AM                 Config.Msi
08/11/2004  06:15 PM    0            CONFIG.SYS
09/16/2007  07:02 PM                 dell
09/21/2009  03:31 PM    DOCUME~1     Documents and Settings

Alternate names do not work through the ARX, so one solution is to shorten the directory names in the path that the application must access so that they conform to 8.3 naming conventions. Renaming the “S:\Documents and Settings” directory to a short name of eight characters or fewer, such as “S:\DOCS”, allows the application to use the real directory name, and alternate naming is no longer required through the ARX.
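Before renaming anything, it can help to list which components of a path would force a legacy application onto an alternate name. The following is a minimal sketch, not an ARX or Windows tool: the function name and the strict pattern (a base of 1 to 8 characters, an optional extension of 1 to 3 characters, no spaces) are assumptions made for illustration.

```python
import re
from pathlib import PureWindowsPath

# Assumed strict reading of the 8.3 convention: base of 1-8 characters,
# optional period plus 1-3 character extension, no spaces or extra periods.
SHORT_NAME = re.compile(r"^[^. ]{1,8}(\.[^. ]{1,3})?$")


def components_needing_alternate_names(path):
    """Return the path components that do not fit the 8.3 convention."""
    p = PureWindowsPath(path)
    # Skip the drive/root anchor (for example "S:\\") when present.
    parts = p.parts[1:] if p.anchor else p.parts
    return [c for c in parts if not SHORT_NAME.match(c)]


if __name__ == "__main__":
    for candidate in (r"S:\Documents and Settings\APP\DATA.TXT",
                      r"S:\DOCS\APP\DATA.TXT"):
        print(candidate, "->", components_needing_alternate_names(candidate))
```

Any component the function returns is one that a 16-bit application could reach only through an alternate name, and is therefore a candidate for renaming or for one of the other options described in this section.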

Another option would entail creating a CIFS share that maps deeper into the file system, thus bypassing any long parent directory names above the shared directory. The application would no longer have to access alternate names for long directory names, because the share provides access below them.

The last option, if neither of the above is viable, is to remove the application from ARX virtualization and let it go directly to the storage. This requires some analysis to determine which parts of the file system the legacy application uses and whether they can be relocated.

In some cases the ARX may be able to migrate portions of the physical file system to a different storage device. Then that location is removed from the ARX Managed Volume and brought into an ARX Presentation Volume which does not have metadata, and access to alternate naming is supported below the ARX attach point level when Presentation Volumes are in use.

Alternate Name Collisions during Migration

Every long file name has an alternate 8.3 short name if the file system allows this behavior. Some file systems, such as Windows NTFS, allow alternate name generation to be disabled. During a migration, a “real” file name sometimes matches an 8.3 alternate name pattern, which can result in file name collisions when the ARX tries to move data.

For example, consider a real file called THISIS~1 (alternate name = THISIS~1) and a long file name, Thisisalongname.txt (alternate name = THISIS~2). Suppose the ARX migrates the long-named file to another storage device first, and the file receives the 8.3 alternate name THISIS~1 on the new target because that name is not yet in use within the physical file system. When the ARX then tries to move the real file called THISIS~1, it cannot, because the name collides with the alternate name THISIS~1 within the same file system. The migration fails and the ARX logs a name collision.

There are a few ways to address this issue and get the file to migrate. The first option is to migrate the file with the real name THISIS~1 first. Then, when Thisisalongname.txt is migrated, the filer must choose a different alternate name because THISIS~1 is already taken within the file system.
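The effect of migration order can be illustrated with a toy model of a destination share. This is a sketch under stated assumptions, not how any particular filer works: the ToyShare class, the migrate helper, and the simple "first six characters, ~N, extension" generation scheme are all hypothetical, and real filers use their own, more involved algorithms.

```python
import os


class ToyShare:
    """Toy model of one directory on a filer that generates alternate names."""

    def __init__(self):
        # Real name (uppercased) -> alternate name, or None for short names.
        self.alternate = {}

    def _names_in_use(self):
        used = set(self.alternate)
        used.update(a for a in self.alternate.values() if a is not None)
        return used

    def create(self, name):
        real = name.upper()
        if real in self._names_in_use():
            # The name collides with a real name or with an alternate name.
            raise FileExistsError(name)
        base, ext = os.path.splitext(real)  # ext keeps its leading period
        if len(base) <= 8 and len(ext) <= 4:
            self.alternate[real] = None     # already short, no alternate needed
            return None
        n = 1
        while f"{base[:6]}~{n}{ext[:4]}" in self._names_in_use():
            n += 1
        self.alternate[real] = f"{base[:6]}~{n}{ext[:4]}"
        return self.alternate[real]


def migrate(names_in_order):
    """Create the files on a fresh toy share, reporting the first collision."""
    target = ToyShare()
    try:
        for name in names_in_order:
            target.create(name)
    except FileExistsError as err:
        return f"collision on {err.args[0]}"
    return "ok"


if __name__ == "__main__":
    print(migrate(["Thisisalongname.txt", "THISIS~1.TXT"]))
    print(migrate(["THISIS~1.TXT", "Thisisalongname.txt"]))
```

In this model, migrating the long name first lets it claim THISIS~1.TXT as its alternate name, so the real THISIS~1.TXT collides. Migrating the real short name first pushes the long name onto a different alternate.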

Two other approaches apply when a file called Thisisalongname.txt already has the alternate name THISIS~1.TXT on a share and you need to migrate a real file named THISIS~1.TXT onto that share. You can rename Thisisalongname.txt and then migrate the file; the rename causes a different 8.3-style alternate name to be generated, so the collision no longer exists. Alternatively, you can rename THISIS~1.TXT and then migrate it.

Other approaches are sometimes used to deal with name collisions. Files whose real names match 8.3 naming standards can be excluded from a migration so that collisions do not occur, or they can be migrated to their own file system within the Managed Volume.

Filesets can be created to match real 8.3 names, not the alternate names. There are a few variants. The fileset below can be set up to exclude 8.3-style names from a migration, or to steer them to their own physical file system to avoid collisions.

filename-fileset Exclude8Dot3TildeFiles
  name regexp not "^[^.]{0,6}~[0-9](\.\w+$|)$"
  recurse
  exit

Another variant is used in EMC environments, or where collisions are more numerous. It matches all 8.3-style names, not just those containing the ~:

filename-fileset Exclude8Dot3Files 
  name regexp not "^[^.]{1,8}\.?[^.]{0,3}$"
  recurse
  exit

If a migration from one share to another is being performed, these filesets can be used to migrate real 8.3 names first, and a rule can then be set up for the remaining files. The logic must be inverted so that the fileset matches all files in 8.3 format rather than excluding them. Migrating the real 8.3 names first ensures that no alternate name collisions occur.

Shadow Volume and Alternate Names

ARX Shadow Volume replication is also subject to 8.3 alternate name collisions when replicating files to remote storage. Releases prior to ARX DMOS v5.00.000 have no automatic mechanism for dealing with name collisions, so a best practice is proposed. In environments where these names are problematic, do one of the following:

  1. Avoid shadowing the names altogether.
  2. Define a tiered policy on the shadow volume so that 8.3 FGN-format names are separated from their long-named counterparts.

Avoid Shadowing 8.3 FGN-Format Names

In some environments, the 8.3 FGN-format names are undesirable. The following fileset will exclude most 8.3 FGN-format names.

filename-fileset Exclude8Dot3TildeFiles
  name regexp not "^[^.]{0,6}~[0-9](\.\w+$|)$"
  recurse
  exit

For environments with EMC filers or greater numbers of collisions, one may choose to exclude all 8.3 format names:

filename-fileset Exclude8Dot3Files 
  name regexp not "^[^.]{1,8}\.?[^.]{0,3}$" 
  recurse 
  exit

Tiering in the Shadow Volume

A tiered configuration can be defined within the shadow volume such that new files, and files with 8.3 FGN-format names are placed on tier 1, and all other files are moved to tier 2. This solution should not be adopted universally without proper consideration, as it necessitates an additional data copy for nearly every shadowed file once it reaches the shadow volume.

There are several important points to consider when configuring this solution:

  1. The shadow volume algorithm uses a temporary file during the copy phase, prior to publishing the file to its final location in the shadow volume. This temporary directory and its contents must be placed on tier 1.
  2. When a file placement rule detects that a file has been published to the shadow volume (a rename operation), it will immediately move the file to the correct share.
  3. If shadow volume attempts to publish a file with an 8.3 FGN-format name, and it conflicts with the alternate name of an existing file, the shadow copy rule will fail temporarily. By the next shadow attempt, the conflicting file should have been moved to tier 2, and the rename operation will succeed.
  4. It is recommended that the administrator performs an initial shadow copy run that excludes 8.3 FGN-format names. Once this process is complete, a share for tier 1 can be added such that the bulk of the data stays resident on tier 2.

The logic of the filesets in the previous section must be inverted when defining tier 1:

filename-fileset Include8Dot3TildeFiles 
  name regexp "^[^.]{0,6}~[0-9](\.\w+$|)$"
  recurse
  exit

Tier 1 must also include the shadow volume temporary directory:

filename-fileset ShadowTempDir 
  path "/.acopia_shadow/temp/" 
  recurse
  exit

(Note that it is important to restrict this fileset to the “temp” directory.)

Shadow Volume Enhancements to Deal with Name Collisions

The ARX Shadow Volume functionality has been enhanced in DMOS v5.00.000 so that the replication engine can handle alternate name collisions and no special filesets are needed. The new functionality applies under the following conditions:

  • The failure is a name collision.
  • The protocol is CIFS.
  • The file name is an 8.3 name.

Under these conditions, the shadow volume target takes the following steps to resolve the name collision:

  • The shadow volume target queries the share containing the file, passing the 8.3 name as the file’s full name, in an attempt to find the long name of the file that caused the original rename failure.
  • If the shadow volume target cannot find such a file, it fails the error retry and returns the original error.
  • If the shadow volume target finds the file, it moves the file to the holding area and changes the shadow volume metadata record to point to the new location.
  • The shadow volume target moves the 8.3-named file to the desired location and writes the record to the metadata database.
  • The shadow volume target moves the long-named file back from the holding area and updates the metadata database.

At this point, the filer assigns a different FGN to the long file name, and the name collision is resolved. This collision resolution process is disabled by default. A new CLI command is provided to enable it.

Example

Take the example where the following file exists on the source:

 - /usr/john/my_three_day_vocation_in_hawaii.doc

And the following file exists on the target:

 - /arx-shadow/usr/john/my_three_day_vocation_in_hawaii.doc

The target filer assigns the FGN ~my_t000.doc to my_three_day_vocation_in_hawaii.doc.

When the files on the source side change to:

 - /usr/john/my_three_day_vocation_in_hawaii.doc
 - /usr/john/~my_t000.doc

The shadow volume copies the new file ~my_t000.doc to the target temporary area, and then fails to move it to /usr/john/ due to the name collision. In this case, it takes the following actions to correct the failure:

  • Queries the share with the FGN /usr/john/~my_t000.doc, and the share returns the file /usr/john/my_three_day_vocation_in_hawaii.doc.
  • Moves my_three_day_vocation_in_hawaii.doc to the holding area, with all necessary metadata database changes.
  • Moves ~my_t000.doc to /arx-shadow/usr/john, with all necessary metadata database changes.
  • Moves my_three_day_vocation_in_hawaii.doc back to /arx-shadow/usr/john, with all necessary metadata database changes.
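The sequence in this example can be traced with a small simulation. This is only an illustration of the move-aside idea, not the DMOS implementation: the ToyDirectory class, the publish function, and the "~", four characters, three-digit counter generation scheme (inferred from the single FGN in the example) are all assumptions, and the metadata database updates are left out.

```python
import os


class ToyDirectory:
    """Toy target directory whose filer assigns FGN-style alternate names."""

    def __init__(self):
        self.alt = {}  # real name -> alternate name (None for 8.3 names)

    def in_use(self):
        return set(self.alt) | {a for a in self.alt.values() if a}

    def add(self, name):
        if name in self.in_use():
            raise FileExistsError(name)
        base, ext = os.path.splitext(name)  # ext keeps its leading period
        if len(base) <= 8 and len(ext) <= 4:
            self.alt[name] = None
            return
        n = 0
        while f"~{base[:4]}{n:03d}{ext}" in self.in_use():
            n += 1
        self.alt[name] = f"~{base[:4]}{n:03d}{ext}"

    def remove(self, name):
        del self.alt[name]

    def find_by_alternate(self, short):
        """Look up the long-named file that currently owns an alternate name."""
        for real, alt in self.alt.items():
            if alt == short:
                return real
        return None


def publish(directory, name):
    """Publish a file, moving aside a long-named file that blocks it."""
    try:
        directory.add(name)
        return
    except FileExistsError:
        blocker = directory.find_by_alternate(name)
        if blocker is None:
            raise                      # not an alternate-name collision
    directory.remove(blocker)          # long-named file goes to the holding area
    directory.add(name)                # the 8.3-named file takes its real name
    directory.add(blocker)             # moving back yields a new alternate name


if __name__ == "__main__":
    d = ToyDirectory()
    d.add("my_three_day_vocation_in_hawaii.doc")
    publish(d, "~my_t000.doc")
    print(d.alt)
```

After publish returns, both files exist in the toy directory and the long-named file carries a different alternate name, which is the outcome the steps above are designed to reach.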

Version 8.3 Reporting

The ARX will log an error when a name collision occurs during a migration. Below is an example of a migration report on the ARX:

A log entry should also be seen in the ARX’s syslog similar to the one noted below:

2011-12-09T22:30:44.571+0000:arx1:1-1-ACM-17612:DNAS_PGRP-1-NOTE-
ALTNAME_COLLIS_MIGDST:: Migration aborted due to an unexpected match to 
CIFS alternate name '/directory/123456~1.txt' on the destination share 
'share02'.
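When many such entries need to be reviewed, the colliding path and destination share can be pulled out of the syslog text with a short script. The sketch below is based only on the single sample entry shown above; the exact message wording may differ between releases, and the function name is hypothetical.

```python
import re

# Built from the one sample entry in this note; treat the wording as an assumption.
COLLISION = re.compile(
    r"ALTNAME_COLLIS_MIGDST::.*?alternate name '(?P<path>[^']+)'"
    r".*?destination share '(?P<share>[^']+)'"
)

SAMPLE = """2011-12-09T22:30:44.571+0000:arx1:1-1-ACM-17612:DNAS_PGRP-1-NOTE-
ALTNAME_COLLIS_MIGDST:: Migration aborted due to an unexpected match to
CIFS alternate name '/directory/123456~1.txt' on the destination share
'share02'."""


def parse_collision(entry):
    """Return (path, share) from a collision entry, or None if it does not match."""
    flat = " ".join(entry.split())  # undo any line wrapping in the log text
    match = COLLISION.search(flat)
    if match is None:
        return None
    return match.group("path"), match.group("share")


if __name__ == "__main__":
    print(parse_collision(SAMPLE))
```

The extracted path identifies the real 8.3 name to rename, exclude, or migrate first, using the approaches described earlier.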

Data Manager Logging of 8.3 (Not Alternate) Names

F5’s Data Manager application can scan a file system and note files that may be candidates for 8.3 name collisions. In other words, these are real file names that match the typical 8.3 naming convention, including the ~. NOTE: Data Manager does not detect the presence or use of alternate 8.3-style names in the environment. Data Manager has no way of knowing whether alternate names are actually used, so its output should not be used to determine whether legacy applications are in use.

Data Manager uses the following algorithm to determine 8.3 style names:

  1. DM gets the name reported to it by the FindFile call
  2. DM checks to see if the name provided is in 8.3 format
        a. 8.3 names have the following characteristics:
            i. maximum one period
            ii. maximum 8 character base name
            iii. maximum 3 character extension
            iv. no CIFS reserved characters
        b. If so, we log it to the RTF file and count it
  3. If the name is in 8.3 format, DM checks whether it fits the FGN format:
        a. A name that contains '.' before '~' is not an FGN.
        b. Otherwise, if the name contains '~' without a preceding '.', rules c through f apply.
        c. If it contains '~#', it is an FGN.
        d. If the first character is '~' and the base name is exactly 8 characters long, it is an FGN.
        e. If it contains '~' in characters 2-7, unless '~' is last, it is an FGN.
        f. If it contains '~' as the 8th character, it is an FGN.
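The rules above can be approximated in a few lines of code. This is a reading of the list, not Data Manager's source: it assumes that '#' in rule c means a decimal digit, that "last" in rule e means the last character of the base name, and that the CIFS reserved characters are the usual Windows set. The function names are hypothetical.

```python
import re

# Assumed set of CIFS reserved characters (the usual Windows list).
RESERVED = set('\\/:*?"<>|')


def is_8dot3(name):
    """Step 2: at most one period, base <= 8, extension <= 3, no reserved chars."""
    if not name or name.count(".") > 1 or any(c in RESERVED for c in name):
        return False
    base, _, ext = name.partition(".")
    return len(base) <= 8 and len(ext) <= 3


def looks_like_fgn(name):
    """Step 3: decide whether an 8.3 name also fits the FGN format."""
    if not is_8dot3(name):
        return False
    tilde = name.find("~")
    if tilde == -1:
        return False
    dot = name.find(".")
    if dot != -1 and dot < tilde:
        return False                                  # rule a
    base = name.partition(".")[0]
    if re.search(r"~[0-9]", name):
        return True                                   # rule c ('#' read as a digit)
    if base.startswith("~") and len(base) == 8:
        return True                                   # rule d
    positions = [i + 1 for i, ch in enumerate(base) if ch == "~"]  # 1-based
    if any(2 <= p <= 7 and p != len(base) for p in positions):
        return True                                   # rule e
    return 8 in positions                             # rule f


if __name__ == "__main__":
    for sample in ("HEADSP~1.WM_", "~my_t000.doc", "JIM.DOC", "A.B~1"):
        print(sample, is_8dot3(sample), looks_like_fgn(sample))
```

Like Data Manager itself, a check of this kind only flags real names that resemble generated ones. It says nothing about whether alternate names are actually in use.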

Data Manager will provide summary counts for all file names that appear to match these patterns as seen below:

Clicking the 8dot3 hyperlink displays the actual file names that were encountered. Again, these are not alternate names; they are real names that match the 8.3 algorithm and may collide with an alternate name if the target filer chooses one of them as an alternate.

Sample Data Manager 8.3 log:

C:\i386\HEADSP~1.WM_ 
C:\i386\MINIPL~1.WM_ 
C:\i386\UTOPIA~1.WA_ 
C:\i386\UTOPIA~2.WA_ 
C:\i386\UTOPIA~3.WA_ 
C:\i386\UTOPIA~4.WA_ 
C:\WINDOWS\MPSReports\Cluster\bin\SECINS~1.EXE 
C:\WINDOWS\ServicePackFiles\i386\utopia~1.wav 
C:\WINDOWS\ServicePackFiles\i386\utopia~2.wav 
C:\WINDOWS\ServicePackFiles\i386\utopia~3.wav 
C:\WINDOWS\ServicePackFiles\i386\utopia~4.wav

Author Bio

Jim McCarron is a Manager of Field Systems Engineering for the F5 Data Solutions (ARX/Data Manager/ARX-CE) Sales business unit.

Published Feb 29, 2012
Version 1.0