Using GetFiles With Three Character Extension Returns Extraneous Files

Exactly Three Character Extension in GetFiles

by Les Smith
Why am I getting files back from GetFiles("*.txt") that have an extension of ".TXT1..."?  Because GetFiles() has a strange, but documented, behavior when the extension in the search mask is exactly three characters long.

I recently used GetFiles("*.dat") and picked up both of the following files:

      File.dat
      File.dat_pgp

and wondered what was going on.

After searching the web for a couple of minutes, I found that I was not alone.  So I looked at GetFiles() in MSDN and found that the problem is documented.  The quote from MSDN reads as follows:

"Wild cards are permitted. For example, the searchPattern string "*.txt" searches for all file names having an extension of "txt".  
The matching behavior of searchPattern when the extension is exactly three characters long is different from when the extension is more than three characters long. A searchPattern of exactly three characters returns files having an extension of three or more characters. A searchPattern of one, two, or more than three characters returns only files having extensions of exactly that length.

The following list shows the behavior of different lengths for the searchPattern parameter:

     "*.abc" returns files having an extension of .abc, .abcd, .abcde, .abcdef, and so on.
     "*.abcd" returns only files having an extension of .abcd.
     "*.abcde" returns only files having an extension of .abcde.
     "*.abcdef" returns only files having an extension of .abcdef. "

You can read more on MSDN.
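
To see the behavior for yourself, a small test along these lines may help. It is only a sketch: the class name ThreeCharPatternDemo and the method ListMatches are made up for illustration, and the two file names are the ones from my example above. What it prints depends on the runtime and the file system. On the .NET Framework under Windows I would expect both files to be listed, while newer .NET runtimes may list only File.dat, so no expected output is shown.

```csharp
using System;
using System.IO;

static class ThreeCharPatternDemo
{
    // Creates File.dat and File.dat_pgp in a fresh temp folder and returns
    // the file names that GetFiles reports for the given search pattern.
    public static string[] ListMatches(string pattern)
    {
        string folder = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        DirectoryInfo dir = Directory.CreateDirectory(folder);
        try
        {
            File.WriteAllText(Path.Combine(folder, "File.dat"), "");
            File.WriteAllText(Path.Combine(folder, "File.dat_pgp"), "");

            FileInfo[] files = dir.GetFiles(pattern);
            return Array.ConvertAll(files, f => f.Name);
        }
        finally
        {
            dir.Delete(true); // remove the temp folder and its files
        }
    }

    static void Main()
    {
        // Print whatever the three-character pattern matches on this system.
        foreach (string name in ListMatches("*.dat"))
            Console.WriteLine(name);
    }
}
```

If File.dat_pgp shows up in the output, your environment has the behavior described in the MSDN quote.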

By the way, if you open a command window and enter "DIR *.TXT", it will return files that have an extension of ".TXT*".  Again, the only time that you have a problem is when you use an extension of exactly three characters.  That's wonderful; it turns out that most common file extensions are exactly three characters.

So, what's the fix?  Actually, I don't think there is one.  So I have added code that looks like the following in C#:

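  // Requires: using System; using System.IO;
  // dir is a DirectoryInfo for the input folder.
  FileInfo[] files = dir.GetFiles("*.dat");

  foreach (FileInfo file in files)
  {
      try
      {
          // GetFiles("*.dat") can also return names such as File.dat_pgp,
          // so check for the exact extension before processing.
          if (string.Equals(file.Extension, ".dat", StringComparison.OrdinalIgnoreCase))
              ProcessOneFile(file.FullName);
      }
      catch (Exception ex)
      {
          // exception handling
      }
  }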

Or, if you prefer the VB.NET version, here it is.

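  ' Requires: Imports System.IO
  ' dir is a DirectoryInfo for the input folder.
  Dim files() As FileInfo = dir.GetFiles("*.dat")

  For Each file As FileInfo In files
      Try
          ' GetFiles("*.dat") can also return names such as File.dat_pgp,
          ' so check for the exact extension before processing.
          If String.Equals(file.Extension, ".dat", StringComparison.OrdinalIgnoreCase) Then
              ProcessOneFile(file.FullName)
          End If
      Catch ex As System.Exception
          ' exception handling
      End Try
  Next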
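
If you need the same check in more than one place, it could be wrapped in a small helper. The following is a minimal sketch, not a library API: the class ExactExtension and the methods HasExactExtension and GetFilesExact are names I made up, and the extension is assumed to be passed with its leading dot (for example ".dat").

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ExactExtension
{
    // True only when the file's extension equals 'extension' (for example
    // ".dat"), ignoring case. "File.dat_pgp" has the extension ".dat_pgp",
    // so it does not qualify.
    public static bool HasExactExtension(string fileName, string extension)
    {
        return string.Equals(Path.GetExtension(fileName), extension,
                             StringComparison.OrdinalIgnoreCase);
    }

    // Calls GetFiles with "*" + extension, then drops any extra files
    // that the three-character pattern may have picked up.
    public static List<FileInfo> GetFilesExact(DirectoryInfo dir, string extension)
    {
        var result = new List<FileInfo>();
        foreach (FileInfo file in dir.GetFiles("*" + extension))
        {
            if (HasExactExtension(file.Name, extension))
                result.Add(file);
        }
        return result;
    }

    static void Main()
    {
        // List the exact ".dat" files in the current directory.
        var dir = new DirectoryInfo(Directory.GetCurrentDirectory());
        foreach (FileInfo file in GetFilesExact(dir, ".dat"))
            Console.WriteLine(file.FullName);
    }
}
```

The processing loop then only has to iterate over ExactExtension.GetFilesExact(dir, ".dat") and call ProcessOneFile for each file.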

In the applications that I am currently working with, I archive each file as I process it.  That leaves open the possibility that files which are never processed stay in my input folder and are forgotten.  If they build up each day, you will have to deal with them sooner or later.  So, if you are moving files to an archive folder after they are processed, don't forget those files that didn't match your specified criteria for processing.
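
One way to keep an eye on those leftovers is to list whatever is in the input folder that does not have the exact extension you process. This is just a sketch under my own assumptions: LeftoverReport and FindLeftovers are hypothetical names, the input folder is taken from the first command-line argument (or the current directory), and what you do with the leftovers (log, move, delete) is up to you.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LeftoverReport
{
    // Returns the file names whose extension is NOT exactly 'extension'
    // (for example ".dat"), ignoring case. These are the files that the
    // processing loop skips and that would otherwise pile up unnoticed.
    public static List<string> FindLeftovers(IEnumerable<string> fileNames, string extension)
    {
        var leftovers = new List<string>();
        foreach (string name in fileNames)
        {
            bool exact = string.Equals(Path.GetExtension(name), extension,
                                       StringComparison.OrdinalIgnoreCase);
            if (!exact)
                leftovers.Add(name);
        }
        return leftovers;
    }

    static void Main(string[] args)
    {
        // Input folder: first argument if given, otherwise the current directory.
        string inputFolder = args.Length > 0 ? args[0] : ".";
        foreach (string path in FindLeftovers(Directory.GetFiles(inputFolder), ".dat"))
            Console.WriteLine("Not processed: " + path);
    }
}
```

Running something like this at the end of each batch gives you a list of files to look at before they build up.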

This is not the first, nor will it be the last, time that I have to code around system anomalies.  But that is the life of a software developer.  I hope this helps you avoid some problems, or at least gives you the answer to the strange behavior that you are seeing.

Writing Add-ins for Visual Studio .NET
by Les Smith
Apress Publishing