Can you explain this bizarre crash in the .NET runtime? - c#

My C# application sends me a stack trace when it throws an unhandled exception, and I'm looking at one now that I don't understand.
It looks as though this can't possibly be my fault, but usually when I think that I'm subsequently proved wrong. 8-) Here's the stack trace:
mscorlib caused an exception (ArgumentOutOfRangeException): startIndex cannot be larger than length of string.
Parameter name: startIndex
System.String::InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy) + 6c
System.String::Substring(Int32 startIndex) + 0
System.IO.Directory::InternalGetFileDirectoryNames(String path, String userPathOriginal, String searchPattern, Boolean includeFiles, Boolean includeDirs, SearchOption searchOption) + 149
System.IO.Directory::GetFiles(String path, String searchPattern, SearchOption searchOption) + 1c
System.IO.Directory::GetFiles(String path) + 0
EntrianSourceSearch.Index::zz18ez() + 19b
EntrianSourceSearch.Index::zz18dz() + a
So my code (the obfuscated function names at the end) calls System.IO.Directory.GetFiles(path) which crashes with a string indexing problem.
Sadly I don't know the value of path that was passed in, but regardless of that, surely it shouldn't be possible for System.IO.Directory::GetFiles to crash like that? Try as I might I can't come up with any argument to GetFiles that reproduces the crash.
Am I really looking at a bug in the .NET runtime, or is there something that could legitimately cause this exception? (I could understand things going wrong if the directory was being changed at the time I called GetFiles, but I wouldn't expect a string indexing exception in that case.)
Edit: Thanks to everyone for their thoughts! The most likely theory so far is that there's a pathname with dodgy non-BMP Unicode characters in it, but I still can't make it break. Looking at the code in GetFiles with Reflector, I think the only way it can break is for GetDirectoryName() to return a path that's longer than its input, even when its input is already fully normalised. Bizarre. I've tried making pathnames with non-BMP characters in (I've never had a directory called {MUSICAL SYMBOL G CLEF} before 8-) but I still can't make it break.
What I've done is add additional logging around the failing code (and made sure my logging works with non-BMP characters!). If it happens again, I'll have a lot more information.
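A minimal sketch of what such logging might look like, assuming a hypothetical wrapper named GetFilesLogged and a caller-supplied log callback (neither is from the original application). It dumps every UTF-16 code unit of the path as hex, so surrogate pairs and other unusual characters survive whatever the logging pipeline does to text:

```csharp
using System;
using System.IO;
using System.Text;

static class GetFilesDiagnostics
{
    // Renders the length and each UTF-16 code unit of the path as hex.
    public static string DescribePath(string path)
    {
        if (path == null) return "<null>";
        var sb = new StringBuilder();
        sb.Append("length=").Append(path.Length).Append(" units=");
        foreach (char c in path)
            sb.Append(((int)c).ToString("X4")).Append(' ');
        return sb.ToString().TrimEnd();
    }

    // Hypothetical wrapper: logs the offending argument, then rethrows unchanged.
    public static string[] GetFilesLogged(string path, Action<string> log)
    {
        try
        {
            return Directory.GetFiles(path);
        }
        catch (ArgumentOutOfRangeException)
        {
            log("GetFiles failed for path: " + DescribePath(path));
            throw;
        }
    }
}
```

Because the exception is rethrown with `throw;`, the original stack trace is preserved alongside the new log entry.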

You can try looking into the code for System.IO.Directory.GetFiles() with .NET Reflector. From a quick look, it apparently only calls String.Substring() to split something from the end of the path, and adds it back near the end of the method. It checks Path.DirectorySeparatorChar (the backslash, '\') and Path.AltDirectorySeparatorChar (the slash, '/') to determine the index and length of the substring.
My guess would be that invalid or Unicode file or folder names are confusing the method.

Just a guess... are any of the paths passed as arguments longer than about 260 characters (MAX_PATH)? The standard System.IO functions in the .NET Framework cannot handle a path that is longer than that.

Wow.. I don't think that's ever happened to me.
You're saying that it's only this one customer that this happens to?
Might want to start logging the path parameters, and set up the program to send the logs to you for analysis; I feel that the problem is in the format of the argument.
If this obfuscated code was created by your own obfuscator, why don't you try testing it on your machine un-obfuscated, with some of the collected parameters, and see the result?
Isn't there anything in the Path namespace, like Path.Exist() or Path.IsValid() to give the parameter a check.. maybe there's funny '/' or '\' and other characters, so when the internal API parses each component, there's some sort of corruption in determining each portion of the path string because of funny characters? Just an observation, since the Substring is failing.
Hope that helps and good luck! Please let us know what solution you find, as it will definitely be an interesting one.

Perhaps you could provide some details about the customer having the issue. Things like:
1. OS name and version
2. OS Language
3. .Net version you are targeting, vs .Net version the customer is running.
There could be unicode characters in the directory path that are causing the string length to be off by one or more.
Another note: the exception text suggests that your program was written in managed C++. You aren't mixing in any unmanaged string manipulation are you?
I might suggest that if you can, modify your diagnostics to capture the actual path variable that causes the error.
A plausible explanation: http://support.microsoft.com/kb/943804/

First and only question should have been, "Have you run ChkDsk?"

Perhaps it has something to do with the obfuscator, and the obfuscator screws things up. Try running the code without the obfuscator, and post your results.
edit:
Are you able to reproduce the crash?

Not sure this is related, but I'm using GetFiles in Visual C++, was getting it crashing when listing contents of C:, turned out I had a folder with messed up permissions from a previous install. I reclaimed the folder to my current user and it fixed the crash.

Is it possible to quickly code up a console app and run it in debug mode? Basically, loop through the entire directory tree using the GetFiles method. Maybe something will hit, and you should be able to quickly locate the offending file.

From the source and your comments, I suspect a UNC path is causing problems, possibly combined with a security permission or share permission issue. For instance, if the user turned off creation of 8.3 file names, you will definitely have UNC path issues because it causes the network provider to fail in retrieving proper file names in Windows 2000 and Windows XP. (I forget which service packs fixed this bug.)
Following is the source code of importance.
String tempStr = Path.InternalCombine(fullPath, searchPattern);
// If path ends in a trailing slash (\), append a * or we'll
// get a "Cannot find the file specified" exception
char lastChar = tempStr[tempStr.Length-1];
if (lastChar == Path.DirectorySeparatorChar || lastChar == Path.AltDirectorySeparatorChar || lastChar == Path.VolumeSeparatorChar)
    tempStr = tempStr + '*';
fullPath = Path.GetDirectoryName(tempStr);
BCLDebug.Assert((fullPath != null),"fullpath can't be null!");
String searchCriteria;
bool trailingSlash = false;
bool trailingSlashUserPath = false;
lastChar = fullPath[fullPath.Length-1];
trailingSlash = (lastChar == Path.DirectorySeparatorChar) || (lastChar == Path.AltDirectorySeparatorChar);
if (trailingSlash) {
    // Can happen if the path is C:\temp, in which case GetDirectoryName would return C:\
    searchCriteria = tempStr.Substring(fullPath.Length);
}
else
    searchCriteria = tempStr.Substring(fullPath.Length + 1);
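To make the failure mode concrete, here is a small sketch that mirrors only the last branch of the code above. The class and method names are made up for illustration, and the strings are passed in directly rather than coming from Path.GetDirectoryName, so this reproduces the exception type but not the runtime's real behaviour. It shows that Substring throws ArgumentOutOfRangeException as soon as the "directory" part is longer than the combined string, which is the scenario the asker hypothesised:

```csharp
using System;

static class SubstringFailureDemo
{
    // Mirrors the final if/else of the quoted code for two arbitrary strings.
    public static string ExtractSearchCriteria(string tempStr, string fullPath)
    {
        char last = fullPath[fullPath.Length - 1];
        bool trailingSlash = last == '\\' || last == '/';
        return trailingSlash
            ? tempStr.Substring(fullPath.Length)
            : tempStr.Substring(fullPath.Length + 1);
    }

    // Returns true when the extraction fails the same way as in the stack trace.
    public static bool Throws(string tempStr, string fullPath)
    {
        try
        {
            ExtractSearchCriteria(tempStr, fullPath);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }
}
```

With tempStr = @"C:\temp\*" and fullPath = @"C:\temp" the extraction yields the search pattern; with a fullPath that is longer than tempStr, the Substring call throws.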

Related

System.Runtime.InteropServices.COMException when using Interop Word document.SaveAs method

Essentially I need to create a PDF archiver that saves the content of a MailItem into a PDF file.
The code is below:
mailItem.BodyFormat = Outlook.OlBodyFormat.olFormatHTML;
string pdfPath = Path.Combine(fullPath + fileName + ".pdf");
Microsoft.Office.Interop.Word.Document doc = mailItem.GetInspector.WordEditor;
doc.SaveAs(pdfPath, Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatPDF);
The exception happens on the .SaveAs method. What I've essentially boiled it down to, is that it has something to do with the file path, as I tried changing the file path to be shorter, to which the exception did not occur. The problem is, that it has to be in the longer file path structure. I did also consider that maybe it reached the max length of a file path (255), but from what I could tell by running pdfPath.Length, the length came out to be 81.
Does anyone have any ideas?
I looked into this a bit more, turns out that when working with file paths, you can easily use a single / for navigating the path, say for example: C:/User/Desktop.
This is the case, UNTIL you work with Word documents. Word doesn't appreciate single /, but prefers double \\. So, C:/User/Desktop becomes C:\\User\\Desktop.
Reason why I'm saying this, is because, while there is a File path Max length, I did not reach it.
The problem seemed to happen because there was a space in a folder name, like: C:/User/Desktop/Folder Name.
I then used the double backslash instead, and it worked perfectly.
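As a rough sketch of that workaround: the helper below (a hypothetical name, not part of any Office API) converts forward slashes to backslashes before the path is handed to SaveAs. Whether Word really requires this is the poster's observation, not something verified here. Note that "\\" in C# source is a single backslash in the resulting string.

```csharp
static class WordPaths
{
    // Replaces every forward slash with a single backslash character.
    public static string ToBackslashes(string path)
    {
        return path.Replace('/', '\\');
    }
}
```

Usage would be along the lines of doc.SaveAs(WordPaths.ToBackslashes(pdfPath), ...), leaving the rest of the original call unchanged.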

How to avoid path or directories manipulation in the file upload function?

In the file upload function that I am working on it, one important issue is to have the path where I can save the image user uploaded. In the following code, I already specified the folder within the web-based application folder for saving the uploaded files. My instructor told me that I still have a security hole with these following lines in the code-behind of asp.net UploadFile control and I don't know why!!!
string path = @"~\Images\";
string comPath = Server.MapPath(path + "\\" + FileUpload1.FileName);
Could you please tell me how to prevent this kind of security hole?
UPDATE:
Could anyone tell me how to avoid this kind of security hole? I am still trying to find something to make these two lines secure.
My instructor told me that I still have a security hole with these following lines in the code-behind of asp.net UploadFile control and I don't know why!!!
Imagine what would happen if the client sends something like ..\..\foo.jpg as the file name.
A fairly simple fix is to make sure the file name has no \ or / characters in it, by stripping away everything up to and including the last such character. (This is a good idea anyway, since browsers vary in their treatment of upload file names: some will send the full path, some just the plain file name and some may even send a fake path instead.)
The easy way to do this is with a Regex:
string fileName = Regex.Replace( FileUpload1.FileName, @"(?s)^.*[\\/]", "" );
Note that, depending on your OS and file system, there are other characters that could potentially cause problems as well. The safest option, in general, is to strip away all characters except those you know are safe, something like this:
string safeFileName = Regex.Replace( fileName, @"[^A-Za-z0-9\-._() ]", "" );
You may also want to further ensure that the file name has an extension that matches the type you expect the file to have.
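Putting those pieces together, a minimal sketch might look like the following. The class and method names are hypothetical. The extra rejection of names made only of dots and spaces, and the final check that the resolved path stays under the upload folder, are additions on top of the answer above rather than part of it. In the ASP.NET code from the question, uploadRoot would presumably be the result of Server.MapPath for the Images folder.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class UploadPaths
{
    public static string SanitizeFileName(string clientFileName)
    {
        if (clientFileName == null) throw new ArgumentNullException("clientFileName");
        // Drop everything up to and including the last \ or /.
        string name = Regex.Replace(clientFileName, @"(?s)^.*[\\/]", "");
        // Keep only a conservative whitelist of characters.
        name = Regex.Replace(name, @"[^A-Za-z0-9\-._() ]", "");
        // Reject names that are empty or consist only of dots and spaces (".", "..").
        if (name.Trim(' ', '.').Length == 0)
            throw new ArgumentException("File name is empty after sanitizing.", "clientFileName");
        return name;
    }

    // Combines the sanitized name with the upload folder and verifies containment.
    public static string ResolveInside(string uploadRoot, string clientFileName)
    {
        string root = Path.GetFullPath(uploadRoot);
        string full = Path.GetFullPath(Path.Combine(root, SanitizeFileName(clientFileName)));
        string rootWithSep = root.TrimEnd(Path.DirectorySeparatorChar) + Path.DirectorySeparatorChar;
        if (!full.StartsWith(rootWithSep, StringComparison.OrdinalIgnoreCase))
            throw new InvalidOperationException("Resolved path escapes the upload folder.");
        return full;
    }
}
```

The containment check is deliberately redundant with the sanitizing step, so that a later change to the whitelist cannot silently reintroduce the traversal problem.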

Split path by "\\" in C#

How can I split a path by "\\"? It gives me a syntax error if I use
path.split("\\");
You should be using
path.Split(Path.DirectorySeparatorChar);
if you're trying to split a file path based on the native path separator.
Try path.Split('\\') --- so single quote (for character)
To use a string this works:
path.Split(new[] {"\\"}, StringSplitOptions.None)
To use a string you have to specify an array of strings. I never did get why :)
There's no string.Split overload which takes a string. (Also, C# is case-sensitive, so you need Split rather than split). However, you can use:
string[] bits = path.Split('\\');
which will use the overload taking a params char[] parameter. It's equivalent to:
string[] bits = path.Split(new char[] { '\\' });
That's assuming you definitely want to split by backslashes. You may want to split by the directory separator for the operating system you're running on, in which case Path.DirectorySeparatorChar would probably be the right approach... it will be / on Unix and \ on Windows. On the other hand, that wouldn't help you if you were trying to parse a Windows file system path in an ASP.NET page running on Unix. In other words, it depends on your context :)
Another alternative is to use the methods on Path and DirectoryInfo to get information about paths in more file-system-sensitive ways.
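As an illustration of that alternative, here is a sketch that walks a path from the end using only Path.GetFileName and Path.GetDirectoryName, so the separator rules come from the runtime rather than from the caller. The class and method names are made up for this example, and it assumes the input has no trailing separator.

```csharp
using System.Collections.Generic;
using System.IO;

static class PathSplitter
{
    // Splits a path into its components by repeatedly peeling off the last segment.
    // Assumption: the path does not end with a directory separator.
    public static List<string> SplitWithPathApi(string path)
    {
        var parts = new List<string>();
        string current = path;
        while (!string.IsNullOrEmpty(current))
        {
            string name = Path.GetFileName(current);
            if (string.IsNullOrEmpty(name))
            {
                // No file-name part left: what remains is a root such as "C:\" or "/".
                parts.Add(current);
                break;
            }
            parts.Add(name);
            current = Path.GetDirectoryName(current);
        }
        parts.Reverse();
        return parts;
    }
}
```

The result depends on the platform the code runs on, which is exactly the point the answer makes about context.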
To be on the safe side, you could use:
path.Split(new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar });
On windows, using forward slashes is also accepted, in C# Path functions and on the command line, in Windows 7/XP at least.
e.g.:
Both of these produce the same results for me:
dir "C:/Python33/Lib/xml"
dir "C:\Python33\Lib\xml"
(In C:)
dir "Python33/Lib/xml"
dir "Python33\Lib\xml"
On Windows, neither '/' nor '\' is a valid character in a file name. On Linux, '\' is OK in file names, so you should be aware of this if parsing for both.
So if you wanted to support paths in both forms (like I do) you could do:
path.Split(new char[] {'/', '\\'});
On Linux it would probably be safer to use Path.DirectorySeparatorChar.
path.Split(new char[] { '\\' });
Better just use the existing class System.IO.Path, so you don't need to care for any system specifications.
It provides methods to access any part of a file path like GetFileName(string path) etc.
A complete solution could look like this:
//
private static readonly char[] pathSeps = new char[] {
    Path.DirectorySeparatorChar,
    Path.AltDirectorySeparatorChar,
    Path.VolumeSeparatorChar,
};
//
///<summary>Split a path according to the file system rules.</summary>
public static string[] SplitPath( string path ) {
    if ( null == path ) return null;
    return path.Split( pathSeps, StringSplitOptions.RemoveEmptyEntries );
}
Some of the other proposed solutions in this article use the syntax:
path.Split(new char[] {'/', '\\'});
Although this will work, it has various disadvantages:
It does not allow your application to adapt to various target platforms. Currently, our applications basically run on UNIX-like and Windows operating systems (Windows, macOS, iOS, Linux variations), so there is a fixed set of path characters. But this might change if .NET is ported to other operating systems, so it is best to use the predefined constants.
Performance of the inline syntax is worse. This might not be of interest for a handful of files, but when working with millions of files there are noticeable differences. The managed memory will go up until next GC. When looking at the generated assembly code you will find "call CORINFO_HELP_NEWARR_1_VC" for each of the 'new' statements, even in Release mode. This happens whenever you new-up any array, because arrays are not immutable. My proposed solution prevents this by declaring the array as readonly and static.
Reusability of the inline syntax also is worse, because you might want to use the path separators array in other contexts.
StringSplitOptions.RemoveEmptyEntries should be used to account for UNC paths and possible typos within the incoming path. The operating systems do not allow duplicate path separators, but there might be a typo from the user or a duplicate concatenation of path separator characters, for example when concatenating the path and filename.

Validating File Path w/Spaces in C#

I'm something of a n00b at C# and I'm having trouble finding an answer to this, so if it's already been answered somewhere feel free to laugh at me (provided you also share the solution). :)
I'm reading an XML file into a GUI form, where certain elements are paths to files that are entered into TextBox objects. I'm looping through the controls on the form, and for each file path in each TextBox (lol there's like 20 of them on this form), I want to use File.Exists() to ensure it's a valid file.
The problem with this is that the file path can potentially contain spaces, and can potentially be valid; however File.Exists() is telling me it's invalid, based entirely on the spaces. Obviously I can't hard-code them and do something like
if (File.Exists(@"c:\Path To Stuff"))
and I tried surrounding the path with ", like
if (File.Exists("\"" + contentsOfTextBox + "\""))
but that didn't make a difference. Is there some way to do this? Can I escape the spaces somehow?
Thank you for your time. :)
File.Exists works just fine with spaces. There is something else giving you a problem I'll wager.
Make sure your XML reader isn't failing to read the filename (parts of XML do not allow spaces and some readers will throw an exception if they encounter one).
@"c:\Path To Stuff"
The above could be a directory not a file!
Hence you would want to use Directory.Exists!
@"c:\Path To Stuff\file.txt"
If you did have a file on the end of the path then you would use File.Exists!
As the answer said, File.Exists works with spaces, if you are checking for existence of a Directory however, you should be using Directory.Exists
What is the exact error that you get when File.Exists says it is invalid?
I suspect that you are passing a path to a directory and not a file, which will return false. If so, to check the presence of a directory, use Directory.Exists.
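A small sketch of the check these answers describe, using a hypothetical helper and enum (the names are not from the question). It trims surrounding whitespace and stray quote characters from the TextBox value, then asks File.Exists and Directory.Exists in turn. Spaces inside the path need no special treatment.

```csharp
using System.IO;

enum PathKind { File, Directory, Missing }

static class PathChecks
{
    // Classifies user-entered text as an existing file, an existing directory, or neither.
    public static PathKind Classify(string textBoxValue)
    {
        string path = (textBoxValue ?? string.Empty).Trim().Trim('"');
        if (path.Length == 0) return PathKind.Missing;
        if (File.Exists(path)) return PathKind.File;
        if (Directory.Exists(path)) return PathKind.Directory;
        return PathKind.Missing;
    }
}
```

Keep in mind that both Exists methods also return false when the process lacks permission to see the target, so Missing here can mean "not visible" as well as "not there".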
To echo Ron Warholic: make sure the process has permissions over the target folder. I just ran into the same "bug" and it turned out to be a permissions issue.
Did you remember to replace \ with \\ ?
You need to use yourStringValue.Trim() to remove leading/trailing spaces, and Replace to remove any spaces inside the string that you do not want.
Also, use System.IO.Path.Combine to combine these strings.
You can use @ on string literals (verbatim strings) so that backslashes are not treated as escape characters:
string aPath = @"c:\Path To Stuff\text.txt";
File.Exists(aPath);
That should solve any escape character problems because I don't think this really looks like the spaces being the problem.
hi this is not difficult if you can convert the name of the path to a string array then go through one by one and remove the spaces
once that is done just write() to the screen where you have the files, if it is xml then your xmlmapper will suffice
file.exists() should only be used in certain circumstances if you know that it does exist but not when there can be space chars or any other possible user input

File paths with non-ascii characters and FileInfo in C#

I get a string that more or less looks like this:
"C:\\bláh\\bleh"
I make a FileInfo with it, but when I check for its existence it returns false:
var file = new FileInfo(path);
file.Exists;
If I manually rename the path to
"C:\\blah\\bleh"
at debug time and ensure that blah exists with a bleh inside it, then file.Exists starts returning true. So I believe the problem is the non-ascii character.
The actual string is built by my program. One part comes from the AppDomain of the application, which is the part that contains the "á"; the other part comes, in a way, from the user. Both parts are put together by Path.Combine. I confirmed the validity of the resulting string in two ways. First, copying it from the error my program generates, which includes the path, into Explorer opens the file just fine. Second, looking at that string in the debugger, it looks correctly escaped, in that each \ is written as \\. The "á" is printed literally by the debugger.
How should I process a string so that even if it has non-ascii characters it turns out to be a valid path?
Here is a method that will handle diacritics in filenames. The success of the File.Exists method depends on how your system stores the filename.
public bool FileExists(string sPath)
{
    //Checking for composed and decomposed is to handle diacritics in filenames.
    var pathComposed = sPath.Normalize(NormalizationForm.FormC);
    if (File.Exists(pathComposed))
        return true;
    //We really need to check both possibilities.
    var pathDecomposed = sPath.Normalize(NormalizationForm.FormD);
    if (File.Exists(pathDecomposed))
        return true;
    return false;
}
try this
string sourceFile = @"C:\bláh\bleh";
if (File.Exists(sourceFile))
{
    Console.WriteLine("file exist.");
}
else
{
    Console.WriteLine("file does not exist.");
}
Note : The Exists method should not be used for path validation, this method merely checks if the file specified in path exists. Passing an invalid path to Exists returns false.
For path validation you can use Directory.Exists.
I have just manually created a bláh folder containing a bleh file, and with that in place, this code prints True as expected:
using System;
using System.IO;
namespace ConsoleApplication72
{
    class Program
    {
        static void Main(string[] args)
        {
            string filename = "c:\\bláh\\bleh";
            FileInfo fi = new FileInfo(filename);
            Console.WriteLine(fi.Exists);
            Console.ReadLine();
        }
    }
}
I would suggest checking the source of your string - in particular, although your 3k rep speaks against this being the problem, remember that expressing a backslash as \\ is an artifact of C# syntax, and you want to make sure your string actually contains only single \s.
Referring to @adatapost's reply, the list of invalid file name characters (gleaned from System.IO.Path.GetInvalidFileNameChars()) in fact doesn't contain normal characters with diacritics.
It looks like the question you're really asking is, "How do I remove diacritics from a string (or in this case, file path)?".
Or maybe you aren't asking this question, and you genuinely want to find a file with name:
c:\blòh\bleh
(or something similar). In that case, you then need to try to open a file with the same name, and not c:\bloh\bleh.
It looks like the "bleh" in the path is a directory, not a file. To check whether the folder exists, use the Directory.Exists method.
The problem was: the program didn't have enough permissions to access that file. Fixing the permissions fixed the problem. It seems that when I did my experiment I somehow managed to reproduce the permission problem, possibly by creating the folder without the non-ascii character by hand and copying the other one.
Oh... so embarrassing.
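Since Exists reports false both for a missing file and for one the process is not allowed to see, a probe along these lines can tell the two apart. This is only a sketch with made-up names. It actually opens the file for reading, which may not be appropriate in every situation.

```csharp
using System;
using System.IO;

enum ProbeResult { Readable, NotFound, AccessDenied, OtherIoError }

static class FileProbe
{
    // Tries to open the file for reading and maps the outcome to a ProbeResult.
    public static ProbeResult Probe(string path)
    {
        try
        {
            using (File.OpenRead(path)) { }
            return ProbeResult.Readable;
        }
        catch (FileNotFoundException) { return ProbeResult.NotFound; }
        catch (DirectoryNotFoundException) { return ProbeResult.NotFound; }
        catch (UnauthorizedAccessException) { return ProbeResult.AccessDenied; }
        catch (IOException) { return ProbeResult.OtherIoError; }
    }
}
```

The more specific exception types are caught before IOException because FileNotFoundException and DirectoryNotFoundException both derive from it.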
