Create CSV file from C#: extra "ï»¿" characters in Excel - c#

I create a CSV file from a C# application, but the characters "ï»¿" appear in the first cell in Excel and OpenOffice Calc, though not in Notepad or Notepad++.
Here is my code:
StreamWriter streamWriter = new StreamWriter(new FileStream(filePath, FileMode.Create), Encoding.UTF8);
List<MyData> myData = GetMyData();
foreach (MyData md in myData)
{
    streamWriter.WriteLine(md.Date + "," + md.Data1 + "," + md.Data2 + "," + md.Data3 + "," + md.Data4);
}
streamWriter.Flush();
streamWriter.Close();
MyData is defined as:
public struct MyData
{
    public float Data1;
    public float Data2;
    public float Data3;
    public float Data4;
    public DateTime Date;
}
Here is the result in Notepad and Notepad++:
01/12/2010 00:04:00,0.08,78787.4,9.1,5
01/12/2010 00:09:00,0.07,78787.42,9.1,5
01/12/2010 00:14:00,0.06,78787.44,9.1,5
01/12/2010 00:19:00,1.45,78787.58,9.1,5
01/12/2010 00:24:00,2.13,78788.15,9.1,5
01/12/2010 00:29:00,1.72,78788.53,9,5
01/12/2010 00:34:00,0.89,78788.73,9,5
And in Excel and Calc:
01/12/2010 00:04:00 0.08 78787.4 9.1 5
01/12/10 00:09 0.07 78787.42 9.1 5
01/12/10 00:14 0.06 78787.44 9.1 5
01/12/10 00:19 1.45 78787.58 9.1 5
01/12/10 00:24 2.13 78788.15 9.1 5
01/12/10 00:29 1.72 78788.53 9 5
01/12/10 00:34 0.89 78788.73 9 5
Those 3 characters appear only once, at the start of the file, and then everything is as it should be.
My question is:
Where does "ï»¿" come from, and how do I remove it?
I have tried writing my output to a StringBuilder and inspecting its content in the debugger, and it does not contain those characters.

This is the BOM (byte order mark) for the UTF-8 encoding.
Look at this question: Write text files without Byte Order Mark (BOM)?

You may want to specify the encoding type based on your input.
StreamWriter.Encoding Property
http://msdn.microsoft.com/en-us/library/system.io.streamwriter.encoding.aspx

I think that is the BOM (Unicode header). Specify ANSI encoding in the constructor of StreamWriter. It is simple: just set it to Encoding.Default.
update:
If you must use UTF-8 and need to get rid of those 3 bytes at the beginning, use new UTF8Encoding(false). This way the writer stays in UTF-8 but won't write the BOM.
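A minimal sketch of that fix (the file name and sample row are invented for illustration): writing through new UTF8Encoding(false) produces a file whose first bytes are the data itself, not EF BB BF.

```csharp
using System;
using System.IO;
using System.Text;

class Program
{
    static void Main()
    {
        // Encoding.UTF8 emits the 3-byte BOM (EF BB BF) at the start of the
        // file; new UTF8Encoding(false) keeps UTF-8 but suppresses the BOM.
        var path = "no-bom.csv"; // hypothetical file name
        using (var writer = new StreamWriter(path, false, new UTF8Encoding(false)))
        {
            writer.WriteLine("01/12/2010 00:04:00,0.08,78787.4,9.1,5");
        }

        byte[] bytes = File.ReadAllBytes(path);
        // The first byte is now '0' (0x30), not the BOM lead byte 0xEF,
        // so Excel no longer shows the stray characters in the first cell.
        Console.WriteLine(bytes[0] == 0xEF ? "BOM present" : "no BOM");
    }
}
```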

Try specifying Encoding.ASCII instead of Encoding.UTF8 when you create the StreamWriter.


Write String to File With Absolutely Zero Changes

I am using Adobe's EchoSign API to retrieve a string representation of a PDF file. The problem I am running into is that writing the file to disk is not working properly: the file is a much different length than the string and won't open as a PDF.
As a test, I used an existing PDF file - one that I know is a true PDF - and tried to pull the contents of the file as a string the way their API provides, then write it back to another file. The result is the same: I can open the "real" PDF using Adobe, but the new file will not open. This should be simple, but I am obviously missing something.
Here is what I have done to test this out:
Scenario 1: Using string received from the API
File.WriteAllText(fileName, PDFstring, new UTF8Encoding(false));
Scenario 2: Using string received from the API. Yeah, it seemed dumb, but nothing has been working.
using (var sw = File.CreateText(fileName))
{
    for (int p = 0; p < PDFstring.Length; p++)
    {
        var c = PDFstring.Substring(p, 1);
        sw.Write(c);
    }
}
Scenario 3: Use a known good PDF file and try to copy it by creating a string and writing it to a new file.
var filename = @"C:\Adobe\GoodDocument.pdf";
var newFile = @"C:\Adobe\Rewrite.pdf";
var fs = new FileStream(filename, FileMode.Open, FileAccess.Read);
var file = new StreamReader(fs);
var allAdobe = file.ReadToEnd();
fs.Close();
File.WriteAllText(newFile, allAdobe, new UTF8Encoding(false));
All three scenarios gave the same results. I cannot use the new file. The file lengths are all longer than they should be. Attempting to open the new file asks for a password where the original does not.
Observation: I just ran scenario 3 again, except this time using the copied (incorrect) file as the original. The result was an exact duplicate! What gives? Is Adobe playing tricks on me?
The @hans-kilian answer is enough if you don't need to edit anything before rewriting the document, but I think you can read it into a string by changing both the reading format and the writing format to ANSI:
var filename = @"C:\Adobe\GoodDocument.pdf";
var newFile = @"C:\Adobe\Rewrite.pdf";
var fs = new FileStream(filename, FileMode.Open, FileAccess.Read);
var file = new StreamReader(fs, System.Text.Encoding.Default);
var allAdobe = file.ReadToEnd();
fs.Close();
File.WriteAllText(newFile, allAdobe, System.Text.Encoding.Default);
EDIT: I only now realize that your string comes from an API, so that's the only viable solution :)
EDIT2: OK, I read your link and I understand that you need to decode some base64 chunks of your PDF string, which I think is what I was telling you in my comment yesterday:
I opened a "test.pdf" with Notepad++ and got this piece of code:
%PDF-1.7
4 0 obj
(Identity)
endobj
5 0 obj
(Adobe)
endobj
8 0 obj
<<
/Filter /FlateDecode
/Length 146861
/Type /Stream
>>
stream
[.......] LOTS OF ANSI CHARACTERS [.......]
endstream
endobj
13 0 obj
<<
/Font <<
/F1 11 0 R
>>
>>
endobj
3 0 obj
<<
/Contents [ 12 0 R ]
/CropBox [ 0.0 0.0 595.32001 841.92004 ]
/MediaBox [ 0.0 0.0 595.32001 841.92004 ]
/Parent 2 0 R
/Resources 13 0 R
/Rotate 0
/Type /Page
>>
endobj
10 0 obj
<<
/Length 535
>>
stream
/CIDInit /ProcSet findresource begin 12 dict begin begincmap /CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def /CMapName /Adobe-Identity-UCS def /CMapType 2 def 1 begincodespacerange <0000> <FFFF> endcodespacerange 15 beginbfchar <0003> <0020> <0018> <0044> <0026> <0046> <002C> <0048> <0057> <0050> <0102> <0061> <011E> <0065> <015D> <0069> <0175> <006D> <0190> <0073> <019A> <0074> <01C7> <0079> <0355> <002C> <0357> <003A> <035B> <2019> endbfchar endcmap CMapName currentdict /CMap defineresource pop end end
endstream
endobj
9 0 obj
[ 3 3 226 24 24 615 38 38 459 44 44 623 87 87 516 258 258 479 286 286 497 349 349 229 373 373 798 400 400 391 410 410 334 455 455 452 853 853 249 855 855 267 859 859 249 ]
endobj
6 0 obj
[ -798 -268 798 952 ]
endobj
7 0 obj
798
endobj
2 0 obj
<<
/Count 1
/Kids [ 3 0 R ]
/Type /Pages
>>
endobj
1 0 obj
<<
/Pages 2 0 R
/Type /Catalog
>>
endobj
14 0 obj
<<
/Author (user)
/CreationDate (D:20180713094854+02'00')
/ModDate (D:20180713094854+02'00')
/Producer (Microsoft: Print To PDF)
/Title (Microsoft Word - Documento1)
>>
endobj
xref
0 15
0000000000 65535 f
0000148893 00000 n
0000148834 00000 n
0000147825 00000 n
0000000009 00000 n
0000000035 00000 n
0000148778 00000 n
0000148815 00000 n
0000000058 00000 n
0000148591 00000 n
0000148004 00000 n
0000147008 00000 n
0000147480 00000 n
0000147780 00000 n
0000148942 00000 n
trailer
<<
/Info 14 0 R
/Root 1 0 R
/Size 15
>>
startxref
149133
%%EOF
(I used a code snippet just to have the code correctly formatted ;) )
What I have inside [.......] LOTS OF ANSI CHARACTERS [.......] is ANSI, but in your situation it is a base64 string that needs to be "replaced" with its base64-decoded ANSI string. If I'm right, you can do that like below:
byte[] data = Convert.FromBase64String(your_base_64_string);
string decodedString = Encoding.Default.GetString(data);
Let me know if you hit the goal :)
PDF is a binary format, so you need to read and write the files as bytes, like this:
var document = File.ReadAllBytes("document.pdf");
File.WriteAllBytes("new document.pdf", document);
While Legion technically answered the posed question, I feel it's necessary for anyone following in my footsteps to get the full answer.
What led to this question was me trying to write the content of a response from an Adobe Sign API call to a file.
I am using C# and the RestSharp library. This is important. The RestSharp IRestResponse object that provides the content apparently creates this property from the data received from the call. Because the content is so complex, creating the string representation immediately made writing it to a PDF file impossible. Digging deeper into the response object, I noticed a property called RawBytes. This is a byte array of the response. If I write the byte array directly to disk, everything just works.
Sorry to bother everyone with this. I was one layer above the actual problem.

How do I round trip an entitized carriage return with XDocument?

Suppose I have this XML document:
<x xml:space='preserve'>&#xd;
</x>
with this sequence of bytes as the content of the <x/>:
38 35 120 100 59 13 10
My understanding from the W3C spec is that the sequence 13 10 will be replaced before parsing. To get the sequence 13 10 to show up in my parsed tree, I have to include the character entity &#xd; as clarified in a note in the W3C spec (I recognize these are from XML 1.1 instead of XML 1.0, but they clarify confusing things in XML 1.0 without describing a different behavior).
As explained in 2.11 End-of-Line Handling, all #xD characters literally present in an XML document are either removed or replaced by #xA characters before any other processing is done. The only way to get a #xD character to match this production is to use a character reference in an entity value literal.
With XDocument.Parse, this all seems to work correctly. The text content for the above XML is 13 10 (rather than 13 13 10), suggesting that the character entity is preserved and the literal 13 10 is replaced with 10 prior to parsing.
However, I can’t figure out how to get XDocument.ToString() to entitize newlines when serializing. I.e., I’d expect (XDocument xd) => XDocument.Parse($"{xd}") to be a lossless function. But if I pass in an XDocument instance with 13 10 as text content, that function outputs an XDocument instance with 10 as text content. See this demonstration:
var x = XDocument.Parse("<x xml:space='preserve'>&#xD;\r\n</x>");
present("content", x.Root.Value); // 13 10, expected
present("formatted", $"{x}"); // inside <x/>: 13 10, unexpected
x = XDocument.Parse($"{x}");
present("round tripped", x.Root.Value); // 10, unexpected
// Note that when formatting the version with just 10 in the value,
// we get Environment.NewLine in the formatted XML. So there is no
// way to differentiate between 10 and 13 10 with XDocument because
// it normalizes when serializing.
present("round tripped formatted", $"{x}"); // inside <x/>: 13 10, expected
void present(string label, string thing)
{
    Console.WriteLine(label);
    Console.WriteLine(thing);
    Console.WriteLine(string.Join(" ", Encoding.UTF8.GetBytes(thing)));
    Console.WriteLine();
}
You can see that when XDocument is serialized, it fails to entitize the carriage return as either &#xD; or &#13;. The result is that it loses information. How can I safely encode an XDocument so that I do not lose anything, particularly carriage returns, that were in the original document I loaded?
To round-trip XDocument, do not use the recommended/easy serialization methods such as XDocument.ToString() because this is lossy. Note also that, even if you do something like xd.ToString(SaveOptions.DisableFormatting), any carriage returns in the parsed tree will be lost.
Instead, use a properly configured XmlWriter with XDocument.WriteTo. The XmlWriter will be able to see that the document contained literal carriage returns and encode them correctly. To instruct it to do so, set XmlWriterSettings.NewLineHandling to NewLineHandling.Entitize. You'll probably want to write an extension method to make this easier to reuse.
The demo altered to use this approach is below:
var x = XDocument.Parse("<x xml:space='preserve'>&#xD;\r\n</x>");
present("content", x.Root.Value); // 13 10, expected
present("formatted", toString(x)); // inside <x/>: 38 35 120 68 59 10 ("&#xD;\n"), acceptable
x = XDocument.Parse(toString(x));
present("round tripped", x.Root.Value); // 13 10, expected
string toString(XDocument xd)
{
    using var sw = new StringWriter();
    using (var writer = XmlWriter.Create(sw, new XmlWriterSettings
    {
        NewLineHandling = NewLineHandling.Entitize,
    }))
    {
        xd.WriteTo(writer);
    }
    return sw.ToString();
}
void present(string label, string thing)
{
    Console.WriteLine(label);
    Console.WriteLine(thing);
    Console.WriteLine(string.Join(" ", Encoding.UTF8.GetBytes(thing)));
    Console.WriteLine();
}

Split a string into lines?

Here is my code:
foreach (var file in d.GetFiles("*.xml"))
{
    string test = getValuesOneFile(file.ToString());
    result.Add(test);
    Console.WriteLine(test);
    Console.ReadLine();
}
File.WriteAllLines(filepath + @"\MapData.txt", result);
Here is what it looks like in the console;
[30000]
total=5
sp 0 -144 152 999999999
sp 0 -207 123 999999999
sp 0 -173 125 999999999
in00 1 -184 213 999999999
out00 2 1046 94 40000
Here is how it looks in the text file (when written at the end of the loop):
[30000]total=5sp 0 -144 152 999999999sp 0 -207 123 999999999sp 0 -173 125 999999999in00 1 -184 213 999999999out00 2 1046 94 40000
I need it to write the lines in the same style as the console output.
WriteAllLines separates each of the values with the environment's newline string; however, throughout the history of computers, a number of different characters have been used to represent new lines. You are looking at the text file with a program that expects a different kind of newline separator. You should either view the file with a different program (one that properly handles this separator, or any separator), configure your program to expect the given separator, or replace WriteAllLines with a manual method of writing the strings that uses another newline separator.
Rather than WriteAllLines, you'll probably want to just write the text manually:
string textToWrite = "";
foreach (var res in result)
{
    textToWrite += res.Replace("\r", "").Replace("\n", ""); // ensure there are no line feeds or carriage returns
    textToWrite += "\r\n"; // add the carriage return / line feed
}
File.WriteAllText(filepath + @"\MapData.txt", textToWrite);
The problem is definitely how you are looking at newlines in your output. Environment.NewLine will be inserted after each string written by WriteAllLines.
I would recommend opening the output file in Notepad++ and turning on View -> Show Symbol -> Show End of Line to see which end-of-line characters are in the file. On my machine, for instance, it is [CR][LF] (carriage return / line feed) at the end of each line, which is standard for Windows.
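To make the separator explicit rather than relying on Environment.NewLine, a minimal sketch (the sample lines stand in for the poster's `result` list) that joins the lines with CRLF yourself:

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        // Sample lines standing in for the poster's `result` list.
        var result = new[] { "[30000]", "total=5", "sp 0 -144 152 999999999" };

        // Join with an explicit CRLF instead of letting WriteAllLines pick
        // the environment's newline string.
        File.WriteAllText("MapData.txt", string.Join("\r\n", result) + "\r\n");

        string written = File.ReadAllText("MapData.txt");
        Console.WriteLine(written.Contains("\r\n") ? "CRLF separators" : "no CRLF");
    }
}
```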

Stream, string and null character

I have a stream which contains several \0 characters. I have to replace textual parts of this stream, but when I do
StreamReader reader = new StreamReader(stream);
string text = reader.ReadToEnd();
text only contains the beginning of the stream (because of the \0 character). So
text = text.Replace(search, replace);
StreamWriter writer = new StreamWriter(stream);
writer.Write(text);
will not do the expected job, since I don't parse the "full" stream. Any idea how to get access to the full data and replace some textual parts?
EDIT : An example of what I see on notepad
stream
H‰­—[oã6…ÿÛe)Rêq%ÙrlËñE±“-úàÝE[,’íKÿþŽDjxÉ6ŒÅ"XkÏáGqF að÷óð!SN>¿¿‰È†/$ËÙpñ<^HVÀHuñ'¹¿à»U?`äŸ?
¾fØø(Ç,ükøéàâ+ùõ7øø2ÜTJ«¶Ïäd×SÿgªŸF_ß8ÜU#<Q¨|œp6åâ-ªÕ]³®7Ûn¹ÚÝ|‰,¨¹^ãI©…Ë<UIÐI‡Û©* Ǽ,,ý¬5O->qä›Ü
endstream 
endobj
8 0 obj
<<
/Type /FontDescriptor
/FontName /Verdana
/Ascent 765
/Descent -207
/CapHeight 1489
/Flags 32
/ItalicAngle 0
/StemV 86
/StemH 0
/FontBBox [ -560 -303 1523 1051 ]
/FontFile2 31 0 R
>>
endobj
9 0 obj
And I want to replace /FontName /Verdana by /FontName /Arial on the fly, for example.
Ah, now we're getting to it...
This file is a pdf
Then it's not a text file. That's a binary file, and should be treated as a binary file. Using StreamReader on it will lose data. You'll need to use a different API to access the data in it - one which understands the PDF format. Have a look at iTextSharp or PDFTron.
I can't duplicate your results. The code below creates a string with a \0 in it, writes it to a file, and then reads it back. The resulting string has the \0 in it:
string s = "hello\x0world";
File.WriteAllText("foo.txt", s);
string t;
using (var f = new StreamReader("foo.txt"))
{
    t = f.ReadToEnd();
}
Console.WriteLine(t == s); // prints "True"
I get the same results if I do var t = File.ReadAllText("foo.txt");

Read .dat file in c#

I want to read a .dat file with the following layout:
First name offset: 21
First name format: ASCIIZ, 15 chars + \0
Middle initials offset: 37
ID offset: -8
ID format/length: unsigned int (4 bytes)
Please help me sort out this issue in C#.
Thanks in advance.
Gurpreet
.dat file
( ÿ / rE ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ XÙþÞ¦d e e Mr. Sam Ascott Sam 9209 Sandpiper Lane 21204 410 5558987 410 5556700 275 MM229399098 (¬ Þ e ܤ•Þ„ œÔ£ÝáØáØ ’Þ[Þ €–˜ ä–˜ [Þ ¶ Norman Eaton Friend of Dr. Shultz Removal of #1,16,17 & 32 öÜÝ)Ý Ä d 01 21 21 21 e 101 22099 XÙþÞ¦d e . Mrs. 
Patty Baxter Patty 3838 Tommytrue Court 21234 410 2929290 410 3929209 FM218798127 HAY FEVER Þ . „¤¢Þè   _ÐÍÝBÒBÒ ’ÞÝ €–˜ ä–˜ ÍÝ f Joanne Abbey
Here is a tutorial on how to use BinaryReader for this purpose:
http://dotnetperls.com/binaryreader
You can use Jet OleDB to query .dat files:
var query = "select * from file.dat";
var connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\\file.dat;Extended Properties=\"text;HDR=NO;FMT=FixedLength\"");
See this link:
Code Project. Read Text File (txt, csv, log, tab, fixed length)
And check these:
Reading a sequential-access file
DAT files in C#
Read a file in C#
BinaryReader Class
Jon Skeet. Reading binary data in C#
As said, have a look at BinaryReader:
//Example...
BinaryReader reader = new BinaryReader(stream);
string name = Encoding.ASCII.GetString(reader.ReadBytes(8));
int number = reader.ReadInt32();
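Applied to the layout from the question (the field offsets come from the question; the file name and the sample record are invented so the sketch runs end to end), it might look like this:

```csharp
using System;
using System.IO;
using System.Text;

class Program
{
    static void Main()
    {
        // Build a fake 64-byte record so the sketch is runnable end to end.
        var bytes = new byte[64];
        Encoding.ASCII.GetBytes("Sam").CopyTo(bytes, 21);            // first name at offset 21
        BitConverter.GetBytes(275u).CopyTo(bytes, bytes.Length - 8); // ID 8 bytes before the end
        File.WriteAllBytes("records.dat", bytes);

        using var reader = new BinaryReader(File.OpenRead("records.dat"));

        // First name: ASCIIZ, 15 chars + '\0', at offset 21.
        reader.BaseStream.Seek(21, SeekOrigin.Begin);
        string firstName = Encoding.ASCII.GetString(reader.ReadBytes(16)).TrimEnd('\0');

        // ID: 4-byte unsigned int at offset -8, i.e. relative to the end of the file.
        reader.BaseStream.Seek(-8, SeekOrigin.End);
        uint id = reader.ReadUInt32();

        Console.WriteLine($"{firstName} {id}"); // prints "Sam 275"
    }
}
```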
my .dat file content - " €U§µ­PÕ „ÕG¬u "
using System;
using System.IO;
using System.Linq;
using System.Text;

string fileName = @"W:\yourfilename.dat";
// Read the binary file as a byte array
byte[] bHex = File.ReadAllBytes(fileName);
// Create a string builder for extracting the HEX values
StringBuilder st = new StringBuilder();
// Initialize the counter to 0
int i = 0;
// Reverse the byte array for readability
foreach (char c in bHex.Reverse())
{
    i++;
    // bytes 13 to 20 in the reversed order hold the value of interest, in ticks
    if (i > 12 && i < 21)
        st.Append(Convert.ToInt32(c).ToString("X2"));
}
// Convert HEX to decimal
long Output = Convert.ToInt64(st.ToString(), 16);
// Convert ticks to DateTime
DateTime dt = new DateTime(Output);
// Write the date to the console
Console.Write(dt);
Finally, this decodes the binary content to a date-time.
