How to prepend a header in a text file - c#

How can I prepend text to the beginning of the existing data in a text file?
Basically, I need to add a header before the data in a text file. The header is dynamic. Please note that the data comes from an external source, an SQL package, or somewhere else. Once the data is in the text file, I want to add a header line above the existing comma-separated entries.
I have sample data in a text file as below:
123,"SAV","CBS123",2010-10-10 00:00:00
456,"CUR","CBS456",2012-02-01 00:00:00
Header text to Prepend:
HDR<TableName><DateTime>
The output I need is as below:
TableName: Account
DateTime: 2012-05-09 12:52:00
HDRAccount2012-05-09 12:52:00
123,"SAV","CBS123",2010-10-10 00:00:00
456,"CUR","CBS456",2012-02-01 00:00:00
Please help me do this in both VB6.0 and C#.NET.

Note that you can't technically 'insert' into a file and have all contents 'shift' down. Best you can do is read the file and rewrite it with a new line. Here's one way to do it efficiently:
static void InsertHeader(string filename, string header)
{
    var tempfile = Path.GetTempFileName();
    using (var writer = new StreamWriter(tempfile))
    using (var reader = new StreamReader(filename))
    {
        writer.WriteLine(header);
        while (!reader.EndOfStream)
            writer.WriteLine(reader.ReadLine());
    }
    File.Copy(tempfile, filename, true);
    File.Delete(tempfile);
}
Credits to this answer for the idea but improved enough to make it worth posting separately.
Now if you want something that accepts the table name and date time, just add this as a second function:
static void InsertTableHeader(string filename, string tableName, DateTime dateTime)
{
    // "mm" is minutes; "MM" would print the month a second time
    InsertHeader(filename,
        String.Format("HDR{0}{1:yyyy-MM-dd HH:mm:ss}",
            tableName,
            dateTime));
}
So just call InsertTableHeader(filename, "Account", DateTime.Now) or similar as needed.
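As a quick check of the header format on its own, here is a minimal sketch. The class and method names (HeaderFormat, Build) are made up for illustration, and it assumes the invariant culture is acceptable so that the separators do not vary with the machine's regional settings.

```csharp
using System;
using System.Globalization;

static class HeaderFormat
{
    // Builds "HDR<TableName><DateTime>".
    // HH = 24-hour clock, mm = minutes, ss = seconds (MM is the month).
    public static string Build(string tableName, DateTime dateTime)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "HDR{0}{1:yyyy-MM-dd HH:mm:ss}", tableName, dateTime);
    }
}
```

HeaderFormat.Build("Account", new DateTime(2012, 5, 9, 12, 52, 0)) returns HDRAccount2012-05-09 12:52:00, which matches the header line asked for in the question.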

var fn = @"c:\temp\log.csv";
var hdr1 = "Account";
var hdr2 = "2012-05-09 12:52:00";
System.IO.File.WriteAllText(fn, System.String.Format("HDR {0} {1}\n{2}", hdr1, hdr2, System.IO.File.ReadAllText(fn)));

String[] headerLines = new String[]{"HDR<TableName><DateTime>"};
String filename = "1.txt";
var newContent = headerLines.Concat(File.ReadAllLines(filename)); // Concat keeps duplicate lines; Union would drop them
File.WriteAllLines(filename, newContent);
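One thing to watch with the LINQ approach: Union is a set operation and silently drops repeated lines, so data files with duplicate rows would lose records. A minimal sketch using Concat instead (PrependLines and WithHeader are illustrative names, not part of any library):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PrependLines
{
    // Returns the header followed by every input line, duplicates included.
    public static List<string> WithHeader(string header, IEnumerable<string> lines)
    {
        return new[] { header }.Concat(lines).ToList();
    }
}
```

For the lines "a", "a", "b" this yields four entries (header, a, a, b), whereas Union would yield three. To apply it to a file, pass File.ReadAllLines(filename) as the lines and hand the result to File.WriteAllLines(filename, ...).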

VB6 translation of yamen's answer. Air code! I haven't compiled this, much less run it!
Sub InsertHeader(ByVal filename As String, ByVal header As String)
Dim tempfile As String
Dim readUnit As Integer
Dim writeUnit As Integer
tempfile = "c:\tempfile" '' TODO generate better temporary filename -
'' here is a link to help with getting path of temporary directory
'' http://vb.mvps.org/samples/SysFolders
readUnit = FreeFile
Open filename For Input As #readUnit
writeUnit = FreeFile
Open tempfile For Output As #writeUnit
Print #writeUnit, header
Do Until Eof(readUnit)
Dim nextLine As String
Line Input #readUnit, nextLine
Print #writeUnit, nextLine
Loop
Close readUnit
Close writeUnit
Kill filename
FileCopy tempfile, filename
Kill tempfile
End Sub

You can do it in the reverse order of the first answer: first write the header to the text file, then open that file in append mode and write the data. To open the file in append mode, use the following code:
FileStream aFile = new FileStream(filePath, FileMode.Append,
FileAccess.Write);
StreamWriter sw = new StreamWriter(aFile);
sw.Write(text);
sw.Close();
aFile.Close();
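A minimal sketch of the header-first idea, assuming you control the moment the data rows are written (which may not hold if an external tool produces the file). HeaderFirst and Write are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class HeaderFirst
{
    // Creates (or overwrites) the file with the header line,
    // then appends the data rows after it.
    public static void Write(string path, string header, IEnumerable<string> dataLines)
    {
        File.WriteAllText(path, header + Environment.NewLine);
        File.AppendAllLines(path, dataLines);
    }
}
```

Because nothing has to be shifted, this avoids the temporary file entirely; the trade-off is that the header must be known before the data arrives.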

Related

Replacing Invalid XML characters from an excel file and writing it back to disk causes file is corrupted error on opening in MS Excel

A little background on problem:
We have an ASP.NET MVC5 Application where we use FlexMonster to show the data in grid. The data source is a stored procedure that brings all the data into the UI grid, and once user clicks on export button, it exports the report to Excel. However, in some cases export to excel is failing.
Some of the data has some invalid characters, and it is not possible/feasible to fix the source as suggested here
My approach so far:
The EPPlus library fails when initializing the workbook because the input Excel file contains some invalid XML characters. I found that the file is dumped with some invalid characters in it, so I looked into the possible approaches.
First, I identified the problematic character in the Excel file. I tried replacing the invalid character with a blank space manually using Notepad++, and EPPlus could then read the file successfully.
Now, using the approaches given in other SO threads here and here, I replaced all possible occurrences of invalid characters. At the moment I am using the
XmlConvert.IsXmlChar
method to find out the problematic XML character and replacing with blank space.
I created a sample program where I am trying to work on the problematic excel sheet.
//in main method
String readFile = File.ReadAllText(filePath);
string content = RemoveInvalidXmlChars(readFile);
File.WriteAllText(filePath, content);
//removal of invalid characters
static string RemoveInvalidXmlChars(string inputText)
{
StringBuilder withoutInvalidXmlCharsBuilder = new StringBuilder();
int firstOccurenceOfRealData = inputText.IndexOf("<t>");
int lastOccurenceOfRealData = inputText.LastIndexOf("</t>");
if (firstOccurenceOfRealData < 0 ||
lastOccurenceOfRealData < 0 ||
firstOccurenceOfRealData > lastOccurenceOfRealData)
return inputText;
withoutInvalidXmlCharsBuilder.Append(inputText.Substring(0, firstOccurenceOfRealData));
int remaining = lastOccurenceOfRealData - firstOccurenceOfRealData;
string textToCheckFor = inputText.Substring(firstOccurenceOfRealData, remaining);
foreach (char c in textToCheckFor)
{
withoutInvalidXmlCharsBuilder.Append((XmlConvert.IsXmlChar(c)) ? c : ' ');
}
withoutInvalidXmlCharsBuilder.Append(inputText.Substring(lastOccurenceOfRealData));
return withoutInvalidXmlCharsBuilder.ToString();
}
If I replace the problematic character manually using Notepad++, the file opens fine in MS Excel. The code above successfully replaces the same invalid character and writes the content back to the file. However, when I try to open the resulting file in MS Excel, it reports that the file may be corrupted and displays no content (snapshots below). Moreover, the following code
var excelPackage = new ExcelPackage(new FileInfo(filePath));
on the file that I updated via Notepad++ throws the following exception
"CRC error: the file being extracted appears to be corrupted. Expected 0x7478AABE, Actual 0xE9191E00"}
My Questions:
Is my approach to modify content this way correct?
If yes, How can I write updated string to an Excel file?
If my approach is wrong then, How can I proceed to get rid of invalid XML chars?
Errors shown on opening file (without invalid XML char):
First Pop up
When I click on yes
Thanks in advance !
It does sound like a binary (presumably XLSX) file based on your last comment. To confirm, open the file created by FlexMonster with 7-Zip. If it opens properly and you see a bunch of XML files in folders, it's an XLSX.
In that case, a search/replace on a binary file sounds like a very bad idea. An XLSX is a zip archive, so reading it as text and writing it back damages the compressed data, which is what the CRC error is telling you. It might work on the XML parts but might also replace legitimate characters in other parts. I think the better approach would be to do as @PanagiotisKanavos suggests and use ZipArchive. But you have to rebuild it in the right order, otherwise Excel complains. Similar to how it was done here https://stackoverflow.com/a/33312038/1324284, you could do something like this:
public static void ReplaceXmlString(this ZipArchive xlsxZip, FileInfo outFile, string oldString, string newstring)
{
using (var outStream = outFile.Open(FileMode.Create, FileAccess.ReadWrite))
using (var copiedzip = new ZipArchive(outStream, ZipArchiveMode.Update))
{
//Go though each file in the zip one by one and copy over to the new file - entries need to be in order
foreach (var entry in xlsxZip.Entries)
{
var newentry = copiedzip.CreateEntry(entry.FullName);
var newstream = newentry.Open();
var orgstream = entry.Open();
//Copy non-xml files over
if (!entry.Name.EndsWith(".xml"))
{
orgstream.CopyTo(newstream);
}
else
{
//Load the xml document to manipulate
var xdoc = new XmlDocument();
xdoc.Load(orgstream);
var xml = xdoc.OuterXml.Replace(oldString, newstring);
xdoc = new XmlDocument();
xdoc.LoadXml(xml);
xdoc.Save(newstream);
}
orgstream.Close();
newstream.Flush();
newstream.Close();
}
}
}
When it is used like this:
[TestMethod]
public void ReplaceXmlTest()
{
var datatable = new DataTable("tblData");
datatable.Columns.AddRange(new[]
{
new DataColumn("Col1", typeof (int)),
new DataColumn("Col2", typeof (int)),
new DataColumn("Col3", typeof (string))
});
for (var i = 0; i < 10; i++)
{
var row = datatable.NewRow();
row[0] = i;
row[1] = i * 10;
row[2] = i % 2 == 0 ? "ABCD" : "AXCD";
datatable.Rows.Add(row);
}
using (var pck = new ExcelPackage())
{
var workbook = pck.Workbook;
var worksheet = workbook.Worksheets.Add("source");
worksheet.Cells.LoadFromDataTable(datatable, true);
worksheet.Tables.Add(worksheet.Cells["A1:C11"], "Table1");
//Now simulate the copy/open of the excel file into a zip archive
using (var orginalzip = new ZipArchive(new MemoryStream(pck.GetAsByteArray()), ZipArchiveMode.Read))
{
var fi = new FileInfo(@"c:\temp\ReplaceXmlTest.xlsx");
if (fi.Exists)
fi.Delete();
orginalzip.ReplaceXmlString(fi, "AXCD", "REPLACED!!");
}
}
}
Gives this: (screenshot of the resulting workbook, not reproduced here)
Just keep in mind that this is completely brute force. Anything you can do to make the file filter smarter, rather than simply processing ALL XML files, would be a very good thing. Maybe limit it to the xl/sharedStrings.xml file, if that is where the problem lies, or to the XML files in the worksheets folder. Hard to say without knowing more about the data.
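For the original goal of stripping invalid XML characters (rather than swapping one known string), a helper along these lines could be applied to the text of an XML part read from the archive, for example via a StreamReader over entry.Open(), before writing it to the new entry. This is only a sketch: XmlSanitizer and ReplaceInvalidXmlChars are made-up names, and it does not touch numeric character references such as &#x1; that may also be present in the markup.

```csharp
using System;
using System.Text;
using System.Xml;

static class XmlSanitizer
{
    // Replaces every character that is not allowed in XML 1.0 text.
    // Valid surrogate pairs (e.g. emoji) are copied through unchanged;
    // a lone surrogate is treated as invalid.
    public static string ReplaceInvalidXmlChars(string text, char replacement = ' ')
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (i + 1 < text.Length && char.IsSurrogatePair(c, text[i + 1]))
            {
                sb.Append(c).Append(text[i + 1]);
                i++; // skip the low surrogate we just copied
            }
            else
            {
                sb.Append(XmlConvert.IsXmlChar(c) ? c : replacement);
            }
        }
        return sb.ToString();
    }
}
```

Running it per XML part inside the zip, instead of over the raw .xlsx bytes, keeps the archive structure and checksums intact.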

Reverting Replace in a text file

I have a template I'm using to print labels. What I'm currently doing is a Replace() on the variable parts of my template, then printing it as is.
What is the best way to recover the original template after printing? Manually revert all the changes? Or leave the template untouched and instead create a copy that I modify, print, and delete?
The template looks like :
data1 : $1
data2 : $2
data3 : $3
data4 : $4
and then Replace() + print with :
string text = File.ReadAllText(filePath);
text = text.Replace("$1", textBoxNumOF.Text);
text = text.Replace("$2", designation);
text = text.Replace("$3", textBoxNumOF.Text.Substring(textBoxNumOF.Text.Length - 4));
text = text.Replace("$4", "1");
File.WriteAllText(filePath, text, UTF8Encoding.UTF8);
PrintDialog pd1 = new PrintDialog();
pd1.PrinterSettings = new PrinterSettings();
EnvoiImpression.SendFileToPrinter(@"Datamax-O'Neil H-4310 (Copie 1)", filePath);
Read your template and write the output which you are sending to the printer into a temp file inside the temp directory of windows.
Please see the following function:
public static string GetTempFile()
{
// get temporary path
var tempPath = Path.GetTempPath();
// get temporary filename
string tempFileName = Path.GetRandomFileName();
//combine
return Path.Combine(tempPath, tempFileName);
}
This way you do not need to revert your template, and you comply with the conventions for temporary files on Windows. I suggest keeping track of the temporary files you create so that you can delete them from disk once your program or method has finished successfully.
The function
EnvoiImpression.SendFileToPrinter(@"Datamax-O'Neil H-4310 (Copie 1)", filePath);
is sadly unknown to me. But perhaps there is also an overload that accepts a Stream? If so, you could fill in your template in a MemoryStream and would not even need to write to disk.
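Putting the temp-file idea together, a minimal sketch might look like the following. LabelTemplate, Fill and PrintFromTempCopy are illustrative names, and the sendToPrinter callback stands in for whatever actually prints the file (for example a lambda that calls EnvoiImpression.SendFileToPrinter), since that API is not known here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LabelTemplate
{
    // Substitutes placeholders such as "$1" with their values.
    // Longer keys go first so that "$1" does not clobber part of "$10".
    public static string Fill(string template, IDictionary<string, string> values)
    {
        foreach (var pair in values.OrderByDescending(p => p.Key.Length))
            template = template.Replace(pair.Key, pair.Value);
        return template;
    }

    // Reads the template, writes the filled-in text to a temp file,
    // hands that file to the printer callback, and always cleans up.
    public static void PrintFromTempCopy(string templatePath,
        IDictionary<string, string> values, Action<string> sendToPrinter)
    {
        string filled = Fill(File.ReadAllText(templatePath), values);
        string tempFile = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        try
        {
            File.WriteAllText(tempFile, filled);
            sendToPrinter(tempFile);
        }
        finally
        {
            File.Delete(tempFile); // no-op if the file was never created
        }
    }
}
```

The template on disk is only ever read, so there is nothing to revert afterwards.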

Append Text at exact location in text

I want to insert text at an exact location in a text file. I have used StreamReader to find the text in the file I am looking for. I thought about using StreamWriter, but that obviously doesn't make sense. I was hoping to find some "append" method in some class somewhere that would help me do this, but with no success. Or is there a better way to do this than to use StreamReader?
using (StreamReader sr = new StreamReader(fileName))
{
string line;
while ((line = sr.ReadLine()) != null)
{
if (line.Contains("VAR_GLOBAL CONSTANT"))
{
//append text before this variable
// e.g. (*VAR_GLOBAL CONSTANT
// append the (* before VAR_GLOBAL CONSTANT
}
if (line.Contains("END_VAR"))
{
//append text after this variable
// e.g. END_VAR*)
// append the *) after END_VAR
}
}
}
Does anyone have any thoughts on how to accomplish this?
One way to do it would be to read the file contents into a string, update the contents locally, and then write it back to the file again. This probably isn't very feasible for really large files, especially if the appending is done at the end, but it's a start:
var filePath = @"f:\public\temp\temp.txt";
var appendBeforeDelim = "VAR_GLOBAL CONSTANT";
var appendAfterDelim = "END_VAR";
var appendBeforeText = "Append this string before some text";
var appendAfterText = "Append this string after some text";
var newFileContents = File.ReadAllText(filePath)
.Replace(appendBeforeDelim, $"{appendBeforeText}{appendBeforeDelim}")
.Replace(appendAfterDelim, $"{appendAfterDelim}{appendAfterText}");
File.WriteAllText(filePath, newFileContents);
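For files too large to hold in memory, the same replacement can be streamed line by line through a second file. The sketch below is one way to do that under a few assumptions: BlockCommenter and CommentOutBlock are made-up names, the temporary file is simply placed next to the original with a ".tmp" suffix, and every occurrence of either marker in the file is changed, not just the first.

```csharp
using System;
using System.IO;

static class BlockCommenter
{
    // Writes "(*" before each start marker and "*)" after each end marker,
    // streaming one line at a time instead of loading the whole file.
    public static void CommentOutBlock(string path, string startMarker, string endMarker)
    {
        string tempPath = path + ".tmp"; // hypothetical temp name beside the original
        using (var writer = new StreamWriter(tempPath))
        {
            foreach (string line in File.ReadLines(path))
            {
                string updated = line
                    .Replace(startMarker, "(*" + startMarker)
                    .Replace(endMarker, endMarker + "*)");
                writer.WriteLine(updated);
            }
        }
        // Swap the rewritten copy into place.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

Calling BlockCommenter.CommentOutBlock(filePath, "VAR_GLOBAL CONSTANT", "END_VAR") would turn those lines into (*VAR_GLOBAL CONSTANT and END_VAR*).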

C# - Appending text files

I have code that reads a file into a string and then writes that string to a new file. Could someone demonstrate how to append this string to the destination file (rather than overwriting it)?
private static void Ignore()
{
System.IO.StreamReader myFile =
new System.IO.StreamReader("c:\\test.txt");
string myString = myFile.ReadToEnd();
myFile.Close();
Console.WriteLine(myString);
// Write the string to a file.
System.IO.StreamWriter file = new System.IO.StreamWriter("c:\\test2.txt");
file.WriteLine(myString);
file.Close();
}
If the file is small, you can read and write in two lines of code.
var myString = File.ReadAllText("c:\\test.txt");
File.AppendAllText("c:\\test2.txt", myString);
If the file is huge, you can read and write line-by-line:
using (var source = new StreamReader("c:\\test.txt"))
using (var destination = File.AppendText("c:\\test2.txt"))
{
    string line;
    // Copy every line, not just the first one
    while ((line = source.ReadLine()) != null)
    {
        destination.WriteLine(line);
    }
}
using (StreamWriter file = File.AppendText(@"c:\test2.txt"))
{
file.WriteLine(myString);
}
Use File.AppendAllText
File.AppendAllText("c:\\test2.txt", myString);
Also to read it, you can use File.ReadAllText to read it. Otherwise use a using statement to Dispose of the stream once you're done with the file.
Try
using (StreamWriter writer = File.AppendText("C:\\test2.txt"))
{
    writer.WriteLine(myString);
}
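If the source may be large, another option is to let File.ReadLines stream the lines lazily into File.AppendAllLines, so the whole file never sits in memory. A minimal sketch (FileAppender and AppendFile are illustrative names); it assumes the source and destination are different files and that normalizing line endings to the platform default is acceptable.

```csharp
using System;
using System.IO;

static class FileAppender
{
    // Appends every line of sourcePath to the end of destinationPath,
    // creating the destination if it does not exist yet.
    public static void AppendFile(string sourcePath, string destinationPath)
    {
        File.AppendAllLines(destinationPath, File.ReadLines(sourcePath));
    }
}
```

Usage would be FileAppender.AppendFile("c:\\test.txt", "c:\\test2.txt").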

C# Streamreader writer (memory issues)

I have a few multimillion-line text files in a directory. I want to read them line by line, replace “|” with “\”, and write each line out to a new file. This code might work just fine but I’m not seeing any resulting text file, or it might be that I’m just being impatient.
{
string startingdir = @"K:\qload";
string dest = @"K:\D\ho\jlg\load\dest";
string[] files = Directory.GetFiles(startingdir, "*.txt");
foreach (string file in files)
{
StringBuilder sb = new StringBuilder();
using (FileStream fs = new FileStream(file, FileMode.Open))
using (StreamReader rdr = new StreamReader(fs))
{
while (!rdr.EndOfStream)
{
string begdocfile = rdr.ReadLine();
string replacementwork = docfile.Replace("|", "\\");
sb.AppendLine(replacementwork);
FileInfo file_info = new FileInfo(file);
string outputfilename = file_info.Name;
using (FileStream fs2 = new FileStream(dest + outputfilename, FileMode.Append))
using (StreamWriter writer = new StreamWriter(fs2))
{
writer.WriteLine(replacementwork);
}
}
}
}
}
DUHHHHH Thanks to everyone.
Id10t error.
Get rid of the StringBuilder, and do not reopen the output file for each line:
string startingdir = @"K:\qload";
string dest = @"K:\D\ho\jlg\load\dest";
string[] files = Directory.GetFiles(startingdir, "*.txt");
foreach (string file in files)
{
var outfile = Path.Combine(dest, Path.GetFileName(file));
using (StreamReader reader = new StreamReader(file))
using (StreamWriter writer = new StreamWriter(outfile))
{
string line = reader.ReadLine();
while (line != null)
{
writer.WriteLine(line.Replace("|", "\\"));
line = reader.ReadLine();
}
}
}
Why are you using a StringBuilder? You are just filling up your memory without doing anything with it.
You should also move the FileStream and StreamWriter using statements outside of your loop. You are re-creating your output streams for every line, causing unneeded IO in the form of opening and closing the file.
Use Path.Combine(dest, outputfilename); from your code it looks like you're writing to the file K:\D\ho\jlg\load\destoutputfilename.txt
This code might work just fine but I’m not seeing any resulting text file, or it might be that I’m just being impatient.
Have you considered having a Console.WriteLine in there to check the progress. Sure, it's going to slow down performance a tiny tiny bit - but you'll know what's going on.
It looks like you might want to do a Path.Combine, so that instead of new FileStream(dest + outputfilename), you have new FileStream(Path.Combine(dest, outputfilename)), which will create the files in the directory that you expect, rather than creating them in K:\D\ho\jlg\load.
However, I'm not sure why you're writing to a StringBuilder that you're not using, or why you're opening and closing the file stream and stream writer on each line that you write. Is that to force the writer to flush its output? If so, it might be easier to just flush the writer/stream on each write.
You're opening and closing the output stream for each line in the output, so you'll have to be very patient!
Open it once, outside the loop.
I guess the problem is here:
string begdocfile = rdr.ReadLine();
string replacementwork = docfile.Replace("|", "\\");
You're reading into the begdocfile variable but replacing characters in docfile, which I guess is empty.
string replacementwork = docfile.Replace("|", "\\");
I believe the above line in your code is incorrect: it should be "begdocfile.Replace ..."?
I suggest you focus on getting as much of the declaration and "name manufacture" out of the inner loop as possible. Right now you are creating new FileInfo objects and path names for every single line you read in every file; that's got to be hugely expensive.
Make a single pass over the list of target files first and create the destination files all at once, perhaps storing them in a List for easy access later, or in a Dictionary<FileInfo, string> where the string is the new file path associated with that FileInfo. Another strategy: just copy the whole directory once, then change the copied files directly, and then rename them, rename the directory, whatever.
move every variable declaration out of that inner loop, and within the using code blocks you can.
I suspect you are going to hear from someone here at more of a "guru level" shortly who might suggest a different strategy based on a more profound knowledge of streams than I have, but that's a guess.
Good luck !
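Pulling the suggestions from this thread together (Path.Combine for the output path, one output file opened per input file, no unused StringBuilder), a compact sketch could look like this. PipeReplacer and ConvertAll are made-up names, and it assumes the source and destination directories are different.

```csharp
using System;
using System.IO;
using System.Linq;

static class PipeReplacer
{
    // For each *.txt file in sourceDir, writes a copy to destDir
    // with every "|" replaced by a backslash.
    public static void ConvertAll(string sourceDir, string destDir)
    {
        Directory.CreateDirectory(destDir); // does nothing if it already exists
        foreach (string file in Directory.GetFiles(sourceDir, "*.txt"))
        {
            string outFile = Path.Combine(destDir, Path.GetFileName(file));
            // ReadLines is lazy, so only one line is held in memory at a time.
            File.WriteAllLines(outFile,
                File.ReadLines(file).Select(line => line.Replace("|", "\\")));
        }
    }
}
```

With the paths from the question, the call would be PipeReplacer.ConvertAll(@"K:\qload", @"K:\D\ho\jlg\load\dest").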
