I am writing some strings to an Excel file. Sometimes the call to StreamWriter.WriteLine() unexpectedly creates a "Â" character.
Any idea why?
Update
the code:
StreamWriter writer = new StreamWriter(File.Create(outFile));
string headerline = "";
foreach (DataColumn colum in reportContents.Columns)
{
headerline = headerline + '"' + row[colum].ToString() + '"' + ',';
}
writer.WriteLine(headerline);
the output:
Personal Protection |Post-Retirement Savings|Pre-Retirement Pension|Tax & Estate Planning
Expected output: Personal Protection |Post-Retirement Savings|Pre-Retirement Pension|Tax & Estate Planning
I found the solution:
I just need to specify the default encoding in the StreamWriter, as follows, and it works.
StreamWriter writer = new StreamWriter(File.Create(outFile), Encoding.Default);
shuvra
It isn't actually creating a "Â" character; it's just writing data in a different encoding. If you look at the StreamWriter constructor overloads (http://msdn.microsoft.com/en-us/library/system.io.streamwriter.aspx) you can indicate which encoding you want the StreamWriter to write its data in.
In case you haven't dealt with encoding before: Joel wrote a good article about it at http://www.joelonsoftware.com/articles/Unicode.html
The character set of your code should be "utf-8".
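One plausible way a stray "Â" can appear (an assumption on my part, since the question does not show the raw data) is a non-breaking space, U+00A0, in the source text: UTF-8 encodes it as the two bytes 0xC2 0xA0, and a program that reads the file as Latin-1 or Windows-1252 displays 0xC2 as "Â". A minimal sketch of that effect follows; EncodingDemo and its method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Writes text through a StreamWriter with the given encoding
    // and returns the raw bytes that ended up in the stream.
    public static byte[] WriteWith(Encoding encoding, string text)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, encoding))
            {
                writer.Write(text);
            }
            // ToArray still works after the stream has been closed.
            return stream.ToArray();
        }
    }

    // Decodes bytes as ISO-8859-1 (Latin-1), the way a reader
    // that does not expect UTF-8 might interpret them.
    public static string ReadAsLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(bytes);
    }
}
```

Calling WriteWith(new UTF8Encoding(false), "A\u00A0|B") yields five bytes, and ReadAsLatin1 turns them into "A", "Â", a non-breaking space, and "|B". Passing the StreamWriter the encoding that the consuming program expects avoids the mismatch.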
Related
I'm having trouble writing a Tab-delimited File and I've checked around here and have not gotten my answers yet.
So I've got a function that returns the string with the important pieces below (delimiter used and how I build each line):
var delimiter = @"\t";
sb.Append(string.Join(delimiter, itemContent));
sb.Append(Environment.NewLine);
The string returned is like this:
H\t13\t170000000000001\t20150630
D\t1050\t10.0000\tY
D\t1050\t5.0000\tN
And then I write it to a file with this (content below is the string above):
var content = BuildFile(item);
var filePath = tempDirectory + fileName;
// Create the File
using (FileStream fs = File.Create(filePath))
{
Byte[] info = new UTF8Encoding(true).GetBytes(content);
fs.Write(info, 0, info.Length);
}
However, the file output is this with no tabs (opened in notepad++):
H\t13\t170000000000005\t20150630
D\t1050\t20.0000\tN
D\t1050\t2.5000\tY
When it should be more like this (sample file provided):
H 100115980 300010000000003 20150625
D 430181 1 N
D 342130 2 N
D 459961 1 N
Could this be caused by the encoding I used? Appreciate any input you may have, thanks!
With var delimiter = @"\t"; the variable contains a literal \ followed by t. The @ prefix makes the string verbatim, so the backslash is no longer treated as an escape character. In this case you really want
var delimiter = "\t";
to have a tab character.
There is a typo in your code. The @ prefix means that the following string is a verbatim literal, so @"\t" is a two-character string containing the characters \ and t.
You should use "\t" without the prefix.
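To make the difference concrete, here is a minimal sketch comparing a regular literal with a verbatim one (in C# the verbatim prefix is @). DelimiterDemo and BuildLine are hypothetical names used only for this illustration.

```csharp
using System;

public static class DelimiterDemo
{
    // Regular literal: the escape sequence \t becomes a single tab character.
    public const string Tab = "\t";

    // Verbatim literal: the backslash is kept as-is, giving two characters.
    public const string BackslashT = @"\t";

    // Joins the fields of one record with the given delimiter.
    public static string BuildLine(string delimiter, params string[] fields)
    {
        return string.Join(delimiter, fields);
    }
}
```

DelimiterDemo.Tab.Length is 1 and DelimiterDemo.BackslashT.Length is 2, which is why the file in the question contains the visible text \t instead of tabs.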
You should consider using a StreamWriter instead of constructing the entire string in memory and writing the raw bytes though. StreamWriter uses UTF-8 by default and allows you to write formatted lines just as you would with Console.WriteLine:
var delimiter ="\t";
using(var writer=new StreamWriter(filePath))
{
var line=string.Join(delimiter, itemContent);
writer.WriteLine(line);
}
I have a csv file.
When I try to read that file using a FileStream and ReadToEnd(), I get inverted commas and \r in many places, which breaks the number of rows in each column.
Is there a way to remove the inverted commas and \r?
I tried to replace them:
FileStream obj = new FileStream();
string a = obj.ReadToEnd();
a.Replace("\"","");
a.Replace("\r\"","");
When I visualize a, all the \r and inverted commas are removed.
But when I read the file again from the beginning using ReadLine(), they appear again.
First of all, a String is immutable. You might think this is not important for your question, but it actually matters whenever you are developing.
Looking at your code snippet, I suspect you are not familiar with immutable objects, so I advise you to make sure you fully understand the concept.
More information regarding immutable objects can be found here: http://en.wikipedia.org/wiki/Immutable_object
Basically, it means one can never modify a string object. Whenever we change the value, the variable points to a new string object.
That's why the Replace method returns a value. Its documentation can be found here: https://msdn.microsoft.com/en-us/library/system.string.replace%28v=vs.110%29.aspx and states clearly that it "Returns a new string in which all occurrences of a specified string in the current instance are replaced with another specified string."
In your example, you aren't using the return value of the Replace function.
Could you show us that the string values are actually being replaced in your variable a? Because I do not believe this is going to be the case. When you visualize a string, carriage returns (\r) are not shown as text but rendered as an actual carriage return. If you debug and take a look at the actual string value, you should still see the \r.
Take the following code snippet:
var someString = "Hello / world";
someString.Replace("/", "");
Console.WriteLine(someString);
You might think that the console will show "Hello world". However, on this fiddle you can see that it still prints "Hello / world": https://dotnetfiddle.net/cp59i3
What you have to do to correctly use String.Replace can be seen in this fiddle: https://dotnetfiddle.net/XCGtOu
Basically, you want to log the return value of the Replace function:
var a = "Some / Value";
var b = a.Replace("/", "");
Console.WriteLine(b);
Also, as mentioned by others in the comment section of your post, you are not replacing the contents of the file, only the string variable in memory.
If you want to save the new string, make sure to use the Write method of the FileStream (or any other way to write to a file). An explanation can be found here: How to Find And Replace Text In A File With C#
Apart from everything I have said in this answer, in most cases you should not remove both the inverted commas and the carriage returns from a file unless you have a specific reason to; they are there for a reason.
At last I succeeded. Thanks to everybody. Here is the code I used.
// csvPath (the source file) and TempPath (the temporary file) are placeholders assumed to be set elsewhere
string a;
using (StreamReader csvr = new StreamReader(csvPath))
{
    a = csvr.ReadToEnd();
}
// Remove "\r\"" first; once all quotes are gone this pattern can no longer match
a = a.Replace("\r\"", "");
a = a.Replace("\"", "");
using (StreamWriter Wr = new StreamWriter(TempPath))
{
    Wr.Write(a);
}
using (StreamReader Sr = new StreamReader(TempPath))
{
    Sr.ReadLine();
}
I created a temp path on the system. After this, it was easy to load the data into the database.
Try something like this
StreamReader sReader = new StreamReader("filename");
string a = sReader.ReadToEnd();
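// Replace returns a new string, so assign the result back.
// Remove "\r\"" first; once all quotes are gone this pattern can no longer match.
a = a.Replace("\r\"", "");
a = a.Replace("\"", "");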
StringReader reader = new StringReader(a);
string inputLine = "";
while ((inputLine = reader.ReadLine()) != null)
{
}
Alright, so my question is: I'm trying to save a file to a folder on the C: drive. I know how to do this for regular files, like
using (StreamWriter writer = new StreamWriter("c:\\Folder\\TextFile.txt"))
What I've been trying to figure out is how to replace the name of the text file with a variable, so it's more like
using (StreamWriter writer = new StreamWriter("c:\\Folder\\Variablegoeshere.txt"))
Is there any way I can do this?
I apologize for my poor question asking skills.
The StreamWriter constructor, like many other constructors and method calls, takes a string argument. You can pass it any string you like. In your first code sample, you're passing the constructor a "string literal" - an unnamed string variable with a constant value. Instead, you can pass a standard string variable, that you construct beforehand. For instance:
string name = // whatever you like
string path = "c:\\Folder\\" + name + ".txt"; // use '+' to combine strings
using (StreamWriter writer = new StreamWriter(path))
{
    // write to the file here
}
I usually like to use the Path.Combine static method when I concatenate path components. Helps me avoid problems with missing or doubled backslashes:
string path = System.IO.Path.Combine("c:\\Folder", name + ".txt");
And, finally, with the string verbatim modifier, you avoid those ugly double-backslashes, that are otherwise necessary because the backslash is the "escape" character in non-verbatim strings:
string path = System.IO.Path.Combine(@"c:\Folder", name + ".txt");
Here's the Microsoft developer reference page for strings in C#. Worth a read, as is the larger C# language reference.
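Putting the pieces of this answer together, a small helper might look like the sketch below. ReportPaths and BuildPath are names I made up for illustration; only Path.Combine comes from the framework.

```csharp
using System;
using System.IO;

public static class ReportPaths
{
    // Builds "<folder><separator><name>.txt" without worrying about
    // missing or doubled separators between the two parts.
    public static string BuildPath(string folder, string name)
    {
        return Path.Combine(folder, name + ".txt");
    }
}
```

For example, ReportPaths.BuildPath(@"c:\Folder", "report") can be passed straight to the StreamWriter constructor. One thing to keep in mind: if the second argument to Path.Combine is itself a rooted path, the first argument is discarded.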
var inputPath = "c:\\Folder\\TextFile.txt";
var folderPath = Path.GetDirectoryName( inputPath );
using ( var writer = new StreamWriter( Path.Combine( folderPath, "Variablegoeshere.txt" ) ) )
The following is a line from a UTF-8 file from which I am trying to remove the special char (0X0A), which shows up as a black diamond with a question mark below:
2464577 外國法譯評 True s6620178 Unspecified <1>�1009-672
This is generated when SSIS reads a SQL table then writes out, using a flat file mgr set to code page 65001.
When I open the file up in Notepad++, it displays as 0X0A.
I'm looking for some C# code to definitely strip that char out and replace it with either nothing or a blank space.
Here's what I have tried:
string fileLocation = "c:\\MyFile.txt";
var content = string.Empty;
using (StreamReader reader = new System.IO.StreamReader(fileLocation))
{
content = reader.ReadToEnd();
reader.Close();
}
content = content.Replace('\u00A0', ' ');
//also tried: content.Replace((char)0X0A, ' ');
//also tried: content.Replace((char)0X0A, '');
//also tried: content.Replace((char)0X0A, (char)'\0');
Encoding encoding = Encoding.UTF8;
using (FileStream stream = new FileStream(fileLocation, FileMode.Create))
{
using (BinaryWriter writer = new BinaryWriter(stream, encoding))
{
writer.Write(encoding.GetPreamble()); //This is for writing the BOM
writer.Write(content);
}
}
I also tried this code to get the actual string value:
byte[] bytes = { 0x0A };
string text = Encoding.UTF8.GetString(bytes);
And it comes back as "\n". So in the code above I also tried replacing "\n" with " ", both in double quotes and single quotes, but still no change.
At this point I'm out of ideas. Anyone got any advice?
Thanks.
You may want to have a look at regex replacement. For a good example, see the post towards the bottom of this page:
http://social.msdn.microsoft.com/Forums/en-US/1b523d24-dab6-4870-a9ca-5d313d1ee602/invalid-character-returned-from-webservice
You can convert the string to a char array and loop through the array.
Then check what char the black diamond is and just remove it.
string content = "blahblah" + (char)10 + "blahblah";
char find = (char)10;
content = content.Replace(find.ToString(), "");
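As a slightly fuller sketch of the same idea: 0x0A is the line feed character, which C# writes as '\n' or '\u000A'. The '\u00A0' used in the question is a different character (the non-breaking space), so replacing it leaves line feeds untouched. LineFeedCleaner and ReplaceLineFeeds are hypothetical names for this illustration.

```csharp
using System;

public static class LineFeedCleaner
{
    // 0x0A is the line feed, written '\n' or '\u000A' in C#.
    // '\u00A0' is NOT the same character; it is the non-breaking space.
    public const char LineFeed = '\n';

    // Returns a new string with every line feed replaced.
    // Pass "" to strip the character or " " to substitute a blank.
    public static string ReplaceLineFeeds(string content, string replacement)
    {
        return content.Replace(LineFeed.ToString(), replacement);
    }
}
```

Note that this replaces every line feed, including any that terminate rows, so whether it is safe depends on the line endings of the file, which I cannot tell from the question. Also, BinaryWriter.Write(string) prefixes the text with its length, so for a plain text file something like File.WriteAllText(fileLocation, content, new UTF8Encoding(true)) may be a better fit; it writes the BOM and the text without that prefix.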
I am reading a csv file which has following data format
14-Sep-12 ALUMINI 31-Dec-12 117.65 119.25 117.65 118.9 116.75 36
14-Sep-12 ALUMINI 30-Nov-12 116.95 118.65 116.8 118.4 116.5 252
14-Sep-12 ALUMINI 31-Oct-12 116.45 118.15 116.05 117.85 116.05 2802
I am reading this data with following code
List<string> sc = new List<string>();
string filePath = "abc.csv";
string oneLine;
FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.ReadWrite);
StreamReader sr = new StreamReader(fs);
if (fs != null)
{
while ((oneLine = sr.ReadLine()) != null)
{
sc.Add(oneLine);
}
sr.Close();
// Now writing above data in some file , fo and fout are already declared
fo = new FileStream("tempd.txt", FileMode.Append, FileAccess.Write);
fout = new StreamWriter(fo);
foreach (string str in sc)
{
// i am using ' ' as one of my splitter character
char[] splitter = { ' ', ',', '\t' };
string[] sa1 = str.Split(splitter);
string wline = sa1[0] + "," + sa1[1] + "," + sa1[5] + "," + sa1[6] + "," + sa1[7] ;
fout.WriteLine(wline);
}
fout.Close();
}
My biggest problem is that the first column of data, 14-Sep-2012, has been changed to 14 Sep 2012 (the - is missing), which is causing problems in the rest of my application.
Is there any way I can convert the date format while reading and writing the file? I want to store the date 14-Sep-2012 as 2012-9-14.
I think this is the answer you are looking for.
DateTime d = Convert.ToDateTime(sa1[0]);
string wline = d.ToString("yyyy-MM-dd") + "," + sa1[1] + "," + sa1[5] + "," + sa1[6] + "," + sa1[7];
Your question doesn't make a lot of sense, so I'm going to make some assumptions here. First, you say that it's a CSV file. CSV stands for Comma Separated Values, but when you show the example - I see no commas at all. Am I right in thinking you have opened the CSV in Microsoft Excel and copied from there into your post?
If so, then I would ask you to open your original CSV file in Notepad (or another text editor) instead. You will likely see then that your original data does not actually have the dashes.
Basically, when you provide a date as 14 Sep 12 in your CSV file, Excel recognizes that this is a date, but then it formats it with its own default date format, which makes it look like 14-Sep-12 in Excel.
Another thing - you are reading the entire file into a list of strings, and then outputting the entire list back to a new file reformatted. Rather than load all of this in memory, why not just operate one line at a time? Open both your input and output files, read a line from input, manipulate it, and write it to output. Then loop to the next line and close both files when done. You will find this uses much less memory and generally runs faster.
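The one-line-at-a-time approach could be sketched roughly as follows. LineByLine, Process and the transform parameter are names invented for this example; the per-line manipulation is passed in as a delegate so the loop itself stays generic.

```csharp
using System;
using System.IO;

public static class LineByLine
{
    // Reads one line at a time, applies transform to it, and writes the
    // result immediately, so only a single line is held in memory.
    // Returns the number of lines processed.
    public static int Process(TextReader input, TextWriter output, Func<string, string> transform)
    {
        int count = 0;
        string line;
        while ((line = input.ReadLine()) != null)
        {
            output.WriteLine(transform(line));
            count++;
        }
        return count;
    }
}
```

With files, this would be called with a StreamReader on the input file and a StreamWriter on the output file, each wrapped in a using block so both are closed when the loop finishes.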
If you want to reformat the dates, that's easy. Just parse the string into a date. Then control the output of your date with a format string in .ToString(). I believe Geethanga's answer shows this well, but DateTime.Parse() is usually preferred over Convert.ToDateTime().
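A minimal sketch of the parse-then-format step, assuming the first column looks like either 14-Sep-12 or 14 Sep 12 (both forms appear in this thread). DateReformatter and ToIsoDate are hypothetical names, and I use DateTime.ParseExact with the invariant culture here so the result does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

public static class DateReformatter
{
    // Accepted input layouts; these are assumptions based on the sample data.
    private static readonly string[] InputFormats = { "dd-MMM-yy", "dd MMM yy" };

    // Parses a date such as "14-Sep-12" and returns it as "2012-09-14".
    public static string ToIsoDate(string text)
    {
        DateTime date = DateTime.ParseExact(
            text, InputFormats, CultureInfo.InvariantCulture, DateTimeStyles.None);
        return date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }
}
```

If the output should be 2012-9-14 without leading zeros, as asked in the question, the output format string "yyyy-M-d" can be used instead.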