Importing Data from a csv file - c#

I have a CSV file.
When I read that file using ReadToEnd(), I get inverted commas and \r characters in many places, which breaks the number of rows in each column.
Is there a way to remove the inverted commas and the \r characters?
I tried to replace them:
FileStream obj = new FileStream();
string a = obj.ReadToEnd();
a.Replace("\"","");
a.Replace("\r\"","");
When I visualize a, all \r characters and inverted commas are removed.
But when I read the file again from the beginning using ReadLine(), they appear again. Why?

First of all, a string is immutable. You might think this is not relevant to your question, but it actually matters whenever you are developing.
Judging by your code snippet, you may not be familiar with immutable objects, so I advise you to make sure you fully understand the concept.
More information regarding immutable objects can be found here: http://en.wikipedia.org/wiki/Immutable_object
Basically, it means a string object can never be modified. Whenever we change the value, the variable points to a new object.
That's why the Replace method returns a value. Its documentation can be found here: https://msdn.microsoft.com/en-us/library/system.string.replace%28v=vs.110%29.aspx and states clearly that it returns a new string in which all occurrences of a specified string in the current instance are replaced with another specified string.
In your example, you aren't using the return value of the Replace function.
Could you show us that the values are actually being replaced in your a variable? I do not believe that is the case. When you visualize a string, carriage returns (\r) are not shown as text; they are rendered as actual carriage returns. If you debug and take a look at the actual string value, you should still see the \r.
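One way to check this without a debugger is to print the string with its control characters made visible. The following is only a minimal sketch; the ControlChars class and its Show helper are hypothetical names introduced for illustration, not part of the original code.

```csharp
using System;

public static class ControlChars
{
    // Hypothetical helper: turns carriage returns and line feeds into the
    // visible two-character escapes \r and \n.
    public static string Show(string text)
    {
        return text.Replace("\r", "\\r").Replace("\n", "\\n");
    }

    public static void Main()
    {
        string a = "\"1\",\"2\"\r\n";

        a.Replace("\r", "");              // return value discarded, a is unchanged
        Console.WriteLine(Show(a));       // the \r is still reported here

        string cleaned = a.Replace("\r", "");
        Console.WriteLine(Show(cleaned)); // the \r is gone in the new string
    }
}
```

Printing Show(a) before and after the assignment makes it easy to see whether a call to Replace actually had any effect on the variable you are inspecting.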
Take the following code snippet:
var someString = "Hello / world";
someString.Replace("/", "");
Console.WriteLine(someString);
You might think that the console will show the string without the slash. However, on this fiddle you can see that it still prints "Hello / world": https://dotnetfiddle.net/cp59i3
What you have to do to correctly use String.Replace can be seen in this fiddle: https://dotnetfiddle.net/XCGtOu
Basically, you want to log the return value of the Replace function:
var a = "Some / Value";
var b = a.Replace("/", "");
Console.WriteLine(b);
Also, as mentioned by others in the comments on your post, you are not replacing the contents of the file, only the string variable in memory.
If you want to save the new string, make sure to use the Write method of the FileStream (or any other way of writing to a file). An explanation can be found here: How to Find And Replace Text In A File With C#
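As a rough sketch of that read, replace, and write-back cycle, the code below uses File.ReadAllText and File.WriteAllText instead of a FileStream. The CsvCleaner class, its method name, and the sample.csv file are assumptions made for the example.

```csharp
using System;
using System.IO;

public static class CsvCleaner
{
    // Reads the whole file, removes double quotes and carriage returns,
    // and writes the cleaned text back to the same path.
    public static void StripQuotesAndCarriageReturns(string path)
    {
        string text = File.ReadAllText(path);
        text = text.Replace("\"", "");   // keep the returned string
        text = text.Replace("\r", "");
        File.WriteAllText(path, text);
    }

    public static void Main()
    {
        // Hypothetical sample file in the temp directory.
        string path = Path.Combine(Path.GetTempPath(), "sample.csv");
        File.WriteAllText(path, "\"1\",\"2\"\r\n\"3\",\"4\"\r\n");

        StripQuotesAndCarriageReturns(path);
        Console.WriteLine(File.ReadAllText(path));
    }
}
```

Because the file itself is rewritten, a later ReadLine() on the same path sees the cleaned text rather than the original contents.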
Apart from everything above, in most cases you should not remove the inverted commas and carriage returns from a file; they are there for a reason. Do so only if you have a specific reason.

At last I succeeded. Thanks to everybody. Here is the code I used.
string a;
// SourcePath stands for the location of the original CSV file.
using(StreamReader csvr = new StreamReader(SourcePath))
{
a = csvr.ReadToEnd();
// Remove \r" sequences first; once the quotes are gone this pattern can no longer match.
a = a.Replace("\r\"","");
a = a.Replace("\"","");
}
using(StreamWriter Wr = new StreamWriter(TempPath))
{
Wr.Write(a);
}
using(StreamReader Sr = new StreamReader(TempPath))
{
Sr.ReadLine();
}
I created a temp path on the system. After this, it was easy to load the data into the database.

Try something like this
StreamReader sReader = new StreamReader("filename");
string a = sReader.ReadToEnd();
// Remove \r" sequences first; once the quotes are gone this pattern can no longer match.
a = a.Replace("\r\"", "");
a = a.Replace("\"", "");
StringReader reader = new StringReader(a);
string inputLine = "";
while ((inputLine = reader.ReadLine()) != null)
{
}

Related

Convert a String, which is already malformed

I have a class which uses another class that reads a text file.
The text file is written in ASCII or, to be precise, CP1252.
Background info: the text file is generated in Axapta and uses the ASCIIio class, which writes the text using the writeRaw method.
The class I am using was written by a colleague, and he uses a C# StreamReader to read files. Normally this works fine because the files are written in UTF-8, but in this particular case they aren't.
So the StreamReader reads the file as UTF-8 and passes the resulting string to me.
I now have some letters, for example the Latin small letter o with diaeresis (ö), which aren't decoded the way I need them to be.
A simple conversion of the string doesn't help in this case, and I can't figure out how to get the right letters.
So this is basically how he reads it:
char quotationChar = '"';
String line = "";
using (StreamReader reader = new StreamReader(fileName))
{
if((line = reader.ReadLine()) != null)
{
line = line.Replace(quotationChar.ToString(), "");
}
}
return line;
What now happens is this: in the text file I have the German word "Röhre" which, after being read with the StreamReader, turns into R�hre (which looks stupid in a database).
I could try to convert every character:
Encoding enc = Encoding.GetEncoding(1252);
byte[] utf8_Bytes = new byte[line.Length];
for (int i = 0; i < line.Length; ++i)
{
utf8_Bytes[i] = (byte)line[i];
}
String propEncodeString = enc.GetString(utf8_Bytes, 0, utf8_Bytes.Length);
That doesn't give me the right character!
byte[] myarr = Encoding.UTF8.GetBytes(line);
String propEncodeString = enc.GetString(myarr);
That also returns the wrong character.
I am aware that I could just solve the problem by using this:
using (StreamReader reader = new StreamReader(fileName, Encoding.Default, true))
But just for fun:
How can I get the right string from an already wrongly decoded string?
Once the bytes have been decoded with the wrong encoding (here, CP1252 bytes read as UTF-8), every byte sequence that isn't valid UTF-8 is replaced with the same replacement character (U+FFFD). That data is simply lost, so you can't 'convert' back to the right character downstream. See this example: https://dotnetfiddle.net/XWysml
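The loss can be demonstrated with two different words that end up as the same string once they are decoded as UTF-8. This is a sketch under stated assumptions: the LostBytes class is a hypothetical name, the byte arrays are written out by hand, and ISO-8859-1 stands in for CP1252 because ö (0xF6) and ü (0xFC) have the same byte values in both.

```csharp
using System;
using System.Text;

public static class LostBytes
{
    // Decodes raw bytes the way a default StreamReader would: as UTF-8.
    public static string DecodeAsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        // "Röhre" and "Rühre" as single-byte text: ö is 0xF6, ü is 0xFC.
        byte[] roehre = { 0x52, 0xF6, 0x68, 0x72, 0x65 };
        byte[] ruehre = { 0x52, 0xFC, 0x68, 0x72, 0x65 };

        string a = DecodeAsUtf8(roehre);
        string b = DecodeAsUtf8(ruehre);

        // Both invalid bytes become U+FFFD, so the two words can no longer
        // be told apart.
        Console.WriteLine(a == b);
        Console.WriteLine(a.IndexOf('\uFFFD') >= 0);

        // Decoding the original bytes with a matching single-byte encoding
        // is the only way to get the letter back.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        Console.WriteLine(latin1.GetString(roehre) == "R\u00F6hre");
    }
}
```

Since two different inputs produce an identical string, no function applied to that string alone can restore the original letter; the fix has to happen where the bytes are read.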

Read second line and save it from txt C#

What I have to do is read only the second line in a .txt file and save it as a string, to use later in the code.
The file name is "SourceSettings.txt". Lines 1 and 2 each contain some words.
For line 1, I have this code:
string Location;
StreamReader reader = new StreamReader("SourceSettings.txt");
{
Location = reader.ReadLine();
}
ofd.InitialDirectory = Location;
And that works great, but how do I make it read only the second line so I can save it as, for example:
string Text;
You can skip the first line by doing nothing with it, so call ReadLine twice:
string secondLine;
using(var reader = new StreamReader("SourceSettings.txt"))
{
reader.ReadLine(); // skip
secondLine = reader.ReadLine();
}
Another way is the File class that has handy methods like ReadLines:
string secondLine = File.ReadLines("SourceSettings.txt").ElementAtOrDefault(1);
Since ReadLines also uses a stream, the whole file need not be loaded into memory before it is processed. Enumerable.ElementAtOrDefault takes only the second line and does not process any further lines. If there are fewer than two lines, the result is null.
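The null result for short files can be seen in a small self-contained sketch. The SecondLineReader class and the temp file names are hypothetical and only serve the example.

```csharp
using System;
using System.IO;
using System.Linq;

public static class SecondLineReader
{
    // Returns the second line of the file, or null when the file has
    // fewer than two lines.
    public static string ReadSecondLine(string path)
    {
        return File.ReadLines(path).ElementAtOrDefault(1);
    }

    public static void Main()
    {
        // Hypothetical sample files in the temp directory.
        string twoLines = Path.Combine(Path.GetTempPath(), "two_lines.txt");
        string oneLine = Path.Combine(Path.GetTempPath(), "one_line.txt");
        File.WriteAllLines(twoLines, new[] { "C:\\Data", "Some text" });
        File.WriteAllLines(oneLine, new[] { "C:\\Data" });

        Console.WriteLine(ReadSecondLine(twoLines));
        Console.WriteLine(ReadSecondLine(oneLine) == null);
    }
}
```

Callers that cannot guarantee a second line should check the result for null before using it.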
Update: I'd advise going with Tim Schmelter's solution.
When you call ReadLine, it moves the caret to the next line. So the second call reads the 2nd line.
string Location;
string Text;
using(var reader = new StreamReader("SourceSettings.txt"))
{
Location = reader.ReadLine(); // this call moves the caret to the beginning of the 2nd line.
Text = reader.ReadLine(); // this call reads the 2nd line from the file
}
ofd.InitialDirectory = Location;
Don't forget about using.
Or here is an example of how to do this via ReadLines of the File class, if you need just one line from the file. But the solution with ElementAtOrDefault is the best one, as Tim Schmelter points out.
var Text = File.ReadLines(@"C:\Projects\info.txt").Skip(1).First();
The ReadLines and ReadAllLines methods differ as follows: When you use
ReadLines, you can start enumerating the collection of strings before
the whole collection is returned; when you use ReadAllLines, you must
wait for the whole array of strings be returned before you can access
the array. Therefore, when you are working with very large files,
ReadLines can be more efficient.
So it doesn't read all lines into memory in comparison with ReadAllLines.
The line could be read using Linq as follows.
var SecondLine = File.ReadAllLines("SourceSettings.txt").Skip(1).FirstOrDefault();
private string GetLine(string filePath, int line)
{
using (var sr = new StreamReader(filePath))
{
for (int i = 1; i < line; i++)
sr.ReadLine();
return sr.ReadLine();
}
}
Hope this will help :)
If you know that your second line is unique because it contains a specific keyword that does not appear anywhere else in your file, you could also use LINQ. The benefit is that the "second" line could be any line in the future.
var myLine = File.ReadLines("SourceSettings.txt")
.Where(line => line.Contains("The Keyword"))
.ToList();

Alternative to File.AppendAllText for newline

I am trying to read characters from a file and then append them in another file after removing the comments (which are followed by semicolon).
sample data from parent file:
Name- Harley Brown ;Name is Harley Brown
Age- 20 ;Age is 20 years
Desired result:
Name- Harley Brown
Age- 20
I am trying the following code-
StreamReader infile = new StreamReader(floc + "G" + line + ".NC0");
while (infile.Peek() != -1)
{
letter = Convert.ToChar(infile.Read());
if (letter == ';')
{
infile.ReadLine();
}
else
{
System.IO.File.AppendAllText(path, Convert.ToString(letter));
}
}
But the output I am getting is-
Name- Harley Brown Age-20
It's because AppendAllText is not working for the newline. Is there any alternative?
Sure, why not use File.AppendAllLines? See documentation here.
Appends lines to a file, and then closes the file. If the specified file does not exist, this method creates a file, writes the specified lines to the file, and then closes the file.
It takes in any IEnumerable<string> and adds every line to the specified file. So it always adds the line on a new line.
Small example:
const string originalFile = @"D:\Temp\file.txt";
const string newFile = @"D:\Temp\newFile.txt";
// Retrieve all lines from the file.
string[] linesFromFile = File.ReadAllLines(originalFile);
List<string> linesToAppend = new List<string>();
foreach (string line in linesFromFile)
{
// 1. Split the line at the semicolon.
// 2. Take the first index, because the first part is your required result.
// 3. Trim the trailing and leading spaces.
string appendAbleLine = line.Split(';').FirstOrDefault().Trim();
// Add the line to the list of lines to append.
linesToAppend.Add(appendAbleLine);
}
// Append all lines to the file.
File.AppendAllLines(newFile, linesToAppend);
Output:
Name- Harley Brown
Age- 20
You could even change the foreach-loop into a LINQ-expression, if you prefer LINQ:
List<string> linesToAppend = linesFromFile.Select(line => line.Split(';').FirstOrDefault().Trim()).ToList();
Why compare char by char when the .NET Framework is full of useful string manipulation functions?
Also, don't call a file write function multiple times when a single call will do; that costs time and resources!
string str = "";
using (StreamReader infile = new StreamReader("file1.txt"))
{
string line;
while ((line = infile.ReadLine()) != null) { // Get every line of the file.
line = line.Split(';')[0].Trim(); // Remove the comment (right of the ;) and surrounding whitespace.
str += line + "\n"; // Add it to our final file contents.
}
}
File.WriteAllText("file2.txt", str); // Write it to the new file.
You could do this with LINQ, System.IO.File.ReadLines(string), and System.IO.File.WriteAllLines(string, IEnumerable<string>). You could also use System.IO.File.AppendAllLines(string, IEnumerable<string>) in a find-and-replace fashion if that was, in fact, the functionality you were going for. The difference, as the names suggest, is whether it writes everything out as a new file or just appends to an existing one.
System.IO.File.WriteAllLines(newPath, System.IO.File.ReadLines(oldPath).Select(c =>
{
int semicolon = c.IndexOf(';');
if (semicolon > -1)
return c.Remove(semicolon);
else
return c;
}));
In case you aren't super familiar with LINQ syntax, the idea here is to loop through each line in the file, and if it contains a semicolon (that is, IndexOf returns something that is over -1) we cut that off, and otherwise, we just return the string. Then we write all of those to the file. The StreamReader equivalent to this would be:
using (StreamReader reader = new StreamReader(oldPath))
using (StreamWriter writer = new StreamWriter(newPath))
{
string line;
while ((line = reader.ReadLine()) != null)
{
int semicolon = line.IndexOf(';');
if (semicolon > -1)
line = line.Remove(semicolon);
writer.WriteLine(line);
}
}
Both versions terminate every line, including the last one, with a newline, so the output file ends with a trailing line break either way.
Another important thing to note: looking at your original file, you might want to add some Trim calls, since it looks like you can have spaces before your semicolons, and I don't imagine you want those copied through.
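Putting the IndexOf approach and the trimming together could look roughly like the sketch below. The CommentStripper class, its method names, and the temp file names are hypothetical; TrimEnd is used on the assumption that only the spaces before the semicolon need to go.

```csharp
using System;
using System.IO;
using System.Linq;

public static class CommentStripper
{
    // Cuts the line at the first semicolon (if any) and drops the
    // whitespace left in front of it.
    public static string StripComment(string line)
    {
        int semicolon = line.IndexOf(';');
        string kept = semicolon > -1 ? line.Remove(semicolon) : line;
        return kept.TrimEnd();
    }

    // Streams the source file line by line and writes the stripped lines
    // to a different file in a single call.
    public static void CopyWithoutComments(string oldPath, string newPath)
    {
        File.WriteAllLines(newPath, File.ReadLines(oldPath).Select(StripComment));
    }

    public static void Main()
    {
        // Hypothetical sample files in the temp directory.
        string oldPath = Path.Combine(Path.GetTempPath(), "parent.txt");
        string newPath = Path.Combine(Path.GetTempPath(), "stripped.txt");
        File.WriteAllLines(oldPath, new[]
        {
            "Name- Harley Brown ;Name is Harley Brown",
            "Age- 20 ;Age is 20 years"
        });

        CopyWithoutComments(oldPath, newPath);
        Console.Write(File.ReadAllText(newPath));
    }
}
```

Note that the original loop lost its line breaks because ReadLine() consumed the rest of the line, newline included, whenever a semicolon was found; working line by line avoids that problem.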

How can I replace an unknown string in a file?

I'm currently writing a library to make reading and writing INI files simple. The reader works, and so does writing when the key and value don't exist yet, but I cannot easily update a value. I have tried various methods to replace the string, but none are practical and the program requires the old value.
Here is an example I have tried from here: Regular Expression to replace unknown value in text file - c# and asp.net
Regex rgx = new Regex(@"SQL-SERVER-VERSION="".*?""");
string result = rgx.Replace(input, replacement);
But every time I modify that code to replace the value it ends up replacing the key instead thus resulting in an error when the app tries to read the file next time.
Here is the code I am using currently:
private string wholedata;
private void UpdateKey(string key, string path, string newval)
{
try
{
using (StreamReader s = new StreamReader(File.Open(path, FileMode.Open)))
{
wholedata = s.ReadToEnd();
Regex rgx = new Regex(key + ".*?");
string result = rgx.Replace(wholedata, newval);
MessageBox.Show(result);
}
using (StreamWriter s = new StreamWriter(File.Open(path, FileMode.OpenOrCreate)))
{
s.Write(wholedata);
}
}
catch (Exception e)
{
}
}
Why rewrite what the system already knows how to do? Take a look at this project: An INI file handling class using C#. You should also be aware that XML has taken a prominent place in the .NET Framework. It's not uncommon to use either the configuration framework or a simple settings object that you serialize/deserialize. I don't know your requirements, but it would help us if you described why you want to use INI files.
That said, you are writing the old value (wholedata), not the new value (result). This may be the root cause.
When you call Regex.Replace (as with string.Replace), it actually generates a new string and does not change the string passed as a parameter.
Change the regex to
Regex rgx = new Regex(@"(?<=SQL-SERVER-VERSION="").*?(?="")");
This will match the .*? only if the prefix and suffix exist (the (?<=) and (?=) groupings).
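Combining both points (write the result, and match only the value) might look like the following sketch. It assumes plain key=value lines without surrounding quotes, which differs from the quoted SQL-SERVER-VERSION example; the IniUpdater class and the ReplaceValue helper are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class IniUpdater
{
    // Replaces everything after "key=" up to the end of that line.
    // The lookbehind keeps the key itself out of the match, and the
    // MatchEvaluator avoids "$" in newValue being read as a substitution.
    public static string ReplaceValue(string content, string key, string newValue)
    {
        string pattern = "(?<=^" + Regex.Escape(key) + "=)[^\\r\\n]*";
        return Regex.Replace(content, pattern, m => newValue, RegexOptions.Multiline);
    }

    // Reads the file, replaces the value, and writes the NEW text back.
    public static void UpdateKey(string key, string path, string newValue)
    {
        string content = File.ReadAllText(path);
        File.WriteAllText(path, ReplaceValue(content, key, newValue));
    }

    public static void Main()
    {
        string ini = "name=demo\r\nversion=old\r\n";
        Console.Write(ReplaceValue(ini, "version", "new"));
    }
}
```

Using File.WriteAllText also sidesteps a side effect of FileMode.OpenOrCreate in the question's code: that mode does not truncate the file, so a shorter replacement would leave stale characters at the end.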

issue with XML encoding

I tried to phrase this as a generic question but realized I don't know enough, so here is the problem I'm having.
Here is a snippet from a console application:
public void Run()
{
Run(Console.Out);
}
public void Run(TextWriter writer)
{
DataTable customers = _quickBooksAdapter.GetTableData("Customer");
customers.WriteXml(writer);
}
Then I run it from the console and use ">" to put it in a file.
c:\> QuickBooksETL extract US > qb_us.xml
If I try to load the result as I normally would:
var x = XDocument.Load("qb_us.xml");
I get the error:
Invalid character in the given encoding. Line 8, position 26.
So I tried to determine what .NET "thinks" it is using:
string path = @"\\ad1\accounting$\Xml\qb_us.xml";
StreamReader sr = new StreamReader(path);
sr.CurrentEncoding.Dump();
Result:
System.Text.UTF8Encoding
BodyName utf-8
EncodingName Unicode (UTF-8)
HeaderName utf-8
WebName utf-8
WindowsCodePage 1200
IsBrowserDisplay True
IsBrowserSave True
IsMailNewsDisplay True
IsMailNewsSave True
IsSingleByte False
EncoderFallback EncoderReplacementFallback
System.Text.EncoderReplacementFallback
DefaultString �
MaxCharCount 1
DecoderFallback DecoderReplacementFallback
System.Text.DecoderReplacementFallback
DefaultString �
MaxCharCount 1
IsReadOnly True
CodePage 65001
Finally, I find by guessing that it works if I just explicitly say it's ASCII:
string path = @"\\ad1\accounting$\Xml\qb_us.xml";
StreamReader sr = new StreamReader(path, Encoding.ASCII);
var x = XDocument.Load(sr);
Any thoughts on where I am going wrong would be greatly appreciated. I admit I have never taken the "deep dive" on character encodings, but I'm willing to put in the effort to get this right.
The simple answer is not to get the console involved. Write directly to the file from your code:
public void Run(string filename)
{
DataTable customers = _quickBooksAdapter.GetTableData("Customer");
customers.WriteXml(filename);
}
or create the TextWriter or Stream yourself and pass that in, e.g.
public void Run(Stream output)
{
DataTable customers = _quickBooksAdapter.GetTableData("Customer");
customers.WriteXml(output);
}
Note that by reading it as ASCII, you'll basically be getting question marks for any non-ASCII character in the original data. IIRC, that's the default behaviour of an encoding when it encounters binary data it can't handle.
Using a Stream it should default to writing out in UTF-8, and the XML declaration and the data within the file should match.
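A round trip through a file stream can be sketched as follows. The XmlExport class, the sample table, and the qb_sample.xml file name are assumptions for the example; the real data would come from the QuickBooks adapter in the question.

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class XmlExport
{
    // Writes the table straight to a file stream, bypassing the console
    // and its output encoding.
    public static void WriteTable(DataTable table, string path)
    {
        using (FileStream output = File.Create(path))
        {
            table.WriteXml(output);
        }
    }

    public static void Main()
    {
        // Hypothetical stand-in for the "Customer" table.
        DataTable customers = new DataTable("Customer");
        customers.Columns.Add("Name", typeof(string));
        customers.Rows.Add("M\u00FCller");

        string path = Path.Combine(Path.GetTempPath(), "qb_sample.xml");
        WriteTable(customers, path);

        // The non-ASCII character should survive the round trip.
        XDocument doc = XDocument.Load(path);
        Console.WriteLine(doc.Descendants("Name").First().Value == "M\u00FCller");
    }
}
```

The command-line tool could take the output file name as an argument and call a method like WriteTable instead of redirecting Console.Out with ">".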
In my experience, if your data includes illegal characters (for example, character 12), the XML doesn't round trip unless you read the XML with an XmlTextReader with Normalization = false. I've been using XmlSerializer.Deserialize(), not XDocument.Load(). Still, you might try calling the Load(XmlReader) overload by passing in an XmlTextReader with Normalization = false.
I would add my voice to Jon's in suggesting that you write to your own stream, not Console.Out.
