I'm trying to edit some data in a file with C# in Visual Studio. I've tried both
StreamReader and File.ReadAllLines / File.ReadAllText.
Both approaches give me 3414 lines of content (with ReadAllText I just used Split('\n') afterwards). But when I run the following command on Linux I get these results:
cat phase1_promoter_data_PtoP1.txt | wc
Output:
184829 164686174 1101177922
So about 185,000 lines and 165 million words. A word count in Visual Studio gives me about 19 million.
So my question is: am I reading the file wrong, or does Visual Studio have a limit on how much data it will read at once? My file takes up about 1 GB of space.
Here's the code I use:
try
{
using (StreamReader sr = new StreamReader("phase1_promoter_data_PtoP1.txt"))
{
String line = sr.ReadToEnd();
Console.WriteLine(line);
String[,] data = new String[184829, 891];
//List<String> data2 = new List<String>();
string[] lol = line.Split('\n');
for (int i = 0; i < lol.Length; i++)
{
String[] oneLine = lol[i].Split('\t');
//List<String> singleLine = new List<String>(lol[i].Split('\t'));
for (int j = 0; j < oneLine.Length; j++)
{
//Console.WriteLine(i + " - " + lol.Length + " - " + j + " - " + oneLine.Length);
data[i,j] = oneLine[j];
}
}
Console.WriteLine(data[3413,0]);
}
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
The file in your dropbox contains 6043 lines.
Both
Debug.Print(File.ReadAllLines(fPath).Count().ToString());
And
Debug.Print(File.ReadAllText(fPath).Split('\n').Count().ToString());
show the same result (using VS 2013, .NET 4.5).
I was able to cycle through each line with:
using (var sr = new StreamReader(fPath))
{
while (!sr.EndOfStream)
{
Debug.Print(sr.ReadLine());
}
}
And
foreach(string line in File.ReadAllLines(fPath))
{
Debug.Print(line);
}
Instead of reading the entire file into a string at once, try one of the loops above and build an array as you cycle through.
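To make the suggestion above concrete, here is a minimal sketch that streams the file and collects the tab-separated fields row by row. It assumes the tab delimiter from the question; the class and method names (TabFileLoader, LoadRows) are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;

public static class TabFileLoader
{
    // Streams the file one line at a time instead of loading it as one
    // huge string, and splits each line on tabs.
    // TabFileLoader and LoadRows are illustrative names, not from the question.
    public static List<string[]> LoadRows(string path)
    {
        var rows = new List<string[]>();
        foreach (string line in File.ReadLines(path))
        {
            rows.Add(line.Split('\t'));
        }
        return rows;
    }
}
```

A List<string[]> grows as needed, so the row and column counts do not have to be hard-coded the way they are in new String[184829, 891].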
I have Lua scripts that use variables such as:
VERSION_LOCALE = "1.0"
MAX_MONSTERS = 5
FORBIDDEN_MONSTERS = {2827}
I would like to make these variables externally configurable using a simple C# program:
Load the Lua script from a dialog
Overwrite the file with the modified variables (textbox)
The actual values retrieved from the Lua script (in our example, MAX_MONSTERS) should be shown in the textbox.
What is the appropriate way to achieve this? Here is what I have tried without success:
https://stackoverflow.com/a/27219221/18756404
Here's what you need to do:
Read the file contents using System.IO.File.ReadAllText()
Split the contents into an array of lines using String.Split()
Parse the value using String.Substring() and show the value in the text box
Read the value from the text box and update the array
Regenerate the new string using String.Join()
Write the new string to the file using System.IO.File.WriteAllText()
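The steps above could be sketched roughly like this. It assumes each variable sits on its own line in the form NAME = value, with no "local" prefix and no trailing comment; LuaConfigEditor, GetValue and SetValue are hypothetical names.

```csharp
public static class LuaConfigEditor
{
    // Returns the text to the right of '=' on the first line whose left-hand
    // side equals the given name, or null if there is no such line.
    // Assumption: one simple "NAME = value" assignment per line.
    public static string GetValue(string[] lines, string name)
    {
        foreach (string line in lines)
        {
            int eq = line.IndexOf('=');
            if (eq > 0 && line.Substring(0, eq).Trim() == name)
            {
                return line.Substring(eq + 1).Trim();
            }
        }
        return null;
    }

    // Rewrites the first matching line in place; returns false if the
    // variable was not found.
    public static bool SetValue(string[] lines, string name, string newValue)
    {
        for (int i = 0; i < lines.Length; i++)
        {
            int eq = lines[i].IndexOf('=');
            if (eq > 0 && lines[i].Substring(0, eq).Trim() == name)
            {
                lines[i] = name + " = " + newValue;
                return true;
            }
        }
        return false;
    }
}
```

In the form, you would get the array from File.ReadAllText(path).Split('\n'), put GetValue(lines, "MAX_MONSTERS") into the text box, and on save call SetValue followed by File.WriteAllText(path, string.Join("\n", lines)).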
You are correct, @Shameel.
I used the code below to read the information:
var lines = File.ReadAllLines(luaPath);
foreach (var line in lines)
{
if (line.Contains("MAX_MONSTERS"))
{
Utilis.LogInfo("VARIABLE NAME: " + line.Split('=')[0].Trim());
Utilis.LogInfo("VARIABLE VALUE: " + line.Split('=')[1].Trim());
Console.WriteLine("------------------------------------------------------------------");
TxtMaxMonsters.Text = line.Split('=')[1].Trim();
TxtMaxMonsters.Enabled = true;
}
if (line.Contains("MIN_MONSTERS"))
{
Utilis.LogInfo("VARIABLE NAME: " + line.Split('=')[0].Trim());
Utilis.LogInfo("VARIABLE VALUE: " + line.Split('=')[1].Trim());
Console.WriteLine("------------------------------------------------------------------");
TxtMinMonsters.Text = line.Split('=')[1].Trim();
TxtMinMonsters.Enabled = true;
}
}
And I used the code below to save the modifications:
var lines = File.ReadAllLines(luaPath);
for (var i = 0; i < lines.Length; i++)
{
var line = lines[i];
if (line.Contains("MAX_MONSTERS"))
{
lines[i] = line.Replace(line.Split('=')[1].Trim(), TxtMaxMonsters.Text);
}
if (line.Contains("MIN_MONSTERS"))
{
lines[i] = line.Replace(line.Split('=')[1].Trim(), TxtMinMonsters.Text);
}
if (line.Contains("ITEMS_TO_DEPOSIT = {"))
{
lines[i] = line.Replace(line.Split('{')[1].Split('}')[0].Trim(), TxtItemsToDepositID.Text);
}
}
File.WriteAllLines(luaPath, lines);
I have written a very simple program in C#, using a NuGet package, that reads in two CSV files, fuzzy matches them, and outputs a new CSV file with all the matches. The problem is that I need the program to be able to read a file of up to 700k records and compare it to one of 100k. I haven't been able to find a way to speed up the process. Is there any way I can do this? I will even use another language if need be.
You can ignore all the commented-out code; it's just left over from testing. Sorry, I'm a newer programmer.
The ReadCSV function reads in the CSV. The rest is code inside another function, where I pass in the string arrays to run them through the fuzzy match.
static string[] ReadCSV(string path)
{
List<string> name = new List<string>();
List<string> address = new List<string>();
List<string> city = new List<string>();
List<string> state = new List<string>();
List<string> zip = new List<string>();
using (var reader = new StreamReader(path))
{
reader.ReadLine();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
name.Add(values[0] +", "+ values[1]);
//address.Add(values[1]);
//city.Add(values[2]);
//state.Add(values[3]);
//zip.Add(values[4]);
}
}
string[] name1 = name.ToArray();
return name1;
//foreach (var item in name)
//{
// Console.WriteLine(item.ToString());
//}
}
StringBuilder csvcontent = new StringBuilder();
string csvpath = @"C:\Users\bigel\Documents\outputtest.csv";
csvcontent.AppendLine("Name,Address,Match");
//Console.WriteLine("Levenshtein Edit Distance:");
int x = 1;
foreach (var name in string1)
{
for (int i = 0; i < length; i++)
{
int leven = match[i].LevenshteinDistance(name);
//Console.WriteLine(match[i] + "\t{0} against {1}", leven, name);
if (leven <= 7)
{
output[i] = input[i] + ",match";
csvcontent.AppendLine(output[i]);
//Console.WriteLine(match[i] + " " + leven + " against " + name + " is a Match");
//Console.WriteLine(output[i]);
}
else
{
if (i == 500)
{
Console.WriteLine(x);
x++;
}
}
}
}
File.AppendAllText(csvpath, csvcontent.ToString());
I was practicing writing to a file using C#.
My code is not working (nothing gets written to the file):
{
int T, N; //T = testCase , N = number of dice in any Test
int index = 0, straight;
List<string> nDiceFaceValues = new List<string>(); //List for Dice Faces
string line = null; //string to read line from file
string[] lineValues = {}; //array of string to split string line values
string InputFilePath = @"E:\Visual Studio 2017\CodeJam_Dice Straight\A-small-practice.in"; //path of input file
string OuputFilePath = @"E:\Visual Studio 2017\CodeJam_Dice Straight\A-small-practice.out"; //path of output file
StreamReader InputFile = new StreamReader(InputFilePath);
StreamWriter Outputfile = new StreamWriter(OuputFilePath);
T = Int32.Parse(InputFile.ReadLine()); //test cases input
Console.WriteLine("Test Cases : {0}", T);
while (index < T) {
N = Int32.Parse(InputFile.ReadLine());
for (int i = 0; i < N; i++) {
line = InputFile.ReadLine();
lineValues = line.Split(' ');
foreach(string j in lineValues)
{
nDiceFaceValues.Add(j);
}
}
straight = ArrangeDiceINStraight(nDiceFaceValues);
Console.WriteLine("case: {0} , {1}", ++index, straight);
Outputfile.WriteLine("case: {0} , {1}", index, straight);
nDiceFaceValues.Clear();
}
}
What is wrong with this code, and how do I fix it?
Note: I want to write to the file line by line.
What's missing is: closing things down - flushing the buffers, etc:
using(var outputfile = new StreamWriter(ouputFilePath)) {
outputfile.WriteLine("case: {0} , {1}", index, straight);
}
However, if you're going to do that for every line, File.AppendText may be more convenient.
In particular, note that new StreamWriter will be overwriting by default, so you'd also need to account for that:
using(var outputfile = new StreamWriter(ouputFilePath, true)) {
outputfile.WriteLine("case: {0} , {1}", index, straight);
}
The true here is for append.
If you have opened a file for concurrent read/write, you could also try just adding outputfile.Flush();, but... it isn't guaranteed to do anything.
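If all the lines are produced in one loop, another option is to keep a single writer open around the whole loop, so the file is created once and flushed when the using block ends. A minimal sketch, assuming the results are already available as a list; ResultWriter and WriteResults are illustrative names:

```csharp
using System.Collections.Generic;
using System.IO;

public static class ResultWriter
{
    // Opens the output file once, writes one line per result, and relies on
    // the using block to flush and close the writer, even if an exception
    // is thrown part-way through.
    // ResultWriter and WriteResults are hypothetical names for illustration.
    public static void WriteResults(string outputPath, IReadOnlyList<int> results)
    {
        using (var writer = new StreamWriter(outputPath))
        {
            for (int i = 0; i < results.Count; i++)
            {
                writer.WriteLine("case: {0} , {1}", i + 1, results[i]);
            }
        }
    }
}
```

With this shape there is no need for the append flag, because the file is only opened once per run.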
This is the code that I've written so far.
It doesn't do the job; it just rewrites every line to the same file over and over again.
*RecordCntPerFile = 10K
*FileNumberName = 1 (file number one)
*The full file name should look something like this: 1_asci_split
string FileFullPath = DestinationFolder + "\\" + FileNumberName + FileNamePart + FileExtension;
using (System.IO.StreamReader sr = new System.IO.StreamReader(SourceFolder + "\\" + SourceFileName))
{
for (int i = 0; i <= (RecordCntPerFile - 1); i++)
{
using (StreamWriter sw = new StreamWriter(FileFullPath))
{
{ sw.Write(sr.Read() + "\n"); }
}
}
FileNumberName++;
}
Dts.TaskResult = (int)ScriptResults.Success;
}
If I understood correctly, you want to split a big file into smaller files with a maximum of 10k lines each. I see 2 problems in your code:
You never change the FileFullPath variable, so you will always write to the same file.
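You create a new StreamWriter on every loop iteration, which overwrites the target file each time, and sr.Read() returns the code of a single character rather than a line of text.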
I rewrote your code to fit the behavior I described above. You just have to modify the strings.
int maxRecordsPerFile = 10000;
int currentFile = 1;
using (StreamReader sr = new StreamReader("source.txt"))
{
int currentLineCount = 0;
List<string> content = new List<string>();
while (!sr.EndOfStream)
{
content.Add(sr.ReadLine());
if (++currentLineCount == maxRecordsPerFile || sr.EndOfStream)
{
using (StreamWriter sw = new StreamWriter(string.Format("file{0}.txt", currentFile)))
{
foreach (var line in content)
sw.WriteLine(line);
}
content = new List<string>();
currentFile++;
currentLineCount = 0;
}
}
}
Of course you can do better than that, as you don't need to build that list and then loop over it with foreach. I just made this quick example to give you the idea. Improving the performance is up to you.
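One way to skip the intermediate list is to write each line straight to the current output file and switch files when the limit is reached. This is only a sketch under the same assumptions as the example above; FileSplitter, SplitByLines and the targetPattern parameter are names I made up.

```csharp
using System.IO;

public static class FileSplitter
{
    // Copies sourcePath into numbered files of at most maxLinesPerFile lines
    // each, writing every line as soon as it is read.
    // targetPattern is a format string such as "file{0}.txt" (hypothetical).
    // Returns the number of files written.
    public static int SplitByLines(string sourcePath, string targetPattern, int maxLinesPerFile)
    {
        int fileNumber = 0;
        int linesInCurrentFile = 0;
        StreamWriter writer = null;
        try
        {
            foreach (string line in File.ReadLines(sourcePath))
            {
                // Start a new output file for the first line and whenever
                // the current file is full.
                if (writer == null || linesInCurrentFile == maxLinesPerFile)
                {
                    writer?.Dispose();
                    fileNumber++;
                    writer = new StreamWriter(string.Format(targetPattern, fileNumber));
                    linesInCurrentFile = 0;
                }
                writer.WriteLine(line);
                linesInCurrentFile++;
            }
        }
        finally
        {
            // Flush and close the last file, even if reading failed.
            writer?.Dispose();
        }
        return fileNumber;
    }
}
```

A call like FileSplitter.SplitByLines("source.txt", "file{0}.txt", 10000) would then mirror the file names used above.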
I have a C# script which takes in two CSV files as input, combines the two files, performs numerous calculations on them, and writes the result in a new CSV file.
These two input CSV file names are declared as variables and are used in the C# script by accessing those variable names.
The data in the input CSV files looks like this:
Since the data has values in the thousands and millions, the line splits in the C# code are breaking the data up incorrectly. For instance, a value of 11,861 appears only as 11, and 861 goes into the next column.
Is there any way in C# to specify a text qualifier (" in this case) for the two files?
Here is the C# code snippet:
string[,] filesToProcess = new string[2, 2] { {(String)Dts.Variables["csvFileNameUSD"].Value,"USD" }, {(String)Dts.Variables["csvFileNameCAD"].Value,"CAD" } };
string headline = "CustType,CategoryType,CategoryValue,DataType,Stock QTY,Stock Value,Floor QTY,Floor Value,Order Count,Currency";
string outPutFile = Dts.Variables["outputFile"].Value.ToString();
//Declare Output files to write to
FileStream sw = new System.IO.FileStream(outPutFile, System.IO.FileMode.Create);
StreamWriter w = new StreamWriter(sw);
w.WriteLine(headline);
//Loop Through the files one by one and write to output Files
for (int x = 0; x < filesToProcess.GetLength(0); x++) // dimension 0 indexes the files
{
if (System.IO.File.Exists(filesToProcess[x, 0]))
{
string categoryType = "";
string custType = "";
string dataType = "";
string categoryValue = "";
//Read the input file in memory and close after done
StreamReader sr = new StreamReader(filesToProcess[x, 0]);
string fileText = sr.ReadToEnd();
string[] lines = fileText.Split(Convert.ToString(System.Environment.NewLine).ToCharArray());
sr.Close();
where csvFileNameUSD and csvFileNameCAD are variables with values pointing to their locations.
Well, based on your answers to my questions, this ought to do what you want:
public void SomeMethodInYourCodeSnippet()
{
string[] lines;
using (StreamReader sr = new StreamReader(filesToProcess[x, 0]))
{
//Read the input file in memory and close after done
string fileText = sr.ReadToEnd();
lines = fileText.Split(System.Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries); // drop the empty entries produced between \r and \n
sr.Close(); // redundant due to using, but just to be safe...
}
foreach (var line in lines)
{
string[] columnValues = GetColumnValuesFromLine(line);
// Do whatever with your column values here...
}
}
private string[] GetColumnValuesFromLine(string line)
{
// Split on ","
var values = line.Split(new string [] {"\",\""}, StringSplitOptions.None);
if (values.Count() > 0)
{
// Trim leading double quote from first value
var firstValue = values[0];
if (firstValue.Length > 0)
values[0] = firstValue.Substring(1);
// Trim the trailing double quote from the last value
var lastValue = values[values.Length - 1];
if (lastValue.Length > 0)
values[values.Length - 1] = lastValue.Substring(0, lastValue.Length - 1);
}
return values;
}
Give that a try and let me know how it works!
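If some fields are not quoted, or a quoted field contains an escaped quote (""), splitting on "," alone will not be enough. Here is a small quote-aware splitter as an alternative sketch; it assumes comma separators, " as the text qualifier, and no line breaks inside fields. QuotedCsv and SplitLine are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuotedCsv
{
    // Splits one CSV line on commas, treating commas inside double quotes
    // as part of the field and "" inside quotes as a literal quote.
    // Assumption: fields never span multiple lines.
    public static string[] SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing qualifier
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening qualifier
            }
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());
        return fields.ToArray();
    }
}
```

For a made-up line such as "Retail","11,861",42 this returns the three fields Retail, 11,861 and 42, so the thousands separator stays inside its column.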
You posted a very similar-looking question a few days ago. Did that solution not help you?
If not, what issues are you facing with it? We can probably help you troubleshoot that as well.