Concat while looping - C#

I have a method which parses the start and end dates out of XML via XPath.
List<string> GetWorkNodeDate = new List<string>();
List<string> GetWorkNodeEstablisment = new List<string>();
XmlNodeList WorkNodeListDate = XmlResponceDoc.SelectNodes("/response/employees/employee/starttime | /response/employees/employee/endtime");
if (WorkNodeListDate != null)
{
    foreach (XmlNode StartDate in WorkNodeListDate)
    {
        string GetStartDate = StartDate.InnerText;
        GetWorkNodeDate.Add(GetStartDate);
    }
}
The issue I am having is that the node list alternates start and end dates, so every two iterations of the loop produce one start/end pair. I want to concatenate those two values and then store the combined string in the List. I was just wondering what is the best way to achieve this? Thanks for any help you can provide.

I would suggest you use a for loop instead:
for (int i = 0; i + 1 < WorkNodeListDate.Count; i += 2)
{
    // Combine each starttime with the endtime that follows it.
    string startAndStopTime = WorkNodeListDate[i].InnerText + "..." + WorkNodeListDate[i + 1].InnerText;
    GetWorkNodeDate.Add(startAndStopTime);
}
You can access the start and stop times and put them together; here each combined pair is added to GetWorkNodeDate.
The counter is incremented by 2 each time, because that moves i to the next starttime. Also note that the condition checks that i + 1 is still smaller than the count: i + 1 is the matching stoptime, not another starttime, so the loop has to stop once the list runs out of start/stop pairs.
Result would be: starttime...stoptime
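Alternatively, the pairing can be expressed with LINQ (a sketch, assuming the node list always holds matched start/end pairs and that System.Linq is imported):
var combined = WorkNodeListDate.Cast<XmlNode>()
    .Select((node, index) => new { Text = node.InnerText, Index = index })
    .GroupBy(x => x.Index / 2) // group each start node with the end node after it
    .Select(g => string.Join("...", g.Select(x => x.Text)))
    .ToList();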

Related

For loop changes all elements in the array, rather than just one? [duplicate]

I'm using a for loop to read the lines of a file one by one. I then use the data in each line of the file to change the properties of the objects in an array.
I've run through the code in Debug mode, and it all seems to run fine. It reads the line of the file which corresponds to the i value of the for loop correctly, it defines an array of this data based on the commas, and it creates a temporary object which stores these values in the correct format. I know all of this because, as I've said, I have checked the values in Debug mode.
However, the last line inside the loop seems to change every element in the array to the values stored in scrOldScore, whereas I want it to keep the values read in from previous lines and just update the element of the array corresponding to the i of the for loop.
With each iteration of the for loop, the array holds identical data in each element that isn't null, and with each iteration that data changes to the most recently defined scrOldScore.
string str;
Data d = new Data();
for (int i = 0; i < arr.Length; i++)
{
    str = File.ReadLines(FileName).Skip(i).Take(1).First();
    string[] fields = new string[4];
    fields = str.Split(',');
    d.Property01 = fields[0];
    d.Property02 = Convert.ToInt32(fields[1]);
    d.Property03 = fields[2];
    d.Property04 = fields[3];
    arr[i] = d;
}
Thanks for any help :)
The code doesn't work because it's putting the same ScoreData object instance in every array item. You need to create a new ScoreData object instance inside the loop.
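The minimal fix, then, is to construct the instance inside the loop, keeping the rest of the code as it was (a sketch using the Data/Property names from the question's code):
for (int i = 0; i < arr.Length; i++)
{
    string str = File.ReadLines(FileName).Skip(i).Take(1).First();
    string[] fields = str.Split(',');
    Data d = new Data(); // fresh instance per iteration, so array elements no longer share one object
    d.Property01 = fields[0];
    d.Property02 = Convert.ToInt32(fields[1]);
    d.Property03 = fields[2];
    d.Property04 = fields[3];
    arr[i] = d;
}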
But the original code also re-opens the file and reads a little further into it on every iteration of the loop, which is grossly inefficient. You can make this run in a fraction of the time by keeping the same file handle, which you can do by using the file, rather than the array, as the driver of the loop:
int i = 0;
var lines = File.ReadLines(FileName);
foreach (var line in lines)
{
    var data = line.Split(',');
    ScrScores[i] = new ScoreData() {
        Name = data[0],
        Score = int.Parse(data[1]),
        Accuracy = data[2],
        ReactionTime = data[3]
    };
    i++;
}
Of course, there's a chance here the file might be larger than the array. If that's possible, you should probably be using a List<ScoreData> rather than an array anyway:
var ScrScores = new List<ScoreData>();
var lines = File.ReadLines(FileName);
foreach (var line in lines)
{
    var data = line.Split(',');
    ScrScores.Add(new ScoreData() {
        Name = data[0],
        Score = int.Parse(data[1]),
        Accuracy = data[2],
        ReactionTime = data[3]
    });
}
Failing both of these, I would open a StreamReader object, rather than using File.ReadLines(), and call its ReadLine() method in each for loop iteration.
using (var rdr = new StreamReader(FileName))
{
    for (int i = 0; i < ScrScores.Length; i++)
    {
        var data = rdr.ReadLine().Split(',');
        ScrScores[i] = new ScoreData() {
            Name = data[0],
            Score = int.Parse(data[1]),
            Accuracy = data[2],
            ReactionTime = data[3]
        };
    }
}
One last thing we can do, since it looks like we're replacing all the elements in an array of known size, is replace the entire array. To do this we can read exactly that many items from the file and project them into ScoreData objects as we go:
ScrScores = File.ReadLines(FileName)
    .Select(line => line.Split(','))
    .Select(data => new ScoreData() {
        Name = data[0],
        Score = int.Parse(data[1]),
        Accuracy = data[2],
        ReactionTime = data[3]
    })
    .Take(ScrScores.Length)
    .ToArray();
Technically, this is one statement: there's only one semicolon, and it could all be written on a single line.
I'll add that Hungarian-notation variable prefixes like str and arr are a hold-over from the VB6 era. Today, tooling has improved, and even Microsoft's own coding guidelines (where the practice originated) now specifically say, in bold type no less, "Do not use Hungarian notation". All the code examples I provided reflect this recommendation.

Incremental counting and saving all values in one string

I'm having trouble thinking of a logical way to achieve this. I have a method which sends a web request inside a for loop counting up from 1 to x; it keeps counting up until it finds a specific response, and then sends the URL plus the number to another method.
After this, say we got the number 5, I need to create a string which displays as "1,2,3,4,5", but I cannot seem to find a way to build the entire string; everything I try simply replaces the string and keeps only the last number.
string unionMod = string.Empty;
for (int i = 1; i <= count; i++)
{
    unionMod =+ count + ",";
}
I assumed I'd be able to simply add each value onto the end of the string, but the output is just "5,", the last number. I have looked around but can't even think of what I would search for to get the answer. I have a hard-coded solution, but ideally I'd rather not keep 30+ hard-coded strings, one per possible value, and instead just have the string created when needed.
Any pointers?
P.S: Any coding examples are appreciated but I've probably just forgotten something obvious so any directions you can give are much appreciated, I should sleep but I'm on one of those all-night coding grinds.
Thank you!
First of all, your problem is that you wrote =+ instead of += (and appended count instead of i). Beyond that, you should avoid concatenating strings in a loop, because every concatenation allocates a new string. Use a StringBuilder instead.
Your Example: https://dotnetfiddle.net/Widget/qQIqWx
My Example: https://dotnetfiddle.net/Widget/sx7cxq
public static void Main()
{
    var counter = 5;
    var sb = new StringBuilder();
    for (var i = 1; i <= counter; ++i) {
        sb.Append(i);
        if (i != counter) {
            sb.Append(",");
        }
    }
    Console.WriteLine(sb);
}
As has been pointed out, you should use += instead of =+. The latter parses as unionMod = +count + ",", i.e. "assign count followed by a comma", which explains the incorrect "5," you experienced.
You could also simplify your code like this:
int count = 10;
string unionMod = String.Join(",", Enumerable.Range(1, count));
Enumerable.Range(start, count) generates count sequential integers starting at start, and String.Join concatenates them with the given separator.
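For example (requires using System.Linq):
int count = 5;
Console.WriteLine(String.Join(",", Enumerable.Range(1, count))); // prints 1,2,3,4,5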

Have multiple timeout (DateTime) values per row in a C# DataTable

I have a DataTable with multiple TimeStamp (DateTime) columns per row. I want a timeout value so that once a TimeStamp falls behind DateTime.Now - timeoutValue it gets nulled, and when all TimeStamp values in a row are nulled, the row is deleted.
It's currently implemented with timers and loops. It's starting to get very laggy with many entries; is there a more automated, efficient way? Expressions or something? Here are snippets of my code:
public ReadsList(object _readers)
{
    // _readers = list of things that add to the DataTable
    dataTable = new DataTable();
    Timeout = 5;
    aTimer = new System.Timers.Timer(5000);
    aTimer.Elapsed += new ElapsedEventHandler(UpdateReads);
    aTimer.Enabled = true;
}
public void Add(object add)
{
    // Checks if the object exists; updates its TimeStamp if so, else adds a new row
}
private void UpdateReads(object source, ElapsedEventArgs e)
{
    // Clean DataTable
    foreach (DataRow row in dataTable.Rows.Cast<DataRow>().ToList())
    {
        int p = 0;
        foreach (var i in _readers)
        {
            p += i.Value;
            for (int b = 1; b <= i.Value; b++)
            {
                if (row[(i.Key + ":" + b)] != DBNull.Value)
                {
                    if (Timeout == 0)
                        Timeout = 99999;
                    if (DateTime.Parse(row[(i.Key + ":" + b)].ToString()) <
                        DateTime.UtcNow.AddSeconds(-1 * Timeout))
                    {
                        row[(i.Key + ":" + b)] = DBNull.Value;
                    }
                }
                else
                {
                    p -= 1;
                }
            }
        }
        // Remove row if empty
        if (p == 0)
        {
            row.Delete();
            //readCount -= 1;
        }
    }
    dataTable.AcceptChanges();
    OnChanged(EventArgs.Empty);
}
Here are a couple of ideas for minor improvements which may add up to a significant improvement:
You're building the column key (i.Key + ":" + b) more than once. Build it once within your inner loop and stick it in a variable.
You are reading the column (row[(i.Key + ":" + b)]) more than once. Read it once and stick it in a variable so that you can use it multiple times without having to incur the hash table lookup each time.
You are adjusting the timeout (if (Timeout == 0) Timeout = 99999;) more than once. Adjust it once at the beginning of the method.
You are calculating the timeout DateTime (DateTime.UtcNow.AddSeconds(-1*Timeout)) more than once. Calculate it once at the beginning of the method.
You are always looking up column values by string. If you can store the column ordinals somewhere and use those instead, you'll get better performance. Just make sure you look up the column ordinals once at the beginning of the method, not inside either of the foreaches.
You are parsing strings into DateTimes. If you can store DateTimes in the DataTable, you wouldn't have to parse each time. A sketch applying these points follows.
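Putting those suggestions together, the cleanup loop might look like this (a sketch; the column-key scheme and the shape of _readers are taken from the question's code):
if (Timeout == 0)
    Timeout = 99999;                                        // adjust once, before the loops
DateTime cutoff = DateTime.UtcNow.AddSeconds(-1 * Timeout); // compute once, before the loops
foreach (DataRow row in dataTable.Rows.Cast<DataRow>().ToList())
{
    int p = 0;
    foreach (var i in _readers)
    {
        p += i.Value;
        for (int b = 1; b <= i.Value; b++)
        {
            string key = i.Key + ":" + b; // build the column key once
            object cell = row[key];       // read the cell once
            if (cell != DBNull.Value)
            {
                if (DateTime.Parse(cell.ToString()) < cutoff)
                    row[key] = DBNull.Value;
            }
            else
            {
                p -= 1;
            }
        }
    }
    if (p == 0)
        row.Delete();
}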
There are a few things you can do here to increase the speed. First off, DataTables are meant to pull data from a database and are not really high-end collections; in general, generic Lists are about 4x faster than DataTables and use significantly less memory. Also, your biggest time cost comes from the DateTime.Parse right in the middle of your third loop, and from performing the DateTime calculation for your expiration time inside the loop. It doesn't appear the expiration time is based on the record's original value, so you should definitely generate it once before the loop starts.
I would recommend creating a data type for your record format, which would allow you to store the record's dates as DateTime objects, basically paying the conversion cost once when you first initialize the list rather than parsing every time through. Using a List to store the data, you could then simply do something like:
var cutOffTime = DateTime.UtcNow.AddSeconds(-99999); // Do this once to save creating it a billion times.
var totalRecords = AllRecords.Count;                 // Do this so the value is not re-evaluated.
for (var i = totalRecords - 1; i >= 0; i--)          // Iterate backwards so RemoveAt doesn't skip elements.
{
    var rec = AllRecords[i];
    if (rec.TimeThingy1 < cutOffTime && rec.TimeThingy2 < cutOffTime && rec.TimeThingy3 < cutOffTime)
    {
        AllRecords.RemoveAt(i); // You could instead collect these and remove them all at the end, as removing from a list mid-iteration is costly.
    }
}

C# Best way to parse flat file with dynamic number of fields per row

I have a flat file that is pipe delimited and looks something like this as example
ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3
The first two columns are set and will always be there.
* denotes a count of how many repeating fields follow it, so Note1, Note2, Note3 here.
** denotes a count of how many times a block of fields is repeated; there are always 3 fields in a block.
This is per row, so each row may have a different number of fields.
Hope that makes sense so far.
I'm trying to find the best way to parse this file, any suggestions would be great.
The goal at the end is to map all these fields into a few different files - data transformation. I'm actually doing all this within SSIS, but I figured the default components won't be good enough, so I need to write my own code.
UPDATE: I'm essentially trying to read this as a source file, do some lookups and string manipulation on some of the fields in between, and spit out several different files, as in any normal file-to-file transformation SSIS package.
Using the above example, I may want to create a new file that ends up looking like this
"ColA","HardcodedString","Note1CRLFNote2CRLF","ColB"
And then another file
Row1: "ColA","A1","A2","A3"
Row2: "ColA","B1","B2","B3"
So I guess I'm after some ideas on how to parse this as well as storing the data in either Stacks or Lists or?? to play with and spit out later.
One possibility would be to use a stack. First you split the line by the pipes.
var stack = new Stack<string>(line.Split('|').Reverse()); // Reverse so that Pop() returns fields left-to-right (requires System.Linq)
Then you pop the first two from the stack to get them out of the way.
stack.Pop();
stack.Pop();
Then you parse the next element: 3*. For that you pop the next 3 items from the stack. With 2** you pop the next 2 x 3 = 6 items from the stack, and so on. You can stop as soon as the stack is empty.
while (stack.Count > 0)
{
    // Parse elements like 3*
}
Hope this is clear enough.
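Fleshed out, the parsing loop might look like this (a sketch; it assumes every count marker ends in * or **, as in the example line):
string line = "ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3";
var stack = new Stack<string>(line.Split('|').Reverse());
string colA = stack.Pop();
string colB = stack.Pop();
var notes = new List<string>();
var blocks = new List<string[]>();
while (stack.Count > 0)
{
    string marker = stack.Pop();
    int n = int.Parse(marker.TrimEnd('*'));
    if (marker.EndsWith("**")) // n blocks of 3 fields each
    {
        for (int i = 0; i < n; i++)
            blocks.Add(new[] { stack.Pop(), stack.Pop(), stack.Pop() });
    }
    else                       // n single repeating fields
    {
        for (int i = 0; i < n; i++)
            notes.Add(stack.Pop());
    }
}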
Something similar to below should work (this is untested)
// Example line: ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3
string[] columns = line.Split('|');
List<string> repeatingColumnNames = new List<string>();
List<List<string>> repeatingFieldValues = new List<List<string>>();
if (columns.Length > 2)
{
    // columns[2] holds the "3*" marker: the number of repeating fields that follow.
    int repeatingFieldCount = int.Parse(columns[2].TrimEnd('*'));
    int repeatingFieldStartIndex = 3;
    for (int i = 0; i < repeatingFieldCount; i++)
    {
        repeatingColumnNames.Add(columns[repeatingFieldStartIndex + i]);
    }
    // The next column holds the "2**" marker: the number of 3-field blocks that follow.
    int repeatingFieldSetCountIndex = repeatingFieldStartIndex + repeatingFieldCount;
    int repeatingFieldSetCount = int.Parse(columns[repeatingFieldSetCountIndex].TrimEnd('*'));
    int repeatingFieldSetStartIndex = repeatingFieldSetCountIndex + 1;
    const int fieldsPerSet = 3;
    for (int i = 0; i < repeatingFieldSetCount; i++)
    {
        string[] fieldSet = new string[fieldsPerSet];
        for (int j = 0; j < fieldsPerSet; j++)
        {
            fieldSet[j] = columns[repeatingFieldSetStartIndex + (i * fieldsPerSet) + j];
        }
        repeatingFieldValues.Add(new List<string>(fieldSet));
    }
}
System.IO.File.ReadAllLines("File.txt").Select(line => line.Split(new[] {'|'}))

How to improve the performance of my custom function for getting fast results?

I'm using Lucene.NET to implement a numerical search engine.
I want to filter numbers within a large range, depending on which numbers exist in a string array.
I used the following code:
int startValue = 1;
int endValue = 100000;
// Assume that the following string array contains 12000 strings
String[] ArrayOfTerms = new String[] { "1", "10", /* ... */ "99995" };
public String[] GetFilteredStrings(String[] ArrayOfTerms)
{
    List<String> filteredStrings = new List<String>();
    for (int i = startValue; i <= endValue; i++)
    {
        int index = Array.IndexOf(ArrayOfTerms, i.ToString());
        if (index != -1)
        {
            filteredStrings.Add((String)ArrayOfTerms.GetValue(index));
        }
    }
    return filteredStrings.ToArray();
}
Now, my problem is that it searches every value from 1 to 100000 and takes too much time; sometimes my application hangs.
Can anyone help me improve this performance issue? I don't know much about caching, but I know that Lucene supports cache filters. Should I use a cache filter? Thanks in advance.
In fact, you're trying to determine whether an array contains an item or not.
I think you should use something like a HashSet or Dictionary so you can test for the presence of a value in O(1) time instead of the O(n) time you have now.
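For example (a sketch reusing the question's variable names):
// Build the set once; each Contains check is then O(1) on average.
var terms = new HashSet<string>(ArrayOfTerms);
var filteredStrings = new List<string>();
for (int i = startValue; i <= endValue; i++)
{
    string candidate = i.ToString();
    if (terms.Contains(candidate))
        filteredStrings.Add(candidate);
}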
This code runs much faster, since it makes one pass over the 12000 terms instead of probing the array for each of the 100000 candidate values:
var results = ArrayOfTerms.Where(s => int.Parse(s) >= startValue && int.Parse(s) <= endValue);
That is, if I got what you want to do.
