Comparing excel sheet to text file - c#

I have the following data from an excel sheet:
06:07:00 6:07
Data1
Data2
Data3
Data4
06:15:00 06:15
Data5
Data6
Data7
Data8
I want to compare this to the following data from text file:
XXXXXXXXXX 06:08:32 13.0 Data1
XXXXXXXXXX 06:08:45 6.0 Data2
xxxxxxxxxx 06:08:51 5.0 Data3
xxxxxxxxxx 06:08:56 13.0 Data4
xxxxxxxxxx 06:13:44 9.0 Data5
xxxxxxxxxx 06:13:53 11.0 Data6
xxxxxxxxxx 06:14:04 6.0 Data7
xxxxxxxxxx 06:14:10 13.0 Data8
I want to use the time to match the two files (Excel against text), but the time is different for each group: group 1 is Data1 to Data4 and group 2 is Data5 to Data8.
Does anyone have an idea how to approach this?
EDIT1:
Here is what I tried to do:
private void doTest(string time)
{
TimeSpan ts = TimeSpan.Parse(time);
int hours = ts.Hours;
int min = ts.Minutes;
int sec = ts.Seconds;
int minstart, minend;
string str;
minstart = min - 5;
minend = min + 5;
while (min != minend)
{
sec = sec + 1;
if (sec < 60)
{
if (hours < 10)
str = hours.ToString().PadLeft(2, '0');
else str = hours.ToString();
if (minstart < 10)
str = str + minstart.ToString().PadLeft(2, '0');
else str = str + minstart.ToString();
if (sec < 10)
str = str + sec.ToString().PadLeft(2, '0');
else str = str + sec.ToString();
chkwithtext(str);
}
else if (sec == 60)
{
sec = 00;
min = min + 1;
str = hours.ToString() + min.ToString() + sec.ToString();
chkwithtext(str);
}
}
}
private void chkwithtext(string str)
{
// check with the text file here if time doesn't match go
// back increment the time with 1sec and then check here again
}

It's not precisely clear how you are 'comparing' the times, but for this answer I'll make the assumption that data from the text file is to be compared if, and only if, its timestamp is within x minutes (defaulting to x = 5) of the Excel timestamp.
My recommendation would be to use an Excel add-in called Schematiq for this - you can download this (approx. 9MB) from http://schematiq.htilabs.com/ (see screenshots below). It's free for personal, non-commercial use. (Disclaimer: I work for HTI Labs, the authors of Schematiq.)
However, I'd do the time handling in Excel. First we'll calculate the start/stop limits for the Excel timestamps. For example, for the first time (06:07:00) we want the range 6:02-6:12. We'll also break the actual, 'start' and 'end' times into hours, minutes and seconds for ease later on. The Excel data sheet looks like this:
Next we need a Schematiq 'template function' which will take the start and end times and return us a range of times. This template is shown here:
The input values to this function are effectively 'dummy' values - the function is compiled internally by Schematiq and can then be called with whatever inputs are required. The 'Result' cell contains text starting with '~#...' (and likewise several of the previous cells) - this indicates a Schematiq data-link containing a table, function or other structure. To view it, you can click the cell and look in the Schematiq Viewer which appears as a task pane within Excel like this:
In other words, Schematiq allows you to hold an entire table of data within a single cell.
Now everything is set up, we simply import the text file and get Schematiq to do the work for us. For each 'time group' within the Excel data, a suitable range of times is generated and this is matched against the text file. You are returned all matching data, plus any unmatched data from both Excel and the text file. The necessary calculations are shown here:
Your Excel worksheet is therefore tiny, and clicking on the final cell will display the final results in the Schematiq Viewer. The results, including the Excel data and the 'template calculation', are shown here:
To be clear, what you see in this screenshot is the entire contents of the workbook - there are no other calculations taking place anywhere other than in the actual cells you see.
The 'final results' themselves are shown enlarged here:
This is exactly the comparison you're after (with a deliberately introduced error - Data9 - in the text file, to demonstrate the matching). You can then carry out whatever comparisons or further analysis you need to.
All of the data-links represent the use of Schematiq functions - the syntax is very similar to Excel and therefore easy to pick up. As an example, the call in the final cell is:
=tbl.SelectColumns(D21, {"Data","Text file"}, TRUE)
This selects all columns from the Schematiq table in cell D21 apart from the 'Data' and 'Text file' columns (the final Boolean argument to this function indicates 'all but').
I'd recommend downloading Schematiq and trying this for yourself - I'd be very happy to email you a copy of the workbook I've put together, so it should just run immediately.

I'm not sure I understand what you mean, but I'd start by exporting the Excel file to CSV with a ; separator, which is much easier to work with. Then add a simple container class:
public class DataTimeContainer
{
public string Data;
public string TimeValue1 = string.Empty;
public string TimeValue2 = string.Empty;
}
And use it this way:
//Processing first file
List<DataTimeContainer> Container1 = new List<DataTimeContainer>();
string[] lines = File.ReadAllLines("c:\\data1.csv");
string groupTimeValue1 = string.Empty;
string groupTimeValue2 = string.Empty;
foreach (string[] fields in lines.Select(l => l.Split(';')))
{
    //iterating over every line, split by the ';' delimiter
    if (!string.IsNullOrWhiteSpace(fields[0]))
    {
        //we're in a line having both values, like:
        //06:07:00 ; 6:07
        groupTimeValue1 = fields[0];
        groupTimeValue2 = fields[1];
    }
    else
    {
        //we're in a line looking like this:
        // ; DataX
        Container1.Add(new DataTimeContainer() { Data = fields[1], TimeValue1 = groupTimeValue1, TimeValue2 = groupTimeValue2 });
    }
}
//Processing second file
List<DataTimeContainer> Container2 = new List<DataTimeContainer>();
lines = File.ReadAllLines("c:\\data2.txt");
//the sample text file is space-delimited, so split on ' ' here
foreach (string[] fields in lines.Select(l => l.Split(' ')))
{
    Container2.Add(new DataTimeContainer() { TimeValue1 = fields[1], TimeValue2 = fields[2], Data = fields[3] });
}
DoSomeComparison();
Of course I'm using strings as data types because I do not know what kind of objects they're supposed to be. Let me know how's that working for you.

If this is a one-time comparison, I would recommend just pulling the text file into Excel (using the Text-to-Columns tools if needed) and running a comparison there with the built-in functions.
If however you need to do this frequently, something like Tarec suggested would be a good start. It seems like you're trying to compare separate event logs within a given timespan (?) - your life will be easier if you parse to objects with DateTime properties instead of comparing text strings.
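As a rough illustration of parsing to typed values instead of comparing strings, here is a minimal sketch. It assumes the text file lines are space-delimited exactly as in the question's sample (id, time, number, data) and that "matching" means the text-file time lies within some tolerance of the Excel group time. The names LogEntry, TimeWindowMatcher, ParseLine and WithinWindow are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical container for one line of the text file.
class LogEntry
{
    public TimeSpan Time;
    public string Data;
}

static class TimeWindowMatcher
{
    // Assumes a line shaped like "XXXXXXXXXX 06:08:32 13.0 Data1".
    public static LogEntry ParseLine(string line)
    {
        string[] f = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return new LogEntry
        {
            Time = TimeSpan.Parse(f[1], CultureInfo.InvariantCulture),
            Data = f[3]
        };
    }

    // Returns the entries whose time is within 'tolerance' of 'groupTime' (before or after).
    public static List<LogEntry> WithinWindow(IEnumerable<LogEntry> entries, TimeSpan groupTime, TimeSpan tolerance)
    {
        return entries
            .Where(e => (e.Time - groupTime).Duration() <= tolerance)
            .ToList();
    }
}
```

With the question's data, calling WithinWindow with a group time of 06:07:00 and a tolerance of five minutes would pick up the 06:08:xx lines but not the 06:13:xx ones. No per-second loop is needed.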

Populate the data from your two sources (Excel and the text file) into two lists.
Make sure the lists are of the same type.
I would recommend converting the Excel data to a text file format, then reading each line of both files into a List<string>.
You can then compare the lists using LINQ or the Enumerable methods.
Quickest way to compare two List<>
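For instance, a set difference with Enumerable.Except shows which items appear in one list but not the other. This is only a sketch: ListComparer and MissingFrom are illustrative names, and it assumes both lists have already been reduced to comparable strings such as "Data1".

```csharp
using System.Collections.Generic;
using System.Linq;

static class ListComparer
{
    // Items present in 'first' but absent from 'second'.
    // Except is a set operation, so duplicates in 'first' are collapsed.
    public static List<string> MissingFrom(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Except(second).ToList();
    }
}
```

Calling it once in each direction gives the unmatched items from both sources.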

Related

Write the data to specific line C#

I have some data and I want to write them to a specific line in notepad using C#.
For example I have two textboxes and the data inside them are "123 Hello", for textBox1, and "565878 Hello2" for textBox2.
When I press SAVE button, those data will be saved into one file but with different line. I want to save the first data in the first line and the second data in the third line.
How can I do this?
This question is too broad. The simple answer is that you write the two lines to a file, but write a newline (either "\r\n" or Environment.NewLine) between each string. That will put the two strings on different lines. If you want the second string on the third line, then you should write two newlines between each string.
If neither of those are the answer, then you need to be a lot more specific about why not. Is the file empty to start with? What have you tried? Where, specifically, are you getting stuck? What platform?
And I really don't see what this has to do with NotePad.
EDIT:
You have clarified that you are starting with an existing text file and want to replace the content at the specified lines.
This is a more complex thing to do, and may be beyond your skills if you are just starting out. The basic approach is this:
Assuming you can read the entire file into memory, load the file into a string. You will have to parse new lines to find the lines you want to replace. You can then just replace those parts of the string with the new data. When finished, write the file back to disk.
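A minimal sketch of that approach, assuming the file fits in memory. It reads the file as an array of lines rather than one string, which avoids parsing newlines by hand. LineWriter and SetLine are hypothetical names, and line numbers are taken to be 1-based.

```csharp
using System.Collections.Generic;
using System.IO;

static class LineWriter
{
    // Replaces the 1-based line 'lineNumber' in the file at 'path',
    // padding with empty lines if the file is shorter (or does not exist yet).
    public static void SetLine(string path, int lineNumber, string text)
    {
        var lines = File.Exists(path)
            ? new List<string>(File.ReadAllLines(path))
            : new List<string>();

        while (lines.Count < lineNumber)
        {
            lines.Add(string.Empty);
        }

        lines[lineNumber - 1] = text;
        File.WriteAllLines(path, lines);
    }
}
```

For the question's example this would be called twice: once with line 1 and the first textbox's text, once with line 3 and the second's.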
If the file is too big to load into memory, then it becomes much more complex. I'm sorry, but since you've done such a poor job of describing the issue, I'm not going to the trouble of going over the details for this case. And such a task probably falls outside the scope of a Stack Overflow answer anyway.
If your line numbers are not fixed, you can do something like the following:
using System;
using System.IO;

class Program
{
    private static void Main()
    {
        var data = "";
        const string data1 = "Data1";//First Data
        const string data2 = "Data2";//Second Data
        const int line1 = 1;//First Data Line
        const int line2 = 3;//Second Data Line
        var maxNoOfLines = Math.Max(line1, line2);
        for (var i = 1; i <= maxNoOfLines; i++)
        {
            if (i == line1)
            {
                data += data1 + Environment.NewLine;
            }
            else if (i == line2)
            {
                data += data2 + Environment.NewLine;
            }
            else
            {
                //any other line is left empty
                data += Environment.NewLine;
            }
        }
        File.WriteAllText(@"C:\NOBACKUP\test.txt", data);
    }
}
Otherwise, if the line numbers are fixed, it is much simpler: you can remove the loop from the code above and hardcode the values.

Resolving File Name Permutations

I am attempting to import a .CSV file into my database which is a table export from an image management system. This system allows end-users to take images and (sometimes) split them into multiple images. There is a column in this report that signifies the file name of the image that I am tracking. If items are split in the image management system, the file name receives an underscore ("_") on the report. The previous file name is not kept. The way the items can possibly exist on the CSV are shown below:
Report 1 @ 8:00AM: ABC.PNG
Report 2 @ 8:30AM: ABC_1.PNG
                   ABC_2.PNG
Report 3 @ 9:00AM: ABC_1_1.PNG
                   ABC_1_2.PNG
                   ABC_2_1.PNG
                   ABC_2_2.PNG
Report 4 @ 9:30AM: ABC_1_1_1.PNG
                   ABC_1_1_2.PNG
                   ABC_1_2.PNG
                   ABC_2_1.PNG
                   ABC_2_2.PNG
I am importing each file name into its own record. When an item is split, I would like to identify the previous version and update the original record, then add the new split record into my database. The key to knowing if an item is split is locating an underscore ("_").
I am not sure how to recreate the previous child names: I have to test every previous iteration of the file name to see if it exists. My problem is interpreting the current state of the file name and rebuilding all previous possibilities. I do not need the original name, only every name from the first possible split name up to the current name. The code below shows roughly what I am getting at, but I am not sure how to do this cleanly.
String[] splitName = theStringToSplit.Split('_');
for (int i = 1; i < splitName.Length - 1; i++)
{
//should concat everything between 0 and i, not just 0 and I
//not sure if this is the best way or what I should do
MessageBox.Show(splitName[0] + "_" + splitName[i] + ".PNG");
}
What you are looking for is part of a string.
string.Join() can help here: it joins an array into a delimited string.
One overload also takes a start index and the number of items to use.
string[] s = new string[] { "2", "a", "b" };
string joined = string.Join("_", s, 0, 3);
// joined will be "2_a_b"
Maybe you are using the wrong tool for your problem. If you want to keep the last "_", you may want to use LastIndexOf() or even regular expressions. Either way, you should not rip names apart and re-glue them unnecessarily. If you do, do it in a culture-invariant way rather than a culture-specific one (cultures can differ in how they interpret "-" or the lower-case form of "I").
string fnwithExt = "Abc_12_23.png";
string fn = System.IO.Path.GetFileName(fnwithExt);
int indexOf = fn.LastIndexOf('_');
// Substring(start, length): the length up to the last '_' is indexOf, not indexOf - 1
string part1 = fn.Substring(0, indexOf);    // "Abc_12"
string part2 = fn.Substring(indexOf + 1);   // "23.png"
string part3 = System.IO.Path.GetExtension(fnwithExt);  // ".png"
string original = System.IO.Path.ChangeExtension(part1 + "_" + part2, part3);  // "Abc_12_23.png"
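To connect the string.Join overload mentioned in this answer with the question's actual goal, here is a sketch that rebuilds every earlier split name of a file. It assumes the original base name (ABC in the question) never contains an underscore itself. SplitNames and PreviousNames are made-up names for this example.

```csharp
using System.Collections.Generic;
using System.IO;

static class SplitNames
{
    // For "ABC_1_1_2.PNG" this yields "ABC_1.PNG" and "ABC_1_1.PNG":
    // every split name before the current one, excluding the un-split original.
    public static List<string> PreviousNames(string fileName)
    {
        string ext = Path.GetExtension(fileName);
        string[] parts = Path.GetFileNameWithoutExtension(fileName).Split('_');

        var result = new List<string>();
        // count = 2 joins the base name with its first split index;
        // stop before parts.Length so the current name is not included.
        for (int count = 2; count < parts.Length; count++)
        {
            result.Add(string.Join("_", parts, 0, count) + ext);
        }
        return result;
    }
}
```

Each returned name can then be looked up in the database to find the record that the new split replaces.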

how to get text file rows with no delimiter into array

I have a text file that I'm trying to input into an array called columns.
Each row in the text file belongs to a different attribute in a sub-class I have created.
For example, row 2 in my text file is a date that I would like to pass over...I do not want to use the Split because I do not have a delimiter but I do not know an alternative. I am not fully understanding the below if someone could help. When I try to run it, it says that columns[1] is out of its range...Thank you.
StreamReader textIn =
new StreamReader(
new FileStream(path, FileMode.OpenOrCreate, FileAccess.Read));
//create the list
List<Event> events = new List<Event>();
while (textIn.Peek() != -1)
{
string row = textIn.ReadLine();
string[] columns = row.Split(' ');
Event special = new Event();
special.Day = Convert.ToInt32(columns[0]);
special.Time = Convert.ToDateTime(columns[1]);
special.Price = Convert.ToDouble(columns[2]);
special.StrEvent = columns[3];
special.Description = columns[4];
events.Add(special);
}
Input file sample:
1
8:00 PM
25.00
Beethoven's 9th Symphony
Listen to the ninth and final masterpiece by Ludwig van Beethoven.
2
6:00 PM
15.00
Baseball Game
Come watch the championship team play their archrival--No work stoppages, guaranteed.
Well, one way to do it (though it is a bit ugly) would be to use File.ReadAllLines, and then loop through the array, something like this:
string[] lines = File.ReadAllLines(path);
int index = 0;
// each event occupies five consecutive lines, so stop when fewer than five remain
while (index + 4 < lines.Length)
{
    Event special = new Event();
    special.Day = Convert.ToInt32(lines[index]);
    special.Time = Convert.ToDateTime(lines[index + 1]);
    special.Price = Convert.ToDouble(lines[index + 2]);
    special.StrEvent = lines[index + 3];
    special.Description = lines[index + 4];
    events.Add(special);
    index += 5;  // advance to the first line of the next event
}
This is very brittle code - a lot can break it. What if one of the events is missing a line? What if there are multiple blank lines in it? What if one of the Convert.Toxxx methods throws an error?
If you have the option to change the format of the file, I strongly recommend you make it delimited at least. If you can't change the format, you'll need to make the code sample above more robust so that it can handle blank lines, failed conversions, missing lines, etc.
Much, much, much easier to use a delimited file. Even easier to use an XML or JSON file.
Delimited File (CSV)
Let's say you have the same sample input, but this time it's a CSV file, like this:
1,8:00 PM,25.00,"Beethoven's 9th Symphony","Listen to the ninth and final masterpiece by Ludwig van Beethoven."
2,6:00 PM,15.00,"Baseball Game","Come watch the championship team play their archrival--No work stoppages, guaranteed"
I put quotes on the last two items in case there's ever a comma in there, it won't break the parsing.
For CSV files, I like to use the Microsoft.VisualBasic.FileIO.TextFieldParser class, which despite its name can be used in C#. Don't forget to add a reference to Microsoft.VisualBasic and a using directive (using Microsoft.VisualBasic.FileIO;).
The following code will allow you to parse the above CSV sample:
using (TextFieldParser parser = new TextFieldParser(path))
{
    parser.Delimiters = new string[] {","};
    // FieldType is the enum in Microsoft.VisualBasic.FileIO
    parser.TextFieldType = FieldType.Delimited;
    parser.HasFieldsEnclosedInQuotes = true;
    string[] parsedLine;
    while (!parser.EndOfData)
    {
        parsedLine = parser.ReadFields();
        Event special = new Event();
        special.Day = Convert.ToInt32(parsedLine[0]);
        special.Time = Convert.ToDateTime(parsedLine[1]);
        special.Price = Convert.ToDouble(parsedLine[2]);
        special.StrEvent = parsedLine[3];
        special.Description = parsedLine[4];
        events.Add(special);
    }
}
This still has some issues though: you would need to handle cases where there are missing fields, and I would recommend using the TryParse methods instead of Convert.Toxxx. Still, it's a little easier (I think) than the non-delimited sample.
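As a sketch of what the TryParse suggestion might look like, the helper below converts one row of five fields and reports failure instead of throwing. The Event class here is a stand-in with the property names used in the question; EventParser and TryParseEvent are hypothetical names. Parsing with the invariant culture is an assumption about the file's format.

```csharp
using System;
using System.Globalization;

// Stand-in for the question's Event class (assumed shape).
class Event
{
    public int Day;
    public DateTime Time;
    public double Price;
    public string StrEvent;
    public string Description;
}

static class EventParser
{
    // Returns false (and a null result) when a field is missing or cannot be converted.
    public static bool TryParseEvent(string[] fields, out Event result)
    {
        result = null;
        if (fields == null || fields.Length < 5) return false;

        int day;
        DateTime time;
        double price;
        if (!int.TryParse(fields[0], NumberStyles.Integer, CultureInfo.InvariantCulture, out day)) return false;
        if (!DateTime.TryParse(fields[1], CultureInfo.InvariantCulture, DateTimeStyles.None, out time)) return false;
        if (!double.TryParse(fields[2], NumberStyles.Float, CultureInfo.InvariantCulture, out price)) return false;

        result = new Event
        {
            Day = day,
            Time = time,
            Price = price,
            StrEvent = fields[3],
            Description = fields[4]
        };
        return true;
    }
}
```

Inside the parsing loop, a row that fails can then be logged or skipped rather than aborting the whole import.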
XML File (Using LINQ to XML)
Now let's try it with an XML file and use LINQ to XML to get the data:
<Events>
  <Event>
    <Day>1</Day>
    <Time>8:00 PM</Time>
    <Price>25.00</Price>
    <Title><![CDATA[Beethoven's 9th Symphony]]></Title>
    <Description><![CDATA[Listen to the ninth and final masterpiece by Ludwig van Beethoven.]]></Description>
  </Event>
  <Event>
    <Day>2</Day>
    <Time>6:00 PM</Time>
    <Price>15.00</Price>
    <Title><![CDATA[Baseball Game]]></Title>
    <Description><![CDATA[Come watch the championship team play their archrival--No work stoppages, guaranteed]]></Description>
  </Event>
</Events>
I've used CDATA for the title and description so that special characters won't break the XML parsing.
This is easily parsed into your Events by the following code:
XDocument doc = XDocument.Load(path);
List<Event> events = (from x in doc.Descendants("Event")
select new Event {
Day = Convert.ToInt32(x.Element("Day").Value),
Time = Convert.ToDateTime(x.Element("Time").Value),
Price = Convert.ToDouble(x.Element("Price").Value),
StrEvent = x.Element("Title").Value,
Description = x.Element("Description").Value
}).ToList();
Of course, this is still not perfect as you still have the possibility of conversion failures or missing elements.
Pipe-Delimited File Example
Per our discussion in the comments, if you want to use the pipe (|), you need to put each event (in its entirety) on one line, like this:
1|8:00 PM|25.00|Beethoven's 9th Symphony|Listen to the ninth and final masterpiece by Ludwig van Beethoven.
2|6:00 PM|15.00|Baseball Game|Come watch the championship team play their archrival--No work stoppages, guaranteed
You can still use the TextFieldParser example above if you like (just change the delimiter from , to |), or if you want you can use your original code.
Some Final Thoughts
I wanted to also address the original code and show why it wasn't working. The main reason was that you were reading one line at a time, and then splitting on ' '. This would have been a good start if all the fields were on the same line (although it still would have had problems because of spaces in the Time, StrEvent and Description fields), but they weren't.
So when you read the first line (which was 1) and split on ' ', you got one value back (1). When you tried to access the next element of the split array, you got the index out of range error because there was no columns[1] for that line.
Essentially, you were trying to treat each line as if it had all the fields in it, when in reality it was one field per line.
For your given sample file something like
string[] lines = File.ReadAllLines(path);
for (int index = 4; index < lines.Length; index += 5)
{
Event special = new Event();
special.Day = Convert.ToInt32(lines[index - 4]);
special.Time = Convert.ToDateTime(lines[index - 3]);
special.Price = Convert.ToDouble(lines[index - 2]);
special.StrEvent = lines[index - 1];
special.Description = lines[index];
events.Add(special);
}
Would do the job, but like Tim already mentioned, you should consider changing your file format.
Delimiters can be dropped if the column values never contain the separator character or if every field has a fixed size. Under that condition you can read the file and split the fields based on it.
If you want to read from a file and load the data into variables automatically, I suggest serializing and deserializing the variables to a file, although that file won't be a text file.

Trying to format string of Datagrid output

Hi, I have the following code, which creates two worksheets in an Excel spreadsheet based on the values in a datagrid. I am able to get it working for two datagrids, but I need to do it for 14. This is what I have so far:
var grid1Output = RadGridView1.ToExcelML();
var grid2Output = RadGridView2.ToExcelML().Replace("Worksheet1", "Worksheet2");
var workBook = grid1Output.Replace("</Worksheet>", "</Worksheet>" +
grid2Output.Substring(grid2Output.IndexOf("<Worksheet"),
grid2Output.IndexOf("</Worksheet")- grid2Output.IndexOf("<Worksheet")) + " </Worksheet>");
The above works fine, but I need to do it for 14 grid outputs in total. My problem is that I am having trouble replacing strings in the right place. How do I do this?
I would probably do it with LINQ to XML methods rather than string manipulation, but the choice is yours.
Either way, it shouldn't be that hard to write a method that takes a grid output (I am assuming it's a string), extracts the contents, and returns them. The calling routine then assembles the 14 XML strings and wraps them in a single Worksheet tag.
Here's a stab at it. Bear in mind that I'm not familiar with the RadGridView and the output of ToExcelML, so you probably won't be able to use this code without some modification. I'm making some assumptions that may not be valid.
First, I would create a method that takes an XML string as input. I am assuming that this string is entirely wrapped in a <WorksheetN> tag, where N is the worksheet index.
string ExtractWorksheetContents(string excelML, int index)
{
    // You might also be able to do this with a regex, depending on how the contents are structured
    // Since I don't know enough about the content, I will do this with string manipulation, as
    // you did, rather than loading the XML and making assumptions.
    string tagName = string.Format("Worksheet{0}", index);
    int worksheetStart = excelML.IndexOf("<" + tagName);
    // + 3 accounts for the "</" and ">" around the tag name, so the end index is just past the closing tag
    int worksheetEnd = excelML.IndexOf("</" + tagName + ">") + tagName.Length + 3;
    // Should contain some checks that neither w'sheet start nor end are -1
    return excelML.Substring(worksheetStart, worksheetEnd - worksheetStart);
}
Then I would assemble the results. Again, I'm making assumptions about how the XML is structured.
StringBuilder sb = new StringBuilder();
sb.Append("<Worksheet>");
RadGridView[] gridViews = new RadGridView[] { RadGridView1, RadGridView2 .... RadGridView14 };
for(int i=0;i<14; i++)
{
var rgv = gridViews[i];
sb.Append(ExtractWorksheetContents(rgv.ToExcelML(),i+1));
}
sb.Append("</Worksheet>");
var workBook = sb.ToString();
Hope this helps somewhat.
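For the LINQ to XML route mentioned at the top of this answer, a rough sketch follows. It rests on assumptions I cannot verify against RadGridView: that each ToExcelML() result is a complete XML document whose root has one or more direct child elements with the local name Worksheet. WorkbookMerger and Merge are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class WorkbookMerger
{
    // Copies every Worksheet element from documents 2..n into the first document,
    // placing each one after the last worksheet already there.
    public static string Merge(IList<string> excelMlDocuments)
    {
        XDocument merged = XDocument.Parse(excelMlDocuments[0]);
        XElement lastSheet = merged.Root.Elements()
            .Last(e => e.Name.LocalName == "Worksheet");

        foreach (string xml in excelMlDocuments.Skip(1))
        {
            var sheets = XDocument.Parse(xml).Root.Elements()
                .Where(e => e.Name.LocalName == "Worksheet");
            foreach (XElement sheet in sheets)
            {
                var copy = new XElement(sheet);  // detached deep copy
                lastSheet.AddAfterSelf(copy);
                lastSheet = copy;
            }
        }

        // Note: ToString() does not emit the XML declaration.
        return merged.ToString();
    }
}
```

The worksheets would still need unique names, as in the question's Replace("Worksheet1", "Worksheet2") step. That renaming is not handled here.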

Updating Cells in a DataTable

I'm writing a small app to do a little processing on some cells in a CSV file I have. I've figured out how to read and write CSV files with a library I found online, but I'm having trouble: the library parses CSV files into a DataTable, but, when I try to change a cell of the table, it isn't saving the change in the table!
Below is the code in question. I've separated the process into multiple variables and renamed some of the things to make it easier to debug for this question.
Code
Inside the loop:
string debug1 = readIn.Rows[i].ItemArray[numColumnToCopyTo].ToString();
string debug2 = readIn.Rows[i].ItemArray[numColumnToCopyTo].ToString().Trim();
string debug3 = readIn.Rows[i].ItemArray[numColumnToCopyFrom].ToString().Trim();
string towrite = debug2 + ", " + debug3;
readIn.Rows[i].ItemArray[numColumnToCopyTo] = (object)towrite;
After the loop:
readIn.AcceptChanges();
When I debug my code, I see that towrite is being formed correctly and everything's OK, except that the row isn't updated: why isn't it working? I have a feeling that I'm making a simple mistake here: the last time I worked with DataTables (quite a long time ago), I had similar problems.
If you're wondering why I'm adding another comma in towrite, it's because I'm combining a street address field with a zip code field - I hope that's not messing anything up.
My code is kind of messy, as I'm only trying to edit one file to make a small fix, so sorry.
The easiest way to edit individual column values is to use the DataRow.Item indexer property:
readIn.Rows[i][numColumnToCopyTo] = (object)towrite;
This isn't well-documented, but DataRow.ItemArray's get accessor returns a copy of the underlying data. Here's the implementation, courtesy of Reflector:
public object[] get_ItemArray() {
int defaultRecord = this.GetDefaultRecord();
object[] objArray = new object[this._columns.Count];
for (int i = 0; i < objArray.Length; i++) {
DataColumn column = this._columns[i];
objArray[i] = column[defaultRecord];
}
return objArray;
}
There's an awkward alternative method for editing column values: get a row's ItemArray, modify those values, then modify the row to use the updated array:
object[] values = readIn.Rows[i].ItemArray;
values[numColumnToCopyTo] = (object)towrite;
readIn.Rows[i].ItemArray = values;
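The copy behaviour is easy to confirm with a small experiment. The sketch below builds a one-cell table (the column name and values are invented for the demonstration) and returns the cell's value after each kind of write.

```csharp
using System.Data;

static class ItemArrayDemo
{
    // Returns the cell value after writing through ItemArray, then after writing through the indexer.
    public static string[] Run()
    {
        var table = new DataTable();
        table.Columns.Add("Address", typeof(string));
        table.Rows.Add("1 Main St");

        // Writes into a temporary copy of the row's values; the row is unchanged.
        table.Rows[0].ItemArray[0] = "1 Main St, 12345";
        string afterItemArrayWrite = (string)table.Rows[0][0];

        // Writes into the row itself.
        table.Rows[0][0] = "1 Main St, 12345";
        string afterIndexerWrite = (string)table.Rows[0][0];

        return new[] { afterItemArrayWrite, afterIndexerWrite };
    }
}
```

The first returned value is still "1 Main St", while the second is "1 Main St, 12345".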
Use the SetField<T> method:
string debug1 = readIn.Rows[i].ItemArray[numColumnToCopyTo].ToString();
string debug2 = readIn.Rows[i].ItemArray[numColumnToCopyTo].ToString().Trim();
string debug3 = readIn.Rows[i].ItemArray[numColumnToCopyFrom].ToString().Trim();
string towrite = debug2 + ", " + debug3;
readIn.Rows[i].SetField<string>(numColumnToCopyTo,towrite);
readIn.AcceptChanges();
