How can I iterate over an excel file?
I currently have a class using ExcelDataReader
Class => https://paste.lamlam.io/nomehogupi.cs#14fXzlopygZ27adDcXEDtQHT0tWTxoYR
I have an Excel file with 5 columns.
This is my current code, but it is not producing the result that I expect:
TextWriter stream = new StreamWriter("excel Path");
// foreach string in List<string>
foreach (var item in ComboList)
{
    var rows = ExcelHelper.CellValueCollection(item.Key);
    foreach (var row in rows)
    {
        stream.WriteLine(item.Key + "|" + row);
        break;
    }
}
stream.Close();
My result:
Column1|Row1
Column1|Row2
Column1|Row3
...
Column2|Row1
Column2|Row2
Column2|Row3
...
Expected:
Column1|Row1|Column2|Row1...
Column1|Row2|Column2|Row2...
Column1|Row3|Column2|Row3...
Thanks
This is the answer. I just needed to get the DataSet and iterate over it; very easy:
var data = ExcelHelper.DataSet();
foreach (DataRow dr in data.Tables[0].Rows)
{
Console.WriteLine(dr["Column1"] + "|" + dr["Column2"]);
}
If I understand correctly what you want, I think you need to add a method like RowValueCollection to your ExcelHelper, as below:
public static IEnumerable<string[]> RowValueCollection()
{
var result = Data.Tables[0].Rows.OfType<DataRow>()
.Select(dr => dr.ItemArray.Select(ia => ia.ToString()).ToArray());
return result;
}
And then use it like this:
var rowValues = ExcelHelper.RowValueCollection();
foreach (var row in rowValues)
{
stream.WriteLine(string.Join("|", row));
}
HTH ;)
The first problem is that you are asking to write two items and only two items on a single line.
Would it help if you made the stream.WriteLine() statement into a .Write() statement and then, after the inner loop, performed a .WriteLine() to terminate the line?
Apologies for posting this as an answer rather than a comment, but I don't have enough reputation points to comment.
- Malc
Related
I am looping through my docTab.Rows, which is a DataSet table populated by a method that returns six results.
What I am trying to do is loop through those results, get the field from my table that I am most interested in (URL in my case), and then use this URL as a path so that I can copy all the files that I need.
foreach (var row in docTab.Rows)
{
var sourceFile = "//ch-s-0001535/G/inetpub/DocAddWeb/DataSource/"+docTab.Rows[0]["URL"].ToString();
string targetPath = rootFolderAbsolutePath;
File.Copy(sourceFile, rootFolderAbsolutePath+Path.GetFileName(sourceFile),overwrite:true);
}
My issue is that I only get 1 file, and always the same one; I never see the others even though my loop goes through 6 times.
Replace 'var' with 'DataRow' in the foreach, and use row["URL"] instead of docTab.Rows[0]["URL"]; then it will loop through all rows of the DataSet table:
foreach (DataRow row in docTab.Rows)
{
var sourceFile = "//ch-s-0001535/G/inetpub/DocAddWeb/DataSource/" + row["URL"].ToString();
//Your code
}
I have to parse around 40,000 strings and load them into a DataTable, and it is taking forever to parse and load. Could anyone suggest a faster method?
// sample string
00001000200|something|something|999999999999|999999999999
var loadNdcs = new List<string>();
DataTable table = new DataTable();
table.BeginLoadData();
foreach (string line in lines)
{
    string[] vals = line.Split(new[] { "|" }, StringSplitOptions.None);
    if (!loadNdcs.Contains(vals[0]))
    {
        if (vals[3] == "99999999")
            vals[3] = null;
        if (vals[4] == "99999999")
            vals[4] = null;
        table.LoadDataRow(vals, true);
        loadNdcs.Add(vals[0]);
    }
}
table.EndLoadData();
One optimization would be to use a HashSet<string>, which has O(1) lookup, instead of a List<string>, whose Contains is O(n).
var loadNdcs = new HashSet<string>();
...
if (loadNdcs.Add(vals[0]))
{
...
}
You might also want to use table.Rows.Add(vals) instead of table.LoadDataRow(vals, true) to avoid the unnecessary update behavior if you don't need it. Also, if you have a table.EndLoadData(), you probably want a matching table.BeginLoadData().
If the data is coming in from a file, I would not load the whole file into memory, but would use a System.IO.StreamReader and call ReadLine().
Will it be faster? Maybe. Will it use less memory? Definitely.
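Putting those suggestions together, a rough sketch might look something like this (the file name is a placeholder, and the table's columns are assumed to already match the pipe-delimited layout in the question):
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

// Sketch: stream the file line by line, skip duplicate keys with a HashSet,
// and add rows directly instead of calling LoadDataRow.
var seenKeys = new HashSet<string>();
var table = new DataTable();
// ... define the table's columns here to match the file layout ...

table.BeginLoadData();
using (var reader = new StreamReader("ndc-data.txt")) // hypothetical file name
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        string[] vals = line.Split('|');
        if (seenKeys.Add(vals[0])) // Add returns false for duplicates, so no separate Contains call
        {
            if (vals[3] == "99999999") vals[3] = null;
            if (vals[4] == "99999999") vals[4] = null;
            table.Rows.Add(vals);
        }
    }
}
table.EndLoadData();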
I'm working on an importer that takes tab delimited text files. The first line of each file contains 'columns' like ItemCode, Language, ImportMode etc and there can be varying numbers of columns.
I'm able to get the names of each column, whether there's one or 10 and so on. I use a method to achieve this that returns List<string>:
private List<string> GetColumnNames(string saveLocation, int numColumns)
{
var data = (File.ReadAllLines(saveLocation));
var columnNames = new List<string>();
for (int i = 0; i < numColumns; i++)
{
var cols = from lines in data
.Take(1)
.Where(l => !string.IsNullOrEmpty(l))
.Select(l => l.Split(delimiter.ToCharArray(), StringSplitOptions.None))
.Select(value => string.Join(" ", value))
let split = lines.Split(' ')
select new
{
Temp = split[i].Trim()
};
foreach (var x in cols)
{
columnNames.Add(x.Temp);
}
}
return columnNames;
}
If I always knew what columns to be expecting, I could just create a new object, but since I don't, I'm wondering is there a way I can dynamically create an object with properties that correspond to whatever GetColumnNames() returns?
Any suggestions?
For what it's worth, here's how I used DataTables to achieve what I wanted.
// saveLocation is file location
// numColumns comes from another method that gets number of columns in file
var columnNames = GetColumnNames(saveLocation, numColumns);
var table = new DataTable();
foreach (var header in columnNames)
{
table.Columns.Add(header);
}
// itemAttributeData is the file split into lines
foreach (var row in itemAttributeData)
{
table.Rows.Add(row);
}
Although there was a bit more work involved to be able to manipulate the data in the way I wanted, Karthik's suggestion got me on the right track.
You could create a dictionary of strings where the first string is the "property" name and the second string is its value.
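A minimal sketch of that idea, assuming the column names from GetColumnNames and the data lines of the file are already in hand (all the variable names here are illustrative):
using System;
using System.Collections.Generic;

// Sketch: represent each data line as a dictionary keyed by column name.
// 'columnNames', 'dataLines', and 'delimiter' are assumed to come from the
// surrounding importer code (columnNames from GetColumnNames, dataLines being
// the data rows of the file, delimiter the same one used for splitting).
var records = new List<Dictionary<string, string>>();
foreach (var line in dataLines)
{
    var fields = line.Split(delimiter.ToCharArray(), StringSplitOptions.None);
    var record = new Dictionary<string, string>();
    for (int i = 0; i < columnNames.Count && i < fields.Length; i++)
    {
        record[columnNames[i]] = fields[i].Trim();
    }
    records.Add(record);
}

// Usage: look up a field by whatever column name happened to be in the file, e.g.
// string itemCode = records[0]["ItemCode"];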
I get the following error while I try to delete a row while looping through it.
C#: Collection was modified; enumeration operation may not execute
I've been doing some research for a while, and I've read some similar posts here, but I still haven't found the right answer.
foreach (DataTable table in JobsDS.Tables)
{
foreach (DataRow row in table.Rows)
{
if (row["IP"].ToString() != null && row["IP"].ToString() != "cancelled")
{
string newWebServiceUrl = "http://" + row["IP"].ToString() + "/mp/Service.asmx";
webService.Url = newWebServiceUrl;
string polledMessage = webService.mpMethod(row["IP"].ToString(), row["ID"].ToString());
if (polledMessage != null)
{
if (polledMessage == "stored")
{
removeJob(id);
}
}
}
}
}
Any help would be greatly appreciated.
Instead of using foreach, use a reverse for loop:
for(int i = table.Rows.Count - 1; i >= 0; i--)
{
DataRow row = table.Rows[i];
//do your stuff
}
Removing the row indeed modifies the original collection of rows. Most enumerators are designed to explode if they detect the source sequence has changed in the middle of an enumeration - rather than try to handle all the weird possibilities of foreaching across something that is changing and probably introduce very subtle bugs, it is safer to simply disallow it.
You cannot modify a collection inside of a foreach around it.
Instead, you should use a backwards for loop.
If you want to remove elements while looping over a list of elements, the trick is to use a for loop that starts from the last element and goes to the first.
In your example:
int t_size = table.Rows.Count -1;
for (int i = t_size; i >= 0; i--)
{
DataRow row = table.Rows[i];
// your code ...
}
Edit : not quick enough :)
Also, if you depend on the order in which you process the rows and a reverse loop does not work for you, you can add the rows that you want to delete to a List and then, after you exit the foreach loop, delete the rows added to the list. For example:
foreach (DataTable table in JobsDS.Tables)
{
List<DataRow> rowsToRemove = new List<DataRow>();
foreach (DataRow row in table.Rows)
{
if (row["IP"].ToString() != null && row["IP"].ToString() != "cancelled")
{
string newWebServiceUrl = "http://" + row["IP"].ToString() + "/mp/Service.asmx";
webService.Url = newWebServiceUrl;
string polledMessage = webService.mpMethod(row["IP"].ToString(), row["ID"].ToString());
if (polledMessage != null)
{
if (polledMessage == "stored")
{
//removeJob(id);
rowsToRemove.Add(row);
}
}
}
}
rowsToRemove.ForEach(r => removeJob(r["ID"].ToString()));
}
Somehow removeJob(id) changes one of the IEnumerables you're enumerating (table.Rows or JobsDS.Tables; from the name of the method I guess it would be the latter), maybe via DataBinding.
I'm not sure the backwards for is going to work directly because it seems you're removing an element enumerated in the outer foreach from within the inner foreach. It's hard to tell without more info about what happens in removeJob(id).
I have a basic C# console application that reads a text file (CSV format) line by line and puts the data into a Hashtable. The first CSV item in the line is the key (id num) and the rest of the line is the value. However, I've discovered that my import file has a few duplicate keys that it shouldn't have. When I try to import the file, the application errors out because you can't have duplicate keys in a Hashtable. I want my program to be able to handle this error, though: when I run into a duplicate key I would like to put that key into an ArrayList and continue importing the rest of the data into the Hashtable. How can I do this in C#?
Here is my code:
private static Hashtable importFile(Hashtable myHashtable, String myFileName)
{
StreamReader sr = new StreamReader(myFileName);
CSVReader csvReader = new CSVReader();
ArrayList tempArray = new ArrayList();
int count = 0;
while (!sr.EndOfStream)
{
String temp = sr.ReadLine();
if (temp.StartsWith(" "))
{
ServMissing.Add(temp);
}
else
{
tempArray = csvReader.CSVParser(temp);
Boolean first = true;
String key = "";
String value = "";
foreach (String x in tempArray)
{
if (first)
{
key = x;
first = false;
}
else
{
value += x + ",";
}
}
myHashtable.Add(key, value);
}
count++;
}
Console.WriteLine("Import Count: " + count);
return myHashtable;
}
if (myHashtable.ContainsKey(key))
duplicates.Add(key);
else
myHashtable.Add(key, value);
A better solution is to call ContainsKey to check whether the key exists before adding it to the hash table. Throwing an exception on this kind of error is a performance hit and doesn't improve the program flow.
ContainsKey has a constant O(1) overhead for every item, while catching an exception incurs a performance hit on just the duplicate items.
In most situations I'd say check for the key, but in this case it's better to catch the exception.
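For reference, a minimal sketch of the catch-the-exception approach, using the names from the question (Hashtable.Add throws an ArgumentException when the key already exists):
try
{
    myHashtable.Add(key, value);
}
catch (ArgumentException)
{
    // Duplicate key: record it and keep importing the remaining lines.
    duplicates.Add(key);
}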
Here is a solution which avoids multiple hits in the secondary list with a small overhead to all insertions:
Dictionary<string, List<string>> dict = new Dictionary<string, List<string>>();
//Insert item
if (!dict.ContainsKey(key))
dict[key] = new List<string>();
dict[key].Add(value);
You can wrap the dictionary in a type that hides this, or put it in a method, or even an extension method on Dictionary.
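As a rough sketch of the extension-method idea (the method name here is purely illustrative):
using System.Collections.Generic;

public static class DictionaryExtensions
{
    // Adds the value to the list for this key, creating the list on first use.
    public static void AddToList<TKey, TValue>(
        this Dictionary<TKey, List<TValue>> dict, TKey key, TValue value)
    {
        List<TValue> list;
        if (!dict.TryGetValue(key, out list))
        {
            list = new List<TValue>();
            dict[key] = list;
        }
        list.Add(value);
    }
}

// Usage:
// dict.AddToList(key, value);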
If you have more than 4 (for example) CSV values, it might be worth building the value variable with a StringBuilder as well, since repeated string concatenation is slow.
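A small sketch of that change, based on the key/value loop in the question (assuming tempArray holds the parsed fields for one line):
using System.Text;

// Build the value from everything after the key without repeated string concatenation.
var sb = new StringBuilder();
string key = "";
bool first = true;
foreach (string x in tempArray)
{
    if (first)
    {
        key = x;
        first = false;
    }
    else
    {
        sb.Append(x).Append(',');
    }
}
string value = sb.ToString();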
Hmm, 1.7 Million lines? I hesitate to offer this for that kind of load.
Here's one way to do this using LINQ.
CSVReader csvReader = new CSVReader();
List<string> source = new List<string>();
using(StreamReader sr = new StreamReader(myFileName))
{
while (!sr.EndOfStream)
{
source.Add(sr.ReadLine());
}
}
List<string> ServMissing =
source
.Where(s => s.StartsWith(" "))
.ToList();
//--------------------------------------------------
List<IGrouping<string, string>> groupedSource =
(
from s in source
where !s.StartsWith(" ")
let parsed = csvReader.CSVParser(s)
where parsed.Any()
let first = parsed.First()
let rest = String.Join( "," , parsed.Skip(1).ToArray())
select new {first, rest}
)
.GroupBy(x => x.first, x => x.rest) //GroupBy(keySelector, elementSelector)
.ToList();
//--------------------------------------------------
List<string> myExtras = new List<string>();
foreach(IGrouping<string, string> g in groupedSource)
{
myHashTable.Add(g.Key, g.First());
if (g.Skip(1).Any())
{
myExtras.Add(g.Key);
}
}
Thank you all.
I ended up using the ContainsKey() method. It takes maybe 30 secs longer, which is fine for my purposes. I'm loading about 1.7 million lines and the program takes about 7 mins total to load up two files, compare them, and write out a few files. It only takes about 2 secs to do the compare and write out the files.