I'm trying to populate a combobox with the names of the columns in a spreadsheet.
I'm using the SpreadsheetLight library. I can set a cell value using the following code, where A refers to the column and 1 refers to the row. (Am I right?)
But how can I get the names of all columns in all sheets?
SLDocument sl = new SLDocument();
sl.SetCellValue("A1", true);
First, get the last column index using SLWorksheetStatistics:
SLWorksheetStatistics stats = sl.GetWorksheetStatistics();
int endColumnIndex = stats.EndColumnIndex;
Then iterate through the columns:
var headers = new List<string>();
for (int i = 1; i <= endColumnIndex; i++){
headers.Add(sl.GetCellValueAsString(1, i));
}
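If what you need is the letter name of each column (A, B, ..., AA) rather than the header text in row 1, the index from the loop above can be converted by hand. This is only a minimal sketch: ColumnNames and ToColumnLetter are hypothetical names of my own, not part of SpreadsheetLight.

```csharp
using System;
using System.Text;

public static class ColumnNames
{
    // Converts a 1-based column index to its spreadsheet letter name
    // (1 -> "A", 26 -> "Z", 27 -> "AA").
    public static string ToColumnLetter(int columnIndex)
    {
        if (columnIndex < 1)
            throw new ArgumentOutOfRangeException(nameof(columnIndex), "Column indexes are 1-based.");

        var letters = new StringBuilder();
        while (columnIndex > 0)
        {
            // Work in base 26 with no zero digit: shift by one before dividing.
            int remainder = (columnIndex - 1) % 26;
            letters.Insert(0, (char)('A' + remainder));
            columnIndex = (columnIndex - 1) / 26;
        }
        return letters.ToString();
    }
}
```

Inside the loop you could then call ColumnNames.ToColumnLetter(i) next to GetCellValueAsString(1, i) to keep both the letter and the header text.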
The following will print the values "foo" and "bar" from the column list:
var fileName = "test.xlsx";
var sl = new SLDocument(fileName);
foreach (var sheetName in sl.GetWorksheetNames())
{
SLDocument sheet = new SLDocument(fileName, sheetName);
sheet.SetCellValue("A1", "foo");
sheet.SetCellValue("B1", "bar");
SLWorksheetStatistics stats = sheet.GetWorksheetStatistics();
int endColumnIndex = stats.EndColumnIndex;
var headers = new List<string>();
for (int i = 1; i <= endColumnIndex; i++)
{
headers.Add(sheet.GetCellValueAsString(1, i));
}
foreach (var column in headers)
{
Console.WriteLine(column);
}
Console.ReadKey();
}
Related
I want the elements of the first column of an Excel sheet to be added to the list.
This is my code:
var list = new List<string>();
using (var stream = new MemoryStream())
{
await file.CopyToAsync(stream);
ExcelPackage.LicenseContext = LicenseContext.NonCommercial;
using (var package = new ExcelPackage(stream))
{
ExcelWorksheet worksheet = package.Workbook.Worksheets[0];
var rowcount = worksheet.Dimension.Rows;
var columncount = worksheet.Dimension.Columns;
var column = worksheet.Rows[1].Range.Value;
var range = worksheet.Rows.Range.Value;
var row = worksheet.Columns[1].Range.Value;
list.Add(column.ToString());
}
}
for(int i = 1; i <= columncount; i++)
{
var field1 = worksheet.Cells[1,i].Value as string;
list.Add(field1);
field1 = null;
}
This loop will give you the column names from the first row. Place it inside the using block, where worksheet and columncount are in scope.
Assume I have a .csv file with 70 columns, but only 5 of the columns are what I need. I want to be able to pass a method a string array of the columns names that I want, and for it to return a datatable.
private void method(object sender, EventArgs e) {
string[] columns =
{
@"Column21",
@"Column48"
};
DataTable myDataTable = Get_DT(columns);
}
public DataTable Get_DT(string[] columns) {
DataTable ret = new DataTable();
if (columns.Length > 0)
{
foreach (string column in columns)
{
ret.Columns.Add(column);
}
string[] csvlines = File.ReadAllLines(@"path to csv file");
csvlines = csvlines.Skip(1).ToArray(); //ignore the columns in the first line of the csv file
//this is where i need help... i want to use linq to read the fields
//of the each row with only the columns name given in the string[]
//named columns
}
return ret;
}
Read the first line of the file, line.Split(',') (or whatever your delimiter is), then get the index of each column name and store that.
Then for each other line, again do a var values = line.Split(','), then get the values from the columns.
Quick and dirty version:
string[] csvlines = File.ReadAllLines(@"path to csv file");
//select the indices of the columns we want
var cols = csvlines[0].Split(',').Select((val,i) => new { val, i }).Where(x => columns.Any(c => c == x.val)).Select(x => x.i).ToList();
//now go through the remaining lines
foreach (var line in csvlines.Skip(1))
{
var line_values = line.Split(',');
//keep a value when its position is one of the wanted column indices (IndexOf would break on duplicate values)
var dt_values = line_values.Where((x, i) => cols.Contains(i));
//now do something with the values you got for this row, add them to your datatable
}
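Putting the two steps above together, a version that actually fills the DataTable could look roughly like this. It is a sketch under assumptions: CsvColumnPicker and GetDataTable are made-up names, the lines are passed in rather than read from a path so it is easy to try out, and fields are split on a plain comma, so quoted fields containing commas are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class CsvColumnPicker
{
    // Builds a DataTable containing only the requested columns.
    // csvLines[0] is treated as the header row.
    public static DataTable GetDataTable(IReadOnlyList<string> csvLines, string[] columns)
    {
        var table = new DataTable();
        foreach (string column in columns)
            table.Columns.Add(column);

        if (csvLines.Count == 0 || columns.Length == 0)
            return table;

        // Resolve each requested name to its position in the header once.
        string[] header = csvLines[0].Split(',');
        int[] positions = columns.Select(c => Array.IndexOf(header, c)).ToArray();

        foreach (string line in csvLines.Skip(1))
        {
            string[] fields = line.Split(',');
            DataRow row = table.NewRow();
            for (int i = 0; i < positions.Length; i++)
            {
                int p = positions[i];
                // Leave the cell as DBNull when the column is missing or the line is short.
                if (p >= 0 && p < fields.Length)
                    row[i] = fields[p];
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

With the question's setup this would be called as CsvColumnPicker.GetDataTable(File.ReadAllLines(path), columns). Looking up the positions once keeps the per-line work down to a split and a few array reads.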
You can look at CsvHelper: https://joshclose.github.io/CsvHelper/
I think the "Reading individual fields" part is what you are looking for:
var csv = new CsvReader( textReader );
while( csv.Read() )
{
var intField = csv.GetField<int>( 0 );
var stringField = csv.GetField<string>( 1 );
var boolField = csv.GetField<bool>( "HeaderName" );
}
We can easily do this without writing much code.
ExcelDataReader is a great library for that; it will return the Excel sheet directly as a DataTable with just one method call.
Here are the links for an example: http://www.c-sharpcorner.com/blogs/using-iexceldatareader1
http://exceldatareader.codeplex.com/
I hope this was useful; let me know your thoughts or feedback.
Thanks
Karthik
var data = File.ReadAllLines(@"path to csv file");
// the expenses row, split into its fields
var query = data.Select(line => line.Split(',')).Single(fields => fields[0] == "Expenses");
// third column (indexes are zero-based)
int column21 = 2;
return query[column21];
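As others have stated, a library like CsvHelper (with its CsvReader class) can be used for this. As for LINQ, I don't think it's suitable for this kind of job.
I haven't tested this, but it should get you going (dataTable is assumed to be a DataTable whose columns have already been added from columns):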
using (TextReader textReader = new StreamReader(filePath))
{
using (var csvReader = new CsvReader(textReader))
{
var headers = csvReader.FieldHeaders;
for (int rowIndex = 0; csvReader.Read(); rowIndex++)
{
var dataRow = dataTable.NewRow();
for (int chosenColumnIndex = 0; chosenColumnIndex < columns.Count(); chosenColumnIndex++)
{
for (int headerIndex = 0; headerIndex < headers.Length; headerIndex++)
{
if (headers[headerIndex] == columns[chosenColumnIndex])
{
dataRow[chosenColumnIndex] = csvReader.GetField<string>(headerIndex);
}
}
}
dataTable.Rows.InsertAt(dataRow, rowIndex);
}
}
}
Is there a way to export a whole table with a nested schema from Google BigQuery as a CSV using the REST API?
There is an example of doing this (https://cloud.google.com/bigquery/docs/exporting-data) with a non-nested schema. This works fine for the non-nested columns in my table. Here is the code for that part:
PagedEnumerable<TableDataList, BigQueryRow> result2 = client.ListRows(datasetId, result.Reference.TableId);
StringBuilder sb = new StringBuilder();
foreach (var row in result2)
{
sb.Append($"{row["visitorId"]}, {row["visitNumber"]}, {row["totals.hits"]}{Environment.NewLine}");
}
using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(sb.ToString())))
{
var obj = gcsClient.UploadObject(bucketName, fileName, contentType, stream);
}
In BQ there are columns like totals.hits, totals.visits, and so on. If I try to address them, I get an error message saying there is no such column. If I address "totals", I get the type name "System.Collections.Generic.Dictionary`2[System.String,System.Object]" in the rows of my CSV.
Is there any way to do something like this? In the end I want my table from GA in BQ as a CSV somewhere else.
It is possible. Select every column you need as in the following schema, and flatten everything that needs to be flattened.
string query = $@"
#legacySQL
SELECT
visitorId,
visitNumber,
visitId,
visitStartTime,
date,
hits.hitNumber as hitNumber,
hits.product.productSKU as product.productSKU
FROM
FLATTEN(FLATTEN({tableName},hits),hits.product)";
//Creating a job for the query and activating legacy sql
BigQueryJob job = client.CreateQueryJob(query,
new CreateQueryJobOptions { UseLegacySql = true });
BigQueryResults queryResult = client.GetQueryResults(job.Reference.JobId,
new GetQueryResultsOptions());
StringBuilder sb = new StringBuilder();
//Getting the headers from the GA table and write them into the first row of the new table
int count = 0;
for (int i = 0; i < queryResult.Schema.Fields.Count; i++)
{
string columnName = queryResult.Schema.Fields[i].Name;
//append a comma after every header except the last one
if (i + 1 < queryResult.Schema.Fields.Count)
columnName += ",";
sb.Append(columnName);
}
//Getting the data from the GA table and write them row by row into the new table
sb.Append(Environment.NewLine);
foreach (var row in queryResult.GetRows())
{
count++;
if (count % 1000 == 0)
Console.WriteLine($"item {count} finished");
int blub = queryResult.Schema.Fields.Count;
for (Int64 j = 0; j < Convert.ToInt64(blub); j++)
{
try
{
if (row.RawRow.F[Convert.ToInt32(j)] != null)
sb.Append(row.RawRow.F[Convert.ToInt32(j)].V + ",");
}
catch (Exception)
{
}
}
sb.Append(Environment.NewLine);
}
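If you would rather keep the ListRows approach from the question and flatten on the client, something like the following might work for nested records. It is a sketch that assumes a nested record such as "totals" arrives as a Dictionary<string, object>, which is what the type name in the question's CSV output suggests. RecordFlattener and Flatten are hypothetical names of my own, and repeated fields such as hits (arrays) are not handled here.

```csharp
using System;
using System.Collections.Generic;

public static class RecordFlattener
{
    // Recursively copies nested dictionaries into one flat dictionary whose keys
    // are dotted paths such as "totals.hits".
    public static Dictionary<string, object> Flatten(IDictionary<string, object> record, string prefix = "")
    {
        var flat = new Dictionary<string, object>();
        foreach (KeyValuePair<string, object> pair in record)
        {
            string path = prefix.Length == 0 ? pair.Key : prefix + "." + pair.Key;
            if (pair.Value is IDictionary<string, object> nested)
            {
                // Descend into the nested record, carrying the path so far.
                foreach (KeyValuePair<string, object> inner in Flatten(nested, path))
                    flat[inner.Key] = inner.Value;
            }
            else
            {
                flat[path] = pair.Value;
            }
        }
        return flat;
    }
}
```

After flattening, a value can be looked up as flat["totals.hits"] and appended to the StringBuilder just like a top-level column.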
I'm porting an application to ASP.NET 5 with EF7 and have run into several problems. One of them is that MS has dropped DataTable. I'm trying to rewrite a bit of code so that it doesn't use DataTable but instead reads from a SqlDataReader and records the data into the entities I have. I have to read the data by columns, but it looks like a data reader can only be read once.
The old code:
Series[] arrSeries = new Series[dt.Columns.Count - 1];
IList<Categories> arrCats = new List<Categories>();
Categories arrCat = new Categories();
foreach (DataColumn dc in dt.Columns)
{
var strarr = dt.Rows.Cast<DataRow>().Select(row => row[dc.Ordinal]).ToList();
if (dc.Ordinal == 0)
{
arrCat.category = strarr.Select(o => new Category { label = o.ToString() }).ToList();
}
else
{
Series s = new Series()
{
seriesname = dc.ColumnName,
renderas = null,
showvalues = false,
data = strarr.Select(o => new SeriesValue { value = o.ToString() }).ToList()
};
arrSeries[dc.Ordinal - 1] = s;
}
}
arrCats.Add(arrCat);
MultiFusionChart fusChart = new MultiFusionChart
{
chart = dictAtts,
categories = arrCats,
dataset = arrSeries
};
return fusChart;
The new code:
Series[] arrSeries = new Series[colColl.Count - 1];
IList<Categories> arrCats = new List<Categories>();
Categories arrCat = new Categories();
arrCat.category = new List<Category>();
for (int i = 0; i < reader.FieldCount; i++)
{
Series s = new Series()
{
seriesname = reader.GetName(i),
renderas = null,
showvalues = false
};
while (reader.Read())
{
if (i == 0)
{
Category cat = new Category();
cat.label = reader.GetValue(i).ToString();
arrCat.category.Add(cat);
}
else
{
SeriesValue sv = new SeriesValue();
sv.value = reader.GetValue(i).ToString();
s.data.Add(sv);
arrSeries[i - 1] = s;
}
}
}
arrCats.Add(arrCat);
MultiFusionChart fusChart = new MultiFusionChart
{
chart = dictAtts,
categories = arrCats,
dataset = arrSeries
};
return fusChart;
The code runs, but it returns null for the Series. I believe this is because the reader has already reached the end while recording the Categories. As far as I know, it is not possible to reset a DataReader?
Is there a way to load column data from a DataReader into a List? Or is there some other way to replace DataTable in this example?
You can read values from multiple columns by specifying the column index, like this:
int totalColumns = reader.FieldCount;
for(int i=0;i<totalColumns;i++)
{
var label = reader.GetValue(i);
}
You can select the values of all columns by looping through the columns.
GetValue takes an int argument, which is the index of the column, not the index of the row.
The problem is in your for loop. When i is 0, you initialize a Series object s. After that,
while (reader.Read())
will read your entire reader while i is still zero, so only the
if (i == 0)
condition will be true and your entire reader will be spent. Afterwards, for i >= 1,
while (reader.Read())
will always return false, leaving your array arrSeries empty. Move the while loop outside the for loop so that each row is read exactly once, and inside it read each column's value from the data reader by index:
reader.GetValue(i);
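To make the single pass concrete, here is a rough sketch that reads every row once and collects one list of values per column. ReaderPivot and ReadColumns are names I made up; it takes an IDataReader (which SqlDataReader implements), and DBNull values simply come out as empty strings.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderPivot
{
    // Reads the reader exactly once and returns one list of values per column.
    public static List<List<string>> ReadColumns(IDataReader reader)
    {
        var columns = new List<List<string>>();
        for (int i = 0; i < reader.FieldCount; i++)
            columns.Add(new List<string>());

        // Outer loop over rows, inner loop over columns: the reader is never rewound.
        while (reader.Read())
        {
            for (int i = 0; i < reader.FieldCount; i++)
                columns[i].Add(reader.GetValue(i).ToString());
        }
        return columns;
    }
}
```

In the question's terms, columns[0] would feed the Category labels and columns[i] for i >= 1 would feed the SeriesValue list of series i - 1, with reader.GetName(i) supplying seriesname.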
Is there an easy way to tell EPPlus that a row is a header? Or should I create the headers by specifying a range using SelectedRange, remove it from the sheet and iterate the cells that remain?
I ended up doing this:
class Program
{
static void Main(string[] args)
{
DirectoryInfo outputDir = new DirectoryInfo(@"C:\testdump\excelimports");
FileInfo existingFile = new FileInfo(outputDir.FullName + @"\Stormers.xlsx");
Dictionary<string, string> arrColumnNames = new Dictionary<string,string>() { { "First Name", "" }, { "Last Name", "" }, { "Email Address", "" } };
using (ExcelPackage package = new ExcelPackage(existingFile))
{
ExcelWorksheet sheet = package.Workbook.Worksheets[1];
var q = from cell in sheet.Cells
where arrColumnNames.ContainsKey(cell.Value.ToString())
select cell;
foreach (var c in q)
{
arrColumnNames[c.Value.ToString()] = c.Address;
}
foreach (var ck in arrColumnNames)
{
Console.WriteLine("{0} - {1}", ck.Key, ck.Value);
}
var qValues = from r in sheet.Cells
where !arrColumnNames.ContainsValue(r.Address.ToString())
select r;
foreach (var r in qValues)
{
Console.WriteLine("{0} - {1}", r.Address, r.Value);
}
}
}
}
I needed to enumerate the header row and display all the column headers to my end user. I took Muhammad Mubashir's code as a base, converted it to an extension method, and removed the hard-coded numbers from it.
public static class ExcelWorksheetExtension
{
public static string[] GetHeaderColumns(this ExcelWorksheet sheet)
{
List<string> columnNames = new List<string>();
foreach (var firstRowCell in sheet.Cells[sheet.Dimension.Start.Row, sheet.Dimension.Start.Column, sheet.Dimension.Start.Row, sheet.Dimension.End.Column])
columnNames.Add(firstRowCell.Text);
return columnNames.ToArray();
}
}
var pck = new OfficeOpenXml.ExcelPackage();
pck.Load(new System.IO.FileInfo(path).OpenRead());
var ws = pck.Workbook.Worksheets["Worksheet1"];
DataTable tbl = new DataTable();
var hasHeader = true;
foreach (var firstRowCell in ws.Cells[1, 1, 1, ws.Dimension.End.Column]){
tbl.Columns.Add(hasHeader ? firstRowCell.Text : string.Format("Column {0}", firstRowCell.Start.Column));
}
var startRow = hasHeader ? 2 : 1;
for (var rowNum = startRow; rowNum <= ws.Dimension.End.Row; rowNum++){
var wsRow = ws.Cells[rowNum, 1, rowNum, ws.Dimension.End.Column];
var row = tbl.NewRow();
foreach (var cell in wsRow){
row[cell.Start.Column - 1] = cell.Text;
}
tbl.Rows.Add(row);
}
I had a similar issue. Here's some code that may help:
using (var package = new ExcelPackage(fileStream))
{
// Get the workbook in the file
var workbook = package.Workbook;
if (workbook != null && workbook.Worksheets.Any())
{
// Get the first worksheet
var sheet = workbook.Worksheets.First();
// Get header values
var column1Header = sheet.Cells["A1"].GetValue<string>();
var column2Header = sheet.Cells["B1"].GetValue<string>();
// "A2:A" means "starting from A2 (1st col, 2nd row),
// get me all populated cells in Column A" (yes, unusual range syntax)
var firstColumnRows = sheet.Cells["A2:A"];
// Loop through rows in the first column, get values based on offset
foreach (var cell in firstColumnRows)
{
var column1CellValue = cell.GetValue<string>();
var column2CellValue = cell.Offset(0, 1).GetValue<string>();
}
}
}
If anyone knows of a more elegant way than cell.Offset, let me know.
I just took ndd's code and converted it to use System.Linq.
using System.Linq;
using OfficeOpenXml;
namespace Project.Extensions.Excel
{
public static class ExcelWorksheetExtension
{
/// <summary>
/// Get Header row with EPPlus.
/// <a href="https://stackoverflow.com/questions/10278101/epplus-reading-column-headers">
/// EPPlus Reading Column Headers
/// </a>
/// </summary>
/// <param name="sheet"></param>
/// <returns>Array of headers</returns>
public static string[] GetHeaderColumns(this ExcelWorksheet sheet)
{
return sheet.Cells[sheet.Dimension.Start.Row, sheet.Dimension.Start.Column, sheet.Dimension.Start.Row, sheet.Dimension.End.Column]
.Select(firstRowCell => firstRowCell.Text).ToArray();
}
}
}
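Once you have the header array, a lookup from header text to column number saves scanning the row again for every field. This is a sketch only: HeaderLookup and BuildIndex are hypothetical names, and it assumes the header row has no blank cells, so that a header's position in the array lines up with its column.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderLookup
{
    // Maps each non-empty header text to its 1-based column number.
    // When a header text repeats, the first occurrence wins.
    public static Dictionary<string, int> BuildIndex(string[] headers, int firstColumn = 1)
    {
        var index = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < headers.Length; i++)
        {
            string name = headers[i]?.Trim();
            if (string.IsNullOrEmpty(name) || index.ContainsKey(name))
                continue;
            index[name] = firstColumn + i;
        }
        return index;
    }
}
```

For example, with var index = HeaderLookup.BuildIndex(sheet.GetHeaderColumns(), sheet.Dimension.Start.Column); a cell could then be addressed as sheet.Cells[row, index["Email Address"]].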