Currently, we generate a shipping manifest file using VBA. I was able to take care of the grunt work of harmonizing, and getting the dataset I need, converted to C#, my last step is to generate the XML file in C# is as well.
The way it works currently, VBA opens a recordset called tblXMLcodes which has 4 fields, Tag, Carrier, Section, and orderct.. Tag has each tag I need, carrier is always the same, section is either Item, or Package as I have two data sets (a package query and an item query) Then the orderct field is just numbered 1-16 for each section.
Then another record set is opened from the item query AND the package query and it gets looped through appending to a .txt file..
How can I easily take my two datasets (both queries) and loop through them to generate an XML file..
Are there any NuGet packages?
Any help is appreciated.. Currently my code works so I wont post it, but if someone wants to see the current I will gladly share. It is VERY slow....
follow this code
DataSet ds = new DataSet();
ds.Tables.Add(dt1); // Table 1
ds.Tables.Add(dt2); // Table 2...
...
string dsXml= ds.GetXml();
...
using (StreamWriter fs = new StreamWriter(xmlFile)) // XML File Path
{
ds.WriteXml(fs);
}
Related
I've been working on a VB application which parses multiple XML files, and create an Excel file from them.
The main problem of this is that I am, simply, reading each line of each XML and outputs them to the Excel file when a specific node is found. I would like to know if exists any method to store the data from each element, just to use it once everything (all the XML files) have been parsed.
I was thinking about databases but I think this is excessive and unnecesary. Maybe you can give me some ideas in order to make it working.
System.Data.DataSet can be used as an "in memory database".
You can use a DataSet to store information in memory - a DataSet can contain multiple DataTables and you can add columns to those at runtime, even if there are already rows in the DataTable. So even if you don't know the XML node names ahead of time, you can add them as columns as they appear.
You can also use DataViews to filter the data inside the DataSet.
My typical way of pre-parsing XML is to create a two-column DataTable with the XPATH address of each node and its value. You can then do a second pass that matches XPATH addresses to your objects/dataset.
We have created a custom dataset and populating it with some data.
Before adding data, we are adding columns in the data set as follows
DataSet archiveDataset = new DataSet("Archive");
DataTable dsTable = archiveDataset.Tables.Add("Data");
dsTable.Columns.Add("Id", typeof(int));
dsTable.Columns.Add("Name", typeof(string));
dsTable.Columns.Add("LastOperationBy", typeof(int));
dsTable.Columns.Add("Time", typeof(DateTime))
Once the Dataset is create, we are filling values as follows
DataRow dataRow = dsTable.NewRow();
dataRow["Id"] = source.Id;
dataRow["Name"] = source.Name;
dataRow["LastOperationBy"] = source.LastOperationBy;
dataRow["Time"] = source.LaunchTime;
Is there any better and managed way of doing this. can I make the code more easy to write using enum or anything else to reduce the efforts?
You could try using a Typed Dataset.
This should get rid of the ["<column_name>"] ugliness.
If the dataset has a structure similar to tables in a database, then Visual Studio makes it really easy to create one: just click Add -> New Item somewhere in the solution and choose DataSet. VS will show a designer where you can drag tables from your server explorer.
Update (after response to Simon's comment):
A typed dataset is in fact an XSD (XML Schema Definition).
What I did in a similar case was:
created an empty DataSet (using Add -> New Item -> DataSet)
opened the newly created file with a text editor (by dafault, in VS it shows the XSD designer)
paste the XSD that I had created manually
You could also choose to use the designer to create the schema.
Considering your comment "I am using Dataset to export data to a XML file" I recommend using a different technology such as
Linq to XML http://msdn.microsoft.com/en-us/library/bb387061.aspx or
Xml Serialzation http://msdn.microsoft.com/en-us/library/system.xml.serialization.xmlserializer.aspx
Or better yet of is doesnt have to be XML (and you only want hierarchical readable text consider JSON instead http://james.newtonking.com/pages/json-net.aspx
You can bind dataset in two way first one is using database second one is add manually.
After create column for dataset you can add using Loops you can add it if you have 10000 of entries.
You can use Reflection. Another option is to use EntityFramework or NHibernate to map the columnnames and datastructure columns and then avoid these code to fill each field manually. But they will add more complexity. Also Performance wise the your code is better.
How do I import data in Excel from a CSV file using C#? Actually, what I want to achieve is similar to what we do in Excel, you go to the Data tab and then select From Text option and then use the Text to columns option and select CSV and it does the magic, and all that stuff. I want to automate it.
If you could head me in the right direction, I'll really appreciate that.
EDIT: I guess I didn't explained well. What I want to do is something like
Excel.Application excelApp;
Excel.Workbook excelWorkbook;
// open excel
excelApp = new Excel.Application();
// something like
excelWorkbook.ImportFromTextFile(); // is what I need
I want to import that data into Excel, not my own application. As far as I know, I don't think I would have to parse the CSV myself and then insert them in Excel. Excel does that for us. I simply need to know how to automate that process.
I think you're over complicating things. Excel automatically splits data into columns by comma delimiters if it's a CSV file. So all you should need to do is ensure your extension is CSV.
I just tried opening a file quick in Excel and it works fine. So what you really need is just to call Workbook.Open() with a file with a CSV extension.
You could open Excel, start recording a macro, do what you want, then see what the macro recorded. That should tell you what objects to use and how to use them.
I beleive there are two parts, one is the split operation for the csv that the other responder has already picked up on, which I don't think is essential but I'll include anyways. And the big one is the writing to the excel file, which I was able to get working, but under specific circumstances and it was a pain to accomplish.
CSV is pretty simple, you can do a string.split on a comma seperator if you want. However, this method is horribly broken, albeit I'll admit I've used it myself, mainly because I also have control over the source data, and know that no quotes or escape characters will ever appear. I've included a link to an article on proper csv parsing, however, I have never tested the source or fully audited the code myself. I have used other code by the same author with success. http://www.boyet.com/articles/csvparser.html
The second part is alot more complex, and was a huge pain for me. The approach I took was to use the jet driver to treat the excel file like a database, and then run SQL queries against it. There are a few limitations, which may cause this to not fit you're goal. I was looking to use prebuilt excel file templates to basically display data and some preset functions and graphs. To accomplish this I have several tabs of report data, and one tab which is raw_data. My program writes to the raw_data tab, and all the other tabs calculations point to cells in this table. I'll go into some of the reasoning for this behavior after the code:
First off, the imports (not all may be required, this is pulled from a larger class file and I didn't properly comment what was for what):
using System.IO;
using System.Diagnostics;
using System.Data.Common;
using System.Globalization;
Next we need to define the connection string, my class already has a FileInfo reference at this point to the file I want to use, so that's what I pass on. It's possible to search on google what all the parameters are for, but basicaly use the Jet Driver (should be available on ANY windows install) to open an excel file like you're referring to a database.
string connectString = #"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={filename};Extended Properties=""Excel 8.0;HDR=YES;IMEX=0""";
connectString = connectString.Replace("{filename}", fi.FullName);
Now let's open up the connection to the DB, and be ready to run commands on the DB:
DbProviderFactory factory = DbProviderFactories.GetFactory("System.Data.OleDb");
using (DbConnection connection = factory.CreateConnection())
{
connection.ConnectionString = connectString;
using (DbCommand command = connection.CreateCommand())
{
connection.Open();
Next we need the actual logic for DB insertion. So basically throw queries into a loop or whatever you're logic is, and insert the data row-by-row.
string query = "INSERT INTO [raw_aaa$] (correlationid, ipaddr, somenum) VALUES (\"abcdef", \"1.1.1.1", 10)";
command.CommandText = query;
command.ExecuteNonQuery();
Now here's the really annoying part, the excel driver tries to detect you're column type before insert, so even if you pass a proper integer value, if excel thinks the column type is text, it will insert all you're numbers as text, and it's very hard to get this treated like a number. As such, excel must already have the column type as the number. In order to accomplish this, for my template file I fill in the first 10 rows with dummy data, so that when you load the file in the jet driver, it can detect the proper types and use them. Then all my forumals that point at my csv table will operate properly since the values are of the right type. This may work for you if you're goals are similar to mine, and to use templates that already point to this data (just start at row 10 instead of row 2).
Because of this, my raw_aaa tab in excel might look something like this:
correlationid ipaddr somenum
abcdef 1.1.1.1 5
abcdef 1.1.1.1 5
abcdef 1.1.1.1 5
abcdef 1.1.1.1 5
abcdef 1.1.1.1 5
abcdef 1.1.1.1 5
abcdef 1.1.1.1 5
abcdef 1.1.1.1 5
Note row 1 is the column names that I referenced in my sql queries. I think you can do without this, but that will require a little more research. By already having this data in the excel file, the somenum column will be detected as a number, and any data inserted will be properly treated as such.
Antoher note that makes this annoying, the Jet Driver is 32-bit only, so in my case where I had an explicit 64-bit program, I was unable to execute this directly. So I had the nasty hack of writing to a file, then launch a program that would insert the data in the file into my excel template.
All in all, I think the solution is pretty nasty, but thus far haven't found a better way to do this unfortunatly. Good luck!
You can take a look at TakeIo.Spreadsheet .NET library. It accepts files from Excel 97-2003, Excel 2007 and newer, and CSV format (semicolon or comma separators).
Example:
var inputFile = new FileInfo("Book1.csv"); // could be .xls or .xlsx too
var sheet = Spreadsheet.Read(inputFile);
foreach (var row in sheet)
{
foreach (var cell in row)
{
// do something
}
}
You can remove beginning and trailing empty rows, and also beginning and trailing columns from the imported data using the Normalize() function:
sheet.Normalize();
Sometimes you can find that your imported data contains empty rows between data, so you can use another helper for this case:
sheet.RemoveEmptyRows();
There is a Serialize() function to convert any input to CSV too:
var outfile = new StreamWriter("AllData.csv");
sheet.Serialize(outfile);
If you like to use comma instead of the default semicolon separator in your CSV file, do:
sheet.Serialize(outfile, ',');
And yes, there is also a ToString() function too...
This package is available at NuGet too, just take a look at TakeIo.Spreadsheet.
You can use ADO.NET
http://vbadud.blogspot.com/2008/09/opening-comma-separate-file-csv-through.html
Well, importing from CSV shouldn't be a big deal. I think the most basic method would be to do it using string operations. You could build a pretty fine parser using simple Split() command, and getting the stuff in arrays.
I am reading a text file in C# and trying to save it to a SQL database. I am fine except I don't want the first line, which is the names of the columns, included in the import. What's the easiest way to exclude these?
The code is like this
while (textIn.Peek() != -1)
{
string row = textIn.ReadLine();
string[] columns = row.Split(' ');
Product product = new Product();
product.Column1 = columns[0];
etc.....
product.Save();
}
thanks
If you are writing the code yourself to read in the file and then importing...why don't you just skip over the first line?
Here's my suggestion:
string[] file_rows;
using(var reader=File.OpenText(filepath))
{
file_rows=reader.ReadToEnd().Split("\r\n");
reader.Close();
}
for(var i=1;i<file_rows.Length;i++)
{
var row=file_rows[i];
var cells=row.Split("\t");
....
}
how are you importing the data? if you are looping in C# and inserting them one at a time, construct your loop to skip the first insert!
or just delete the first row inserted after they are there.
give more info, get more details...
Pass a flag into the program (in case in future the first line is also data) that causes the program to skip the first line of text.
If it's the same column names as used in the database you could also parse it to grab the column names from that instead of hard-coding them too (assuming that's what you're doing currently :)).
As a final note, if you're using a MySQL database and you have command line access, you may want to look at the LOAD DATA LOCAL INFILE syntax which lets you import pretty arbitrarily defined CSV data.
For future reference have a look at this awesome package: FileHelpers Library
I can't add links just yet but google should help, it's on sourceforge
It makes our lives here a little easier when people insist on using files as integration
I am working on a feather that export some tables(~50) to a disk file and import the file back to database. Export is quite easy, serialize dataset to a file stream. But when importing: table structure need to be determined dynamically.What I am doing now :
foreach table in dataset
(compare table schemas that in db and imported dataset)
define a batch command
foreach row in table
contruct a single insert sqlcommand,add it to batch command
execute batch insert command
this is very inefficient and I also I meet some problem to convert datatype in dataset datatable to database datatable. So I want to know is there some good method to do so?
Edit:
In fact, import and export is 2 functions(button) in program, On UI, there is a grid that list lots of tables, what I need to implement is to export selected tables's data to a disk file and import data back to database later
Why not use SQL Server's native Backup and Restore functionality? You can do incremental Restores on the data, and it's by far the fastest way to export and then import data again.
There are a lot of very advanced options to take into account some fringe cases, but at it's heart, it's two commands: Backup Database and Restore Database.
backup database mydb to disk = 'c:\my\path\to\backup.bak'
restore database mydb from disk = 'c:\my\path\to\backup.bak'
When doing this against TB-sized databases, it takes about 45 minutes to an hour in my experience. Much faster than trying to go through every row!
I'm guessing you are using SQL server? if so I would
a) make sure the table names are showing up in the export
b) look into the BulkCopy command. that will allow you to push an entire table in. so you can loop through the datatables and bulk copy each one in.
using (SqlBulkCopy copy = new SqlBulkCopy(MySQLExpConn))
{
copy.ColumnMappings.Add(0, 0);
copy.ColumnMappings.Add(1, 1);
copy.ColumnMappings.Add(2, 2);
copy.ColumnMappings.Add(3, 3);
copy.ColumnMappings.Add(4, 4);
copy.ColumnMappings.Add(5, 5);
copy.ColumnMappings.Add(6, 6);
copy.DestinationTableName = ds.Tables[i].TableName;
copy.WriteToServer(ds.Tables[i]);
}
You can use XML serializatin but you will need good ORML tool like NHibernation etc to help you with it. XML Serialization will maintain its data type and will work flowlessly.
You can read entire table and serialize all values into xml file, and you can read entire xml file back into list of objects and you can store them into database. Using good ORML tool you will not need to write any SQL. And I think they can work on different database servers as well.
I finally choose SqlCommandBuilder to build insert command automatically
See
SqlCommandBuilder Class