Fastest way to read data from MySQL using C#

I'm wondering if there is a faster method than using something like:
while (Reader.Read())
to read the results of MySQL SELECT queries.
I'm randomly pulling 10,000 rows from a database and would like to read them as quickly as possible. Is there a way to serialize the results if we know what they are (such as using the metadata to set up a structure)?

Try the MySqlDataAdapter.Fill method to fill a DataTable. Read speed is comparable to a well-written loop over the Read method (it depends on how your while block reads the data), and the main advantage is that you end up with a ready-to-use data collection that you can manage or simply write out to an XML file.
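A minimal sketch of that approach, assuming the MySql.Data (Connector/NET) provider; the query and connection string are placeholders:

    using System.Data;
    using MySql.Data.MySqlClient;

    class FillExample
    {
        static DataTable Load()
        {
            var table = new DataTable();
            using (var adapter = new MySqlDataAdapter(
                "SELECT * FROM my_table ORDER BY RAND() LIMIT 10000",   // hypothetical query
                "Server=localhost;Database=mydb;Uid=user;Pwd=pass;"))
            {
                adapter.Fill(table);           // Fill opens and closes the connection itself
            }
            // table.WriteXml("rows.xml");     // e.g. dump the prepared data to XML
            return table;
        }
    }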

Related

SqlBulkCopy with multiple files

There are many examples of the use of the SqlBulkCopy class in System.Data.SqlClient, but only in relation to a single file. What I would like to know is how you use it when you have multiple files. I have read that it should only be used once, but how do you achieve that? Can someone give me an example of how to use SqlBulkCopy with multiple files?
If you are a novice you probably should call SqlBulkCopy in a loop and be done with it.
If you are advanced and have higher perf requirements, you should pass in an IEnumerable<SqlDataRecord> that is a single combined stream containing the data of all files. That way you have the ability to insert a single stream of rows, which can be more efficient than many smaller inserts because the query processor can sort all rows and insert them sequentially. There are also concerns around minimal logging; sometimes an empty destination table is required.
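A minimal sketch of the simple looping approach, assuming each file can be parsed into a DataTable (the table name and the LoadFileIntoDataTable helper are hypothetical); the single combined stream variant needs a custom IDataReader, similar to the CsvDataReader sketch further down the page:

    using System.Data;
    using System.Data.SqlClient;

    class MultiFileBulkCopy
    {
        static void Load(string[] files, string connectionString)
        {
            using (var bulk = new SqlBulkCopy(connectionString))
            {
                bulk.DestinationTableName = "dbo.Target";       // hypothetical table
                foreach (string file in files)
                {
                    DataTable table = LoadFileIntoDataTable(file);  // however you parse one file
                    bulk.WriteToServer(table);                      // one bulk copy call per file
                }
            }
        }

        static DataTable LoadFileIntoDataTable(string file)
        {
            throw new System.NotImplementedException();             // CSV/XML parsing goes here
        }
    }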

Update table : Dataset or LINQ

I am trying to perform insert/update/delete operations on a SQL table based on an input CSV file which is loaded into a data table from a web application. Currently, I am using a DataSet to do the CRUD operations, but I would like to know if there would be any advantages to using LINQ over a DataSet. I am assuming the code would be reduced and more strongly typed, but I'm not sure if I need to switch to LINQ. Any input appreciated.
Edit
It is not a bulk operation, CSV might contain 200 records max.
I used the LumenWorks CSV reader, which is very fast. It has its own API for extracting data, using the IDataReader interface. Here is a brief example taken from codeplex.com. I use it for all my CSV projects, as it's very fast at reading CSV data. I was surprised at how fast it actually was.
If you start from a reader like this, you're essentially working against a data reader API, so it's easy to move the data into a DataTable (you can create a DataTable matching the result set and copy the data over column by column).
A lot of updates can be slower with LINQ, depending on whether you are using Entity Framework or something else, and what flavor you are using. A DataTable, IMHO, would probably be faster. I had issues with LINQ and change tracking with a lot of objects (if you are using attached entities, not POCOs). I've had pretty good performance taking a CSV file from LumenWorks and copying it to a DataTable.
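A small sketch of the LumenWorks step described above; since CsvReader implements IDataReader, DataTable.Load can consume it directly (the file name is a placeholder and the constructor arguments, reader plus hasHeaders, are from memory):

    using System.Data;
    using System.IO;
    using LumenWorks.Framework.IO.Csv;

    class CsvToDataTable
    {
        static DataTable Load(string path)
        {
            var table = new DataTable();
            using (var csv = new CsvReader(new StreamReader(path), true))  // true = file has headers
            {
                table.Load(csv);    // CsvReader is an IDataReader, so Load streams it in
            }
            return table;
        }
    }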

What is the fastest way to save data in SQL Server using C#?

I am currently working on a small .NET app in C#, that fetches data through some web service.
The data is represented in objects, so it would have been logical to store the data in a document based database, but there is a demand to use SQL Server.
So what might be the fastest way to insert many thousands, perhaps millions, of rows into a database?
I am open to any framework that might support that, but I haven't been able to find any benchmarking on this, e.g. for Entity Framework.
Iterating over the data and doing an insert per row is simply too slow; it would be quicker to dump the data to a file and then do a bulk import using SSIS, but for this scenario I would rather avoid that and keep all the logic in the C# app.
You might want to use the SqlBulkCopy class. It is quite efficient for large data.
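A rough sketch of that, assuming hypothetical Customer objects coming back from the web service and a hypothetical dbo.Customers table: copy the objects into a DataTable and hand it to SqlBulkCopy.

    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    class Customer { public int Id; public string Name; }

    class BulkSave
    {
        static void Save(IEnumerable<Customer> customers, string connectionString)
        {
            var table = new DataTable();
            table.Columns.Add("Id", typeof(int));
            table.Columns.Add("Name", typeof(string));
            foreach (var c in customers)
                table.Rows.Add(c.Id, c.Name);

            using (var bulk = new SqlBulkCopy(connectionString))
            {
                bulk.DestinationTableName = "dbo.Customers";
                bulk.BatchSize = 10000;             // send rows to the server in chunks
                bulk.WriteToServer(table);
            }
        }
    }

For millions of rows you would probably want to write in batches, or wrap the objects in an IDataReader so WriteToServer can stream them, rather than build one huge DataTable in memory.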

C# Importing Large Volume of Data from CSV to Database

What's the most efficient method to load large volumes of data from CSV (3 million+ rows) to a database?
The data needs to be formatted (e.g. the name column needs to be split into first name and last name, etc.).
I need to do this as efficiently as possible, i.e. there are time constraints.
I am leaning towards reading, transforming and loading the data row by row using a C# application. Is this ideal? If not, what are my options? Should I use multithreading?
You will be I/O bound, so multithreading will not necessarily make it run any faster.
Last time I did this, it was about a dozen lines of C#. In one thread it ran the hard disk as fast as it could read data from the platters. I read one line at a time from the source file.
If you're not keen on writing it yourself, you could try the FileHelpers libraries. You might also want to have a look at Sébastien Lorion's work. His CSV reader is written specifically to deal with performance issues.
You could use the csvreader to quickly read the CSV.
Assuming you're using SQL Server, you can use csvreader's CachedCsvReader to read the data into a DataTable, which you can then use with SqlBulkCopy to load into SQL Server.
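A tiny sketch of that combination; since the LumenWorks readers implement IDataReader, they can also be streamed straight into WriteToServer (the file name, table and connection string are placeholders, and the CachedCsvReader constructor arguments are from memory):

    using System.Data.SqlClient;
    using System.IO;
    using LumenWorks.Framework.IO.Csv;

    class CsvBulkLoad
    {
        static void Main()
        {
            using (var csv = new CachedCsvReader(new StreamReader("data.csv"), true))
            using (var bulk = new SqlBulkCopy("Server=.;Database=Target;Integrated Security=true"))
            {
                bulk.DestinationTableName = "dbo.Target";
                bulk.WriteToServer(csv);    // streams the CSV rows into the table
            }
        }
    }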
I would agree with your solution. Reading the file one line at a time should avoid the overhead of reading the whole file into memory at once, which should make the application run quickly and efficiently, primarily taking time to read from the file (which is relatively quick) and parse the lines. The one note of caution I have for you is to watch out if you have embedded newlines in your CSV. I don't know if the specific CSV format you're using might actually output newlines between quotes in the data, but that could confuse this algorithm, of course.
Also, I would suggest batching the insert statements (include many insert statements in one string) before sending them to the database, provided this doesn't cause problems retrieving generated key values that you need for subsequent foreign keys (hopefully you don't need to retrieve any generated key values). Keep in mind that SQL Server (if that's what you're using) can only handle 2100 parameters per batch, so limit your batch size to account for that. And I would recommend using parameterized T-SQL statements to perform the inserts. I suspect more time will be spent inserting records than reading them from the file.
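A sketch of that batching idea, with hypothetical table and column names; two parameters per row and a 500-row batch stays well under the 2100-parameter limit (the connection is assumed to be open already):

    using System.Data.SqlClient;
    using System.Text;

    class BatchedInsert
    {
        const int RowsPerBatch = 500;   // 2 parameters per row -> 1000 parameters per batch

        static void InsertBatch(SqlConnection conn, (string First, string Last)[] rows)
        {
            var sql = new StringBuilder();
            using (var cmd = new SqlCommand { Connection = conn })
            {
                for (int i = 0; i < rows.Length && i < RowsPerBatch; i++)
                {
                    sql.AppendLine($"INSERT INTO dbo.People (FirstName, LastName) VALUES (@f{i}, @l{i});");
                    cmd.Parameters.AddWithValue($"@f{i}", rows[i].First);
                    cmd.Parameters.AddWithValue($"@l{i}", rows[i].Last);
                }
                cmd.CommandText = sql.ToString();
                cmd.ExecuteNonQuery();      // one round trip for the whole batch
            }
        }
    }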
You don't state which database you're using, but given the language you mention is C# I'm going to assume SQL Server.
If the data can't be imported using BCP (which it sounds like it can't if it needs significant processing) then SSIS is likely to be the next fastest option. It's not the nicest development platform in the world, but it is extremely fast. Certainly faster than any application you could write yourself in any reasonable timeframe.
BCP is pretty quick so I'd use that for loading the data. For string manipulation I'd go with a CLR function on SQL once the data is there. Multi-threading won't help in this scenario except to add complexity and hurt performance.
Read the contents of the CSV file line by line into an in-memory DataTable. You can manipulate the data (e.g. split the first name and last name) as the DataTable is being populated.
Once the CSV data has been loaded in memory then use SqlBulkCopy to send the data to the database.
See http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.writetoserver.aspx for the documentation.
If you really want to do it in C#, create & populate a DataTable, truncate the target db table, then use System.Data.SqlClient.SqlBulkCopy.WriteToServer(DataTable dt).
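A sketch of that pipeline; the CSV layout ("Name,Email" with no quoted commas), the column and table names, and the connection string are all assumptions, and a real 3-million-row file may warrant a proper CSV parser and batched writes rather than one big DataTable:

    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    class CsvImport
    {
        static void Main()
        {
            var table = new DataTable();
            table.Columns.Add("FirstName", typeof(string));
            table.Columns.Add("LastName", typeof(string));
            table.Columns.Add("Email", typeof(string));

            foreach (string line in File.ReadLines("people.csv"))   // streams one line at a time
            {
                // skip a header line here if the file has one
                string[] fields = line.Split(',');
                string[] name = fields[0].Split(new[] { ' ' }, 2);  // "First Last" -> two columns
                table.Rows.Add(name[0], name.Length > 1 ? name[1] : "", fields[1]);
            }

            using (var bulk = new SqlBulkCopy("Server=.;Database=Target;Integrated Security=true"))
            {
                bulk.DestinationTableName = "dbo.People";
                bulk.WriteToServer(table);
            }
        }
    }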

How to best insert 350,000 rows with ADO.Net

I have a csv file with 350,000 rows, each row has about 150 columns.
What would be the best way to insert these rows into SQL Server using ADO.Net?
The way I've usually done it is to create the SQL statement manually. I was wondering if there is any way I can code it to simply insert the entire DataTable into SQL Server, or use some similar shortcut.
By the way I already tried doing this with SSIS, but there are a few data clean-up issues which I can handle with C# but not so easily with SSIS. The data started as XML, but I changed it to CSV for simplicity.
Make a class "CsvDataReader" that implements IDataReader. Just implement Read(), GetValue(int i), Dispose() and the constructor; you can leave the rest throwing NotImplementedException if you want, because SqlBulkCopy won't call them. Use Read() to handle reading each line and GetValue() to read the i-th value in the line (a rough sketch follows below).
Then pass it to the SqlBulkCopy with the appropriate column mappings you want.
I get about 30,000 records per second insert speed with that method.
If you have control of the source file format, make it tab delimited as it's easier to parse than CSV.
Edit : http://www.codeproject.com/KB/database/CsvReader.aspx - tx Mark Gravell.
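A minimal sketch of such a CsvDataReader, under the assumptions above (naive comma Split, string columns only; the file, table and connection string are placeholders). FieldCount also has to return something sensible, since SqlBulkCopy uses it to map ordinals:

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    class CsvDataReader : IDataReader
    {
        private readonly StreamReader _file;
        private readonly int _fieldCount;
        private string[] _current = new string[0];

        public CsvDataReader(string path, int fieldCount)
        {
            _file = new StreamReader(path);
            _fieldCount = fieldCount;
        }

        // The members SqlBulkCopy actually needs.
        public bool Read()
        {
            string line = _file.ReadLine();
            if (line == null) return false;
            _current = line.Split(',');         // naive split; no quoted fields
            return true;
        }

        public object GetValue(int i) => _current[i];
        public int FieldCount => _fieldCount;
        public void Dispose() => _file.Dispose();

        // The rest is stubbed out; with plain ordinal mappings it is not called.
        public void Close() => Dispose();
        public bool IsClosed => false;
        public int Depth => 0;
        public int RecordsAffected => -1;
        public bool NextResult() => false;
        public bool IsDBNull(int i) => string.IsNullOrEmpty(_current[i]);
        public string GetName(int i) => "Column" + i;
        public Type GetFieldType(int i) => typeof(string);
        public string GetString(int i) => _current[i];
        public object this[int i] => GetValue(i);
        public int GetValues(object[] values)
        {
            int n = Math.Min(values.Length, _fieldCount);
            for (int i = 0; i < n; i++) values[i] = _current[i];
            return n;
        }
        public object this[string name] => throw new NotImplementedException();
        public int GetOrdinal(string name) => throw new NotImplementedException();
        public DataTable GetSchemaTable() => throw new NotImplementedException();
        public bool GetBoolean(int i) => throw new NotImplementedException();
        public byte GetByte(int i) => throw new NotImplementedException();
        public long GetBytes(int i, long o, byte[] b, int bo, int l) => throw new NotImplementedException();
        public char GetChar(int i) => throw new NotImplementedException();
        public long GetChars(int i, long o, char[] b, int bo, int l) => throw new NotImplementedException();
        public IDataReader GetData(int i) => throw new NotImplementedException();
        public string GetDataTypeName(int i) => throw new NotImplementedException();
        public DateTime GetDateTime(int i) => throw new NotImplementedException();
        public decimal GetDecimal(int i) => throw new NotImplementedException();
        public double GetDouble(int i) => throw new NotImplementedException();
        public float GetFloat(int i) => throw new NotImplementedException();
        public Guid GetGuid(int i) => throw new NotImplementedException();
        public short GetInt16(int i) => throw new NotImplementedException();
        public int GetInt32(int i) => throw new NotImplementedException();
        public long GetInt64(int i) => throw new NotImplementedException();
    }

    class Program
    {
        static void Main()
        {
            using (var reader = new CsvDataReader("rows.csv", 150))    // 150 columns per the question
            using (var bulk = new SqlBulkCopy("Server=.;Database=Target;Integrated Security=true"))
            {
                bulk.DestinationTableName = "dbo.ImportedRows";         // hypothetical table
                bulk.BatchSize = 5000;
                bulk.WriteToServer(reader);     // streams rows straight from the file
            }
        }
    }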
SqlBulkCopy, if it's available. Here is a very helpful explanation of using SqlBulkCopy in ADO.NET 2.0 with C#.
I think you can load your XML directly into a DataSet and then map your SqlBulkCopy to the database and the DataSet.
You should revert to XML instead of CSV, then load that XML file into a temp table using OPENXML, clean up your data in the temp table, and finally process it.
I have been following this approach for huge data imports where my XML files happen to be > 500 MB in size, and OPENXML works like a charm.
You would be surprised at how much faster this would work compared to manual ado.net statements.
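A rough sketch of that route from C#, assuming a hypothetical <rows><row name="..." value="..."/></rows> document shape and hypothetical staging/target tables; for very large files you would likely stream the XML to the server rather than read it all into a string:

    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    class OpenXmlImport
    {
        static void Main()
        {
            string xml = File.ReadAllText("import.xml");    // hypothetical file
            const string sql = @"
                DECLARE @hdoc int;
                EXEC sp_xml_preparedocument @hdoc OUTPUT, @xml;

                SELECT Name, Value
                INTO   #Staging
                FROM   OPENXML(@hdoc, '/rows/row', 1)       -- 1 = attribute-centric mapping
                       WITH (Name  nvarchar(100) '@name',
                             Value nvarchar(100) '@value');

                EXEC sp_xml_removedocument @hdoc;

                -- clean up / transform #Staging here, then move it into the real table
                INSERT INTO dbo.Target (Name, Value)
                SELECT Name, Value FROM #Staging;";

            using (var conn = new SqlConnection("Server=.;Database=Target;Integrated Security=true"))
            using (var cmd = new SqlCommand(sql, conn))
            {
                cmd.Parameters.Add("@xml", SqlDbType.NVarChar, -1).Value = xml;
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }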
