Fastest way to get data from a remote server - C#

I'm creating a Windows application in which I need to get data from one table using ADO.NET (or any other way in C#, if there is one). The table apparently has around 100,000 records and it takes forever to download.
Is there any faster way to get the data?
I tried a DataReader, but it still isn't fast enough.

The data-reader API is about as direct as you can get. The important question is: where is the time going?
is it bandwidth in transferring the data?
or is it in the fundamental query?
You can find out by running the query locally on the machine and seeing how long it takes. If bandwidth is your limit, then all you can really try is removing columns you don't actually need (don't do select *), or paying for a fatter pipe between you and the server. In some cases, querying the data locally and returning it in some compressed form might help - but then you're really talking about something like a web service, which has other bandwidth considerations.
More likely, though, the problem is the query itself. Things like the following often help:
writing sensible TSQL
adding an appropriate index
avoiding cursors, complex processing, etc.
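For instance, a minimal sketch of reading only the columns you actually need with a SqlDataReader (the table, column, and connectionString names here are made up for illustration; requires System.Data.SqlClient):

const string sql = "SELECT Id, Name, CreatedOn FROM dbo.SomeTable";

using (var con = new SqlConnection(connectionString))   // connectionString assumed to exist
using (var cmd = new SqlCommand(sql, con))
{
    con.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // read by ordinal to avoid repeated name lookups
            int id = reader.GetInt32(0);
            string name = reader.GetString(1);
            DateTime createdOn = reader.GetDateTime(2);
            // ... process the row as it streams in ...
        }
    }
}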

You might want to implement a need-to-know approach: only pull down the first chunk of data that is needed, and then, when the next set is needed, pull those rows.
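For example, a sketch of such chunked retrieval with OFFSET/FETCH (available from SQL Server 2012 onwards; pageIndex, pageSize and connectionString are assumptions for illustration):

const string sql = @"SELECT Id, Name
                     FROM dbo.SomeTable
                     ORDER BY Id
                     OFFSET @Offset ROWS FETCH NEXT @PageSize ROWS ONLY";

using (var con = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(sql, con))
{
    cmd.Parameters.AddWithValue("@Offset", pageIndex * pageSize);
    cmd.Parameters.AddWithValue("@PageSize", pageSize);
    con.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // materialize only this chunk of rows
        }
    }
}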

It's probably your query that is slow, not the streaming process. You should show us your SQL query; then we could help you improve it.
Assuming you want to get all 100,000 records from your table, you could use a SqlDataAdapter to fill a DataTable, or a SqlDataReader to fill a List<YourCustomClass>.
The DataTable approach (since I don't know your fields, it's difficult to show a class):
var table = new DataTable();
const string sql = "SELECT * FROM dbo.YourTable ORDER BY SomeColumn";
using (var con = new SqlConnection(Properties.Settings.Default.ConnectionString))
using (var da = new SqlDataAdapter(sql, con))
{
    da.Fill(table);
}
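And a rough sketch of the SqlDataReader-into-List<YourCustomClass> alternative mentioned above (the Id and Name properties are placeholders, since the real fields aren't known):

var items = new List<YourCustomClass>();
const string sql = "SELECT Id, Name FROM dbo.YourTable ORDER BY SomeColumn";
using (var con = new SqlConnection(Properties.Settings.Default.ConnectionString))
using (var cmd = new SqlCommand(sql, con))
{
    con.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            items.Add(new YourCustomClass
            {
                Id = reader.GetInt32(0),
                Name = reader.GetString(1)
            });
        }
    }
}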


Handle a lot of data in ASP.NET MVC

I'm starting to program an application with ASP.NET MVC, with Angular for the front end and SQL Server for the database. In some cases I have a complex query that I have to use and cannot modify because of a business restriction. I am using a structure similar to this one: using simple queries in ASP.NET MVC, but I don't know the correct way to handle a lot of data and show it in the front end.
I have a ViewModel with the data structure of the query results, a DomainModel where the query is located, and the Controller to communicate with the front end.
My problem is that I don't know how to develop what I am trying to do. Right now I create as many objects in a list as there are rows in my query, but when this method runs my computer locks up with no error shown (I can guess it is because it is using all the memory).
Note that the table in the front end only has to show 25 results per page, so maybe I could run the query every time the user chooses a different page of the table, getting a different batch of results each time. I haven't tried this option yet.
This is part of the DomainModel:
public IEnumerable<OperationView> GetOperations()
{
    List<OperationView> Operationslist = new List<OperationView>();
    using (SqlConnection connection = new SqlConnection(connectionString))
    using (SqlCommand command = new SqlCommand("", connection))
    {
        command.CommandText = /*Query joining 8-10 tables*/;
        connection.Open();
        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                var operationView = new OperationView();
                operationView.IdOperacion = reader["ID_OPERACION"].ToString();
                //Loading here some other variables of OperationView
                Operationslist.Add(operationView);
            }
        }
    }
    return Operationslist;
}
This is part of the Controller:
public IEnumerable<OperationView> GetOperaciones()
{
    var Operation = new OperationPDomainModel();
    return Operation.GetOperations();
}
I think my front end and ViewModel are not important for this problem, but I can include them if needed.
Currently, if I try to execute it, the computer shuts down unexpectedly...
As your system is running out of memory, you need pagination.
The paging should be done on the database side; the UI just needs to pass the page index and the number of records displayed per page.
So your query should be something like the one below. Note that ROW_NUMBER() cannot be referenced in the WHERE clause of the same SELECT, so wrap it in a CTE (or derived table):
WITH Paged AS (
    SELECT a, b, c, ROW_NUMBER() OVER (ORDER BY a) AS rnum
    FROM foo
)
SELECT a, b, c
FROM Paged
WHERE rnum BETWEEN (25 * @PageIndex) + 1 AND (25 * @PageIndex) + 25
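A rough sketch of calling that paged query from the DomainModel, passing the page index as a parameter (connectionString and pageIndex are assumptions for illustration; at most 25 rows come back per call):

const string pagedSql = @"
    WITH Paged AS (
        SELECT a, b, c, ROW_NUMBER() OVER (ORDER BY a) AS rnum
        FROM foo
    )
    SELECT a, b, c
    FROM Paged
    WHERE rnum BETWEEN (25 * @PageIndex) + 1 AND (25 * @PageIndex) + 25";

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(pagedSql, connection))
{
    command.Parameters.AddWithValue("@PageIndex", pageIndex);
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            // map at most 25 rows into OperationView objects
        }
    }
}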
There are a few improvements you could make.
Make the call async
The operation hangs because it blocks the main thread. If possible, run this operation asynchronously: use task-based programming to do the work off the UI thread. That keeps the application responsive, although it won't significantly reduce the total load time by itself.
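A minimal sketch of an async version of GetOperations using the async ADO.NET methods (requires System.Threading.Tasks and System.Data.SqlClient; the query text is omitted just as in the question, with commandText standing in for it):

public async Task<List<OperationView>> GetOperationsAsync()
{
    var operations = new List<OperationView>();
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(commandText, connection))   // commandText = the same big join query
    {
        await connection.OpenAsync();
        using (var reader = await command.ExecuteReaderAsync())
        {
            while (await reader.ReadAsync())
            {
                var operationView = new OperationView();
                operationView.IdOperacion = reader["ID_OPERACION"].ToString();
                // load the other OperationView fields here
                operations.Add(operationView);
            }
        }
    }
    return operations;
}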
Use pagination
Get only the number of records that you need to display on the page. Based on the code you have, this should be the biggest improvement. It would also help to add more filters if possible, but fetching only 25 records when you only need 25 is the way to go.
It would also help if you could use modern data-access techniques like EF and LINQ instead of raw ADO.NET.
Use Ajax
Such large processing should be done using AJAX calls. If you do not want the user to wait for the data to be loaded, you can load the page and make the data retrieval a part of a separate AJAX call.
Check this article on how to scroll and view millions of records:
https://www.c-sharpcorner.com/article/how-to-scroll-and-view-millions-of-records/

The best way to manage records in a table

I am sorry to ask a question that has been asked many times, but I still have not found the best answer.
I am worried that the application takes a long time to download or filter the records. Assume I have a table called tbl_customer, and tbl_customer has more than 10,000 rows.
First question: I am using a DataGridView to display the records. Would it be ideal to download all 10,000 rows into the DataGridView, or had I better put a limit on the number of rows?
Second question: what is the best way to filter records in tbl_customer? Do we just need to query using SQL, or LINQ, or is there a better way?
For now, I only use this approach:
DataTable dtCustomer = new DataTable();
using (SqlConnection conn = new SqlConnection(cs.connString))
{
    string query = "SELECT customerName, customerAddress FROM tbl_customer WHERE customerAddress = '" + addressValue + "' ORDER BY customerName ASC;";
    using (SqlDataAdapter adap = new SqlDataAdapter(query, conn))
    {
        adap.Fill(dtCustomer);
    }
}
dgvListCustomer.DataSource = dtCustomer;
Then I learned about LINQ, so I do this:
DataTable dtCustomer = new DataTable();
using (SqlConnection conn = new SqlConnection(cs.connString))
{
    string query = "SELECT * FROM tbl_customer ORDER BY customerName ASC;";
    using (SqlDataAdapter adap = new SqlDataAdapter(query, conn))
    {
        adap.Fill(dtCustomer);
    }
}
var resultCustomer = from row in dtCustomer.AsEnumerable()
                     where row.Field<string>("customerAddress") == addressValue
                     select new
                     {
                         customerName = row["customerName"].ToString(),
                         customerAddress = row["customerAddress"].ToString(),
                     };
dgvListCustomer.DataSource = resultCustomer;
Is the workflow SQL > DataTable > LINQ > DataGridView suitable for filtering records? Better suggestions are most welcome.
Thank you :)
I am worried applications take a long time to download the record or filter the records.
Welcome - you seem to live in a world like mine, where performance is measured in milliseconds, and yes, on a low-power server it will likely take more than a millisecond (0.001 seconds) to hot-load and filter 10,000 rows.
As such, my advice is not to put that database on a tablet or mobile phone, but to use at least a decent desktop-level computer or VM for the database server.
As a hint: I regularly run queries against a billion-row table and it is fast. Anything below a million rows is a joke these days - in fact, it was nothing worth mentioning when I started with databases more than 15 years ago. You are the guy asking whether it is better to have a Ferrari or a Porsche because you are concerned whether either of those cars goes more than 20 km/h.
Would be ideal if I download all the records up to 10,000 rows into the Data Grid View?
In order to get fired? Yes. Old rule with databases: never load more data than you have to, especially when you have no clue. Forget the SQL side - you will get UI problems with 10,000 rows and more, especially usability issues.
Do we just need to query using SQL? or using LINQ?
Hint: LINQ is also using SQL under the hood. The question is more: how much time do you want to spend writing boring, repetitive, handwritten SQL like in your examples? Especially given that you also do "smart" things like referencing fields by name instead of by ordinal, and asking for "select *" instead of a field list - both obvious beginner mistakes.
What you should definitely not do - but you do - is use a DataTable. Get a decent book about programming databases. RTFM may help - and I am not sure what you mean by "LINQ": LINQ is a language feature handled by the compiler; you need an implementation, which could be NHibernate, Entity Framework, Linq2Sql, or BLToolkit, to name just a FEW that turn a LINQ query into a SQL statement.
Workflow SQL> DATATABLE> LINQ > DataGridView is suitable to filter records?
A Ferrari is also suitable for transporting 20 tons of coal from A to B - it is just the worst possible car for it. Your stack is likely the worst I have seen, but it is suitable in the sense that you CAN do it - slowly, with lots of memory use, but you will get a result and hopefully get fired. You pull the data from a high-performance database into a DataTable, then use a non-integrating technology (LINQ to Objects) to filter it (without using the DataTable's indices), only to push it into yet another layer.
Just to give you an idea - this would get you removed from quite a few "beginning programming" courses.
What about:
LINQ. Period.
Pull a collection of business objects that goes to the UI. Period.
Read at least some of the sample code for the technologies you use.
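To make that concrete, here is a hypothetical sketch using Entity Framework (one of the LINQ implementations named above); Customer and MyDbContext are illustrative names, not from the question:

using (var db = new MyDbContext())
{
    var customers = db.Customers
        .Where(c => c.CustomerAddress == addressValue)   // translated to SQL, filtered on the server
        .OrderBy(c => c.CustomerName)
        .Take(100)                                       // never pull more rows than the UI needs
        .ToList();

    dgvListCustomer.DataSource = customers;
}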

"cursor like" reading inside a CLR procedure/function

I have to implement an algorithm on data which is (for good reasons) stored inside SQL server. The algorithm does not fit SQL very well, so I would like to implement it as a CLR function or procedure. Here's what I want to do:
Execute several queries (usually 20-50, but up to 100-200) which all have the form select a,b,... from some_table order by xyz. There's an index which fits that query, so the result should be available more or less without any calculation.
Consume the results step by step. The exact stepping depends on the results, so it's not exactly predictable.
Aggregate some result by stepping over the results. I will only consume the first parts of the results, but cannot predict how much I will need. The stop criteria depends on some threshold inside the algorithm.
My idea was to open several SqlDataReader, but I have two problems with that solution:
You can have only one SqlDataReader per connection and inside a CLR method I have only one connection - as far as I understand.
I don't know how to tell SqlDataReader to read data in chunks. I could not find documentation on how SqlDataReader is supposed to behave. As far as I understand, it prepares the whole result set and would load the whole result into memory, even if I consume only a small part of it.
Any hint how to solve that as a CLR method? Or is there a more low level interface to SQL server which is more suitable for my problem?
Update: I should have made two points more explicit:
I'm talking about big data sets: a query might return 1 million records, but my algorithm would consume only the first 100-200. As I said before, I don't know the exact number beforehand.
I'm aware that SQL might not be the best choice for that kind of algorithm. But due to other constraints it has to be a SQL server. So I'm looking for the best possible solution.
SqlDataReader does not read the whole data set; you are confusing it with the DataSet class. It reads row by row, as the .Read() method is called. If a client does not consume the result set, the server will suspend the query execution because it has no room to write the output into (the selected rows). Execution resumes as the client consumes more rows (i.e. as SqlDataReader.Read is called). There is even a special command behavior flag, SequentialAccess, that instructs ADO.NET not to pre-load the entire row in memory, which is useful for accessing large BLOB columns in a streaming fashion (see Download and Upload images from SQL Server via ASP.Net MVC for a practical example).
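For illustration, a sketch of SequentialAccess streaming a large BLOB column row by row (the command object and the column ordinals are assumptions):

using (var reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
{
    while (reader.Read())
    {
        int id = reader.GetInt32(0);              // columns must be read strictly in order
        var buffer = new byte[8192];
        long offset = 0;
        long bytesRead;
        while ((bytesRead = reader.GetBytes(1, offset, buffer, 0, buffer.Length)) > 0)
        {
            // write each chunk to a file or output stream instead of buffering the whole BLOB
            offset += bytesRead;
        }
    }
}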
You can have multiple active result sets (multiple SqlDataReaders) on a single connection when MARS is enabled. However, MARS is incompatible with SQLCLR context connections.
So you can create a CLR streaming TVF to do some of what you need in CLR, but only if you have a single SQL query source. Multiple queries would require you to abandon the context connection and use a fully fledged connection instead, i.e. connect back to the same instance in a loopback, and this would allow MARS and thus let you consume multiple result sets. But a loopback has its own issues, as it breaks the transaction boundaries you get from the context connection. Specifically, with a loopback connection your TVF won't be able to read the changes made by the same transaction that called the TVF, because it is a different transaction on a different connection.
SQL is designed to work against huge data sets, and is extremely powerful. With set based logic it's often unnecessary to iterate over the data to perform operations, and there are a number of built-in ways to do this within SQL itself.
1) write set based logic to update the data without cursors
2) use deterministic User Defined Functions with set-based logic (you can do this with the SqlFunction attribute in CLR code). Marking a function as non-deterministic has the effect of turning the query into a cursor internally; non-deterministic means the output value is not always the same for the same input.
[SqlFunction(IsDeterministic = true, IsPrecise = true)]
public static int algorithm(int value1, int value2)
{
    int value3 = ... ;
    return value3;
}
3) use cursors as a last resort. This is a powerful way to execute logic per row on the database, but it has a performance impact. It appears from this article that CLR can outperform SQL cursors (thanks Martin).
I saw your comment that the complexity of using set based logic was too much. Can you provide an example? There are many SQL ways to solve complex problems - CTE, Views, partitioning etc.
Of course you may well be right in your approach, and I don't know what you are trying to do, but my gut says leverage the tools of SQL. Spawning multiple readers isn't the right way to approach the database implementation. It may well be that you need multiple threads calling into a SP to run concurrent processing, but don't do this inside the CLR.
To answer your question: with CLR implementations (and IDataReader) you don't really need to page results in chunks, because you are not loading data into memory or transporting data over the network. IDataReader gives you access to the data stream row by row. By the sounds of it, your algorithm determines the number of records that need updating, so when that happens, simply stop calling Read() and end at that point.
SqlMetaData[] columns = new SqlMetaData[3];
columns[0] = new SqlMetaData("Value1", SqlDbType.Int);
columns[1] = new SqlMetaData("Value2", SqlDbType.Int);
columns[2] = new SqlMetaData("Value3", SqlDbType.Int);

SqlDataRecord record = new SqlDataRecord(columns);
SqlContext.Pipe.SendResultsStart(record);

SqlDataReader reader = comm.ExecuteReader();
bool flag = true;
while (reader.Read() && flag)
{
    int value1 = Convert.ToInt32(reader[0]);
    int value2 = Convert.ToInt32(reader[1]);

    // some algorithm
    int newValue = ...;

    // the output values are set on the SqlDataRecord (not the reader), then streamed back
    record.SetInt32(0, value1);
    record.SetInt32(1, value2);
    record.SetInt32(2, newValue);
    SqlContext.Pipe.SendResultsRow(record);

    // keep going?
    flag = newValue < 100;
}
SqlContext.Pipe.SendResultsEnd();
Cursors are a SQL-only feature. If you want to read chunks of data at a time, some sort of paging is required so that only a certain number of records is returned at once. If using LINQ,
.Skip(Skip)
.Take(PageSize)
Skips and takes could be used to limit results returned.
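For example (assuming a LINQ provider such as Entity Framework, and hypothetical context, pageIndex and pageSize variables):

var page = context.SomeTable
    .OrderBy(x => x.Id)              // a stable ordering is needed before Skip/Take
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .ToList();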
You can simply iterate over the DataReader by doing something like this:
using (IDataReader reader = Command.ExecuteReader())
{
    while (reader.Read())
    {
        //Do something with this record
    }
}
This iterates over the results one at a time, similar to a cursor in SQL Server.
For multiple recordsets at once, try MARS
(if SQL Server)
http://msdn.microsoft.com/en-us/library/ms131686.aspx
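MARS is switched on per connection through the connection string; a minimal sketch (server and database names are placeholders):

var connectionString =
    "Data Source=myServer;Initial Catalog=myDb;Integrated Security=True;" +
    "MultipleActiveResultSets=True";

using (var con = new SqlConnection(connectionString))
{
    con.Open();
    // With MARS enabled, two commands on this one connection can each
    // have an open SqlDataReader at the same time.
}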

Very slow running queries across local network in c# apps

I've been developing some small database applications in Visual Studio C# for a while now. I am currently using VS 2010. Up until recently all the apps were run on the same computer that the database was stored on, and everything ran great. Recently I had to start developing some apps that will run on a separate computer on the same local network.
Easy enough, but I run into a problem when running queries to fill controls, such as a grid or even combo box. The problem is that it can take 15-30 seconds per control if my query is pulling a large amount of data. I know this is because the app is sending out my select query, waiting for all of the results to come across the network and then displaying the information. The problem is I don't know what to do about it.
Below is a code snippet (slightly modified to make more sense). It uses a Firebird database, though I use MSSQL and Sybase Advantage as well, with the same results.
FbConnection fdbConnect = new FbConnection();
fdbConnect.ConnectionString = Program.ConnectionString;
fdbConnect.Open();
FbCommand fcmdQuery = new FbCommand();
fcmdQuery.Connection = fdbConnect;
fcmdQuery.CommandText = "select dadda.name, yadda.address, yadda.phone1 from dadda left join yadda on yadda.pk = dadda.yaddapk";
FbDataAdapter fdaDataSet = new FbDataAdapter(fcmdQuery);
DataSet dsReturn = new DataSet();
fdaDataSet.Fill(dsReturn);
fdbConnect.Close();
DataGridView1.DataSource = dsReturn.Tables[0];
Does anyone have any suggestions on how I can speed this up?
You may be returning unnecessary data with that SELECT * statement. It can be wasteful in network traffic and drive down the performance of your application. There are many articles about this and why you should specify your columns explicitly. Here is one in particular.
You can reduce the volume of the response by restricting your columns:
Instead of
select * from SOMETABLE
Try
select a,b,c from SOMETABLE
to retrieve only the data you need.
Your mileage may vary depending on what the table contains. If there are unused BLOB columns, for instance, you are adding considerable overhead to your response.
If you are displaying the data in a grid view and the data set is huge, it is better to do server-side paging so that only a specific number of rows is returned at a time.

SqlDataAdapter vs SqlDataReader

What are the differences between using SqlDataAdapter vs SqlDataReader for getting data from a DB?
I am specifically looking into their Pros and Cons as well as their speed and memory performances.
Thanks
DataReader:
Needs the connection held open until you are finished (don't forget to close it!).
Can typically only be iterated over once
Is not as useful for updating back to the database
On the other hand, it:
Only has one record in memory at a time rather than an entire result set (this can be HUGE)
Is about as fast as you can get for that one iteration
Allows you to start processing results sooner (once the first record is available). For some query types this can also be a very big deal.
DataAdapter/DataSet
Lets you close the connection as soon as it's done loading data, and may even close it for you automatically
All of the results are available in memory
You can iterate over it as many times as you need, or even look up a specific record by index
Has some built-in faculties for updating back to the database
At the cost of:
Much higher memory use
You wait until all the data is loaded before using any of it
So really it depends on what you're doing, but I tend to prefer a DataReader until I need something that's only supported by a dataset. SqlDataReader is perfect for the common data access case of binding to a read-only grid.
For more info, see the official Microsoft documentation.
The answer to that can be quite broad.
Essentially, the major difference for me that usually influences my decisions on which to use is that with a SQLDataReader, you are "streaming" data from the database. With a SQLDataAdapter, you are extracting the data from the database into an object that can itself be queried further, as well as performing CRUD operations on.
Obviously with a stream of data SQLDataReader is MUCH faster, but you can only process one record at a time. With a SQLDataAdapter, you have a complete collection of the matching rows to your query from the database to work with/pass through your code.
WARNING: If you are using a SQLDataReader, ALWAYS, ALWAYS, ALWAYS make sure that you write proper code to close the connection, since the SQLDataReader keeps the connection open. Failing to do this, or to have proper error handling that closes the connection when an error occurs while processing the results, will CRIPPLE your application with connection leaks.
Pardon my VB, but this is the minimum amount of code you should have when using a SqlDataReader:
Using cn As New SqlConnection("..."), _
cmd As New SqlCommand("...", cn)
cn.Open()
Using rdr As SqlDataReader = cmd.ExecuteReader()
While rdr.Read()
''# ...
End While
End Using
End Using
equivalent C#:
using (var cn = new SqlConnection("..."))
using (var cmd = new SqlCommand("...", cn))
{
    cn.Open();
    using (var rdr = cmd.ExecuteReader())
    {
        while (rdr.Read())
        {
            //...
        }
    }
}
A SqlDataAdapter is typically used to fill a DataSet or DataTable and so you will have access to the data after your connection has been closed (disconnected access).
The SqlDataReader is a fast forward-only and connected cursor which tends to be generally quicker than filling a DataSet/DataTable.
Furthermore, with a SqlDataReader, you deal with your data one record at a time, and don't hold any data in memory. Obviously with a DataTable or DataSet, you do have a memory allocation overhead.
If you don't need to keep your data in memory, so for rendering stuff only, go for the SqlDataReader. If you want to deal with your data in a disconnected fashion choose the DataAdapter to fill either a DataSet or DataTable.
Use an SqlDataAdapter when wanting to populate an in-memory DataSet/DataTable from the database. You then have the flexibility to close/dispose off the connection, pass the datatable/set around in memory. You could then manipulate the data and persist it back into the DB using the data adapter, in conjunction with InsertCommand/UpdateCommand.
Use an SqlDataReader when you want fast, low-memory-footprint data access and don't need the flexibility of, for example, passing the data around your business logic. This is more optimal for quick, low-memory retrieval of large data volumes, as it doesn't load all the data into memory in one go - with the SqlDataAdapter approach, the DataSet/DataTable would be filled with all the data, so if there are a lot of rows and columns, that will require a lot of memory to hold.
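A rough sketch of that disconnected round trip, assuming a hypothetical table with an Id primary key (SqlCommandBuilder derives the Insert/Update/Delete commands from the SELECT):

var table = new DataTable();

using (var con = new SqlConnection(connectionString))
using (var da = new SqlDataAdapter("SELECT Id, Name FROM dbo.SomeTable", con))
using (var cb = new SqlCommandBuilder(da))
{
    da.Fill(table);                         // Fill opens and closes the connection itself

    // ... work with the rows in memory, disconnected ...
    if (table.Rows.Count > 0)
        table.Rows[0]["Name"] = "NewName";

    da.Update(table);                       // pushes the in-memory changes back to the database
}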
The Fill function uses a DataReader internally. If your consideration is "Which one is more efficient?", then using a DataReader in a tight loop that populates a collection record by record is likely to put the same load on the system as using DataAdapter.Fill.
(System.Data.dll, System.Data.Common.DbDataAdapter, FillInternal.)
