The best way to manage records in a table - C#

I am sorry to ask a question that has been asked many times, but I still have not found the best answer.
I am worried that the application will take a long time to load or filter the records. Assume I have a table called tbl_customer, and tbl_customer has more than 10,000 rows.
First question: I am using a DataGridView to display the records. Would it be ideal to load all 10,000 rows into the DataGridView, or had I better put a limit on the number of rows?
Second question: what is the best way to filter records in tbl_customer? Do we just need to query using SQL, or using LINQ, or is there maybe a better way?
For now, I do it this way:
DataTable dtCustomer = new DataTable();
using (SqlConnection conn = new SqlConnection(cs.connString))
{
    // Note: addressValue is concatenated straight into the SQL text here
    string query = "SELECT customerName, customerAddress FROM tbl_customer WHERE customerAddress = '" + addressValue + "' ORDER BY customerName ASC;";
    using (SqlDataAdapter adap = new SqlDataAdapter(query, conn))
    {
        adap.Fill(dtCustomer);
    }
}
dgvListCustomer.DataSource = dtCustomer;
Then I learned about LINQ, so I did this:
DataTable dtCustomer = new DataTable();
using (SqlConnection conn = new SqlConnection(cs.connString))
{
    string query = "SELECT * FROM tbl_customer ORDER BY customerName ASC;";
    using (SqlDataAdapter adap = new SqlDataAdapter(query, conn))
    {
        adap.Fill(dtCustomer);
    }
}
var resultCustomer = from row in dtCustomer.AsEnumerable()
                     where row.Field<string>("customerAddress") == addressValue
                     select new
                     {
                         customerName = row["customerName"].ToString(),
                         customerAddress = row["customerAddress"].ToString(),
                     };
dgvListCustomer.DataSource = resultCustomer.ToList(); // a DataGridView binds to an IList, not a plain IEnumerable
Is the workflow SQL > DataTable > LINQ > DataGridView suitable for filtering records? Better suggestions are most welcome.
Thank you. :)

I am worried that the application will take a long time to load or filter the records.
Welcome - you seem to live in a world like mine, where performance is measured in milliseconds. And yes, on a low-power server it will likely take more than a millisecond (0.001 seconds) to load and filter 10,000 rows.
As such, my advice is not to put that database on a tablet or mobile phone but to use at least a decent desktop-level computer or VM for the database server.
As a hint: I regularly run queries on a billion-row table and it is fast. Anything below a million rows is a joke these days - in fact, it was nothing worth mentioning when I started with databases more than 15 years ago. You are the guy asking whether it is better to have a Ferrari or a Porsche because you are concerned whether either of those cars goes more than 20 km/h.
Would it be ideal to load all 10,000 rows into the DataGridView?
In order to get fired? Yes. Old rule with databases: never load more data than you have to, especially when you have no clue how much you actually need. Forget the SQL side - you will get UI problems with 10,000 rows and more, especially usability issues.
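As an illustration of "never load more data than you have to" - a minimal sketch, not from the original answer, that filters on the server and caps the number of rows returned (the TOP value of 500 and the @address parameter name are my own assumptions):
DataTable dtCustomer = new DataTable();
using (SqlConnection conn = new SqlConnection(cs.connString))
{
    // Filter on the server and cap the result set; 500 is an arbitrary page size.
    string query = "SELECT TOP 500 customerName, customerAddress FROM tbl_customer " +
                   "WHERE customerAddress = @address ORDER BY customerName ASC;";
    using (SqlDataAdapter adap = new SqlDataAdapter(query, conn))
    {
        adap.SelectCommand.Parameters.AddWithValue("@address", addressValue);
        adap.Fill(dtCustomer);
    }
}
dgvListCustomer.DataSource = dtCustomer;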
Do we just need to query using SQL, or using LINQ?
Hint: LINQ also uses SQL under the hood. The question is more: how much time do you want to spend writing boring, repetitive code for handwritten SQL like in your examples? Especially given that you also do "smart" things like referencing fields by name instead of by ordinal, and asking for "SELECT *" instead of a field list - both obvious beginner mistakes.
What you definitely should not do - but you do - is use a DataTable. Get a decent book about programming databases. RTFM may help - as may the documentation for LINQ (though I am not sure what you mean by "LINQ" - LINQ is a language feature for the compiler; you need an implementation, which could be NHibernate, Entity Framework, Linq2Sql or BLToolkit, to name just a FEW that go from a LINQ query to a SQL statement).
Is the workflow SQL > DataTable > LINQ > DataGridView suitable for filtering records?
A Ferrari is also suitable for transporting 20 tons of coal from A to B - it is just the worst possible car for the job. Your stack is likely the worst I have seen, but it is suitable in the sense that you CAN do it - slowly, with lots of memory use - and you will get a result and hopefully get fired. You pull the data from a high-performance database into a DataTable, then use a non-integrating technology (LINQ) to filter it (without using the DataTable's indices), only to push it into yet another layer.
Just to give you an idea - this would get you removed from quite a few "beginning programming" courses.
What about:
LINQ. Period. Pull a collection of business objects that goes straight to the UI. Period.
Read at least some of the sample code for the technologies you use.
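A minimal sketch of what that might look like, assuming an Entity Framework DbContext (the CustomerContext, its Customers set and the Customer entity are hypothetical names, not from the original post); the filter and ordering run in the database and only the matching business objects come back:
using (var db = new CustomerContext())
{
    // Where/OrderBy are translated to SQL and executed on the server;
    // only the matching rows are materialized as Customer objects.
    var customers = db.Customers
                      .Where(c => c.CustomerAddress == addressValue)
                      .OrderBy(c => c.CustomerName)
                      .Take(500)   // still cap what goes to the grid
                      .ToList();

    dgvListCustomer.DataSource = customers;
}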

Related

Migrating big data to new database

I'd like to transfer a large amount of data from SQL Server to MongoDB (around 80 million records) using a solution I wrote in C#.
I want to transfer say 200,000 records at a time, but my problem is keeping track of what has already been transferred. Normally I'd do it as follows:
Gather IDs from destination to exclude from source scope
Read from source (Excluding IDs already in destination)
Write to destination
Repeat
The problem is that I build a string in C# containing all the IDs that already exist in the destination, for the purpose of excluding them from the source selection, e.g.
select * from source_table where id not in (<My large list of IDs>)
Now you can imagine what happens once I have already inserted 600,000+ records: the string with all the IDs gets large and slows things down even more. So I'm looking for a way to iterate through say 200,000 records at a time, like a cursor, but I have never done something like this, so I am here looking for advice.
Just for reference, I do my reads as follows:
SqlConnection conn = new SqlConnection(myConnStr);
conn.Open();
SqlCommand cmd = new SqlCommand("select * from mytable where id not in (" + bigListOfIDs + ")", conn);
SqlDataReader reader = cmd.ExecuteReader();
if (reader.HasRows)
{
    while (reader.Read())
    {
        // Populate objects for insertion into MongoDB
    }
}
So basically, I want to know how to iterate through large amounts of data without selecting all that data in one go, or having to filter the data using large strings. Any help would be appreciated.
Need more rep to comment, but if you sort by your id column you could change your where clause to become
select * from source_table where *lastusedid* < id and id <= *lastusedid + 200000*
which will give you the range of 200,000 you asked for, and you only need to store a single integer.
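A rough sketch of that idea in C# (not from the original answer; the table and column names come from the question, the batch size is the 200,000 mentioned above, and the MongoDB insertion is left as a placeholder):
const int batchSize = 200000;
long lastUsedId = 0;   // resume point; persist this if the job can be interrupted
long maxId;

using (var conn = new SqlConnection(myConnStr))
using (var cmd = new SqlCommand("select max(id) from source_table", conn))
{
    conn.Open();
    maxId = Convert.ToInt64(cmd.ExecuteScalar());
}

while (lastUsedId < maxId)
{
    using (var conn = new SqlConnection(myConnStr))
    using (var cmd = new SqlCommand(
        "select * from source_table where id > @from and id <= @to order by id", conn))
    {
        cmd.Parameters.AddWithValue("@from", lastUsedId);
        cmd.Parameters.AddWithValue("@to", lastUsedId + batchSize);
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                // Populate objects for insertion into MongoDB here
            }
        }
    }
    lastUsedId += batchSize;
}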
There are many different ways of doing this, but my first suggestion is that you don't try to reinvent the wheel and instead look at existing programs.
There are many programs designed to export and import data between different databases; some are very flexible and expensive, but others come with free options, and most DBMSs include something.
Option 1:
Use the SQL Server Management Studio (SSMS) Export wizard.
This allows you to export to different destinations. You can even write complex queries if required. More information here:
https://www.mssqltips.com/sqlservertutorial/202/simple-way-to-export-data-from-sql-server/
Option 2:
Export your data in ascending ID order.
Store the last exported ID in a table.
Export the next set of data where ID > lastExportedID (see the sketch after these options).
Option 3:
Create a copy of your data in a back-up table.
Export from this table, and delete the records as you export them.
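A rough sketch of Option 2, purely illustrative (the ExportLog checkpoint table, its LastExportedID column and the batch size are assumptions, not from the original answer):
const int batchSize = 200000;

using (var conn = new SqlConnection(myConnStr))
{
    conn.Open();

    // Read the checkpoint: the highest ID exported so far (0 if nothing exported yet).
    long lastExportedId;
    using (var cmd = new SqlCommand("SELECT ISNULL(MAX(LastExportedID), 0) FROM ExportLog", conn))
    {
        lastExportedId = Convert.ToInt64(cmd.ExecuteScalar());
    }

    // Export the next batch in ascending ID order.
    using (var cmd = new SqlCommand(
        "SELECT TOP (@batch) * FROM source_table WHERE id > @last ORDER BY id", conn))
    {
        cmd.Parameters.AddWithValue("@batch", batchSize);
        cmd.Parameters.AddWithValue("@last", lastExportedId);
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                lastExportedId = Convert.ToInt64(reader["id"]);
                // write the row to the destination here
            }
        }
    }

    // Persist the new checkpoint so the next run resumes where this one stopped.
    using (var cmd = new SqlCommand("INSERT INTO ExportLog (LastExportedID) VALUES (@last)", conn))
    {
        cmd.Parameters.AddWithValue("@last", lastExportedId);
        cmd.ExecuteNonQuery();
    }
}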

Sorting inside the database or sorting in code behind? Which is best?

I have a dropdown list in my aspx page. The dropdown list's data source is a DataTable. The backend is MySQL and records get into the DataTable via a stored procedure.
I want to display records in the dropdown menu in ascending order.
I can achieve this in two ways.
1) dt is a DataTable and I am using a DataView to sort the records.
dt = objTest_BLL.Get_Names();
dataView = dt.DefaultView;
dataView.Sort = "name ASC";
dt = dataView.ToTable();
ddown.DataSource = dt;
ddown.DataTextField = dt.Columns[1].ToString();
ddown.DataValueField = dt.Columns[0].ToString();
ddown.DataBind();
2) Or in the select query I can simply say:
SELECT
`id`,
`name`
FROM `test`.`type_names`
ORDER BY `name` ASC ;
If I use the 2nd method I can simply eliminate the DataView part. Assume this type_names table has 50 records and my page is viewed by 100,000 users a minute. Which is the best method considering efficiency and memory handling: get the unsorted records into the DataTable and sort them in the code behind, or sort them inside the database?
Note: only real performance tests can tell you real numbers. Theoretical options are below (which is why I use the word "guess" a lot in this answer).
You have at least 3 (not 2) options:
Sort in the database - if the column being sorted on is indexed, this may make the most sense, because the overhead of sorting on your database server may be negligible. The SQL server's own data caches may make this a super-fast operation. But at 100k queries per minute, measure whether SQL gives noticeably faster results without the sort.
Sort in code behind / middle layer - you likely won't have your own equivalent of an index, and you'd be sorting a list of 50 records 100k times per minute; I would guess that is slower than SQL.
The big benefit only applies if the data is relatively static, or very slowly changing, and the sorted values can be cached in memory for a few seconds to minutes or hours (see the sketch after these options).
The option not in your list - send the data unsorted all the way to the client, and sort it on the client side using JavaScript. This solution may scale the best; sorting 50 records in the browser should not have a noticeable impact on your UX.
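Purely to illustrate the caching idea in option 2 - a sketch, not from the original answer; MemoryCache (System.Runtime.Caching), the cache key and the five-minute expiry are assumptions:
private DataTable GetSortedNames()
{
    var cache = System.Runtime.Caching.MemoryCache.Default;
    var sorted = cache.Get("type_names_sorted") as DataTable;
    if (sorted == null)
    {
        DataTable dt = objTest_BLL.Get_Names();   // existing BLL call from the question
        DataView dataView = dt.DefaultView;
        dataView.Sort = "name ASC";
        sorted = dataView.ToTable();

        // Keep the sorted copy for five minutes; tune this to how often the data changes.
        cache.Set("type_names_sorted", sorted,
                  new System.Runtime.Caching.CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(5) });
    }
    return sorted;
}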
The SQL purists will no doubt tell you that it’s better to let SQL do the sorting rather than C#. That said, unless you are dealing with massive record sets or doing many queries per second it’s unlikely you’d notice any real difference.
For my own projects, these days I tend to do the sorting in C# unless I'm running some sort of aggregate in the statement. The reason is that it's quick, and if you are running any sort of stored proc or function on the SQL server it means you don't need to find ways of passing ORDER BYs into the stored proc.

Add multiple blank rows to a DataTable in a single shot (how SQL inserts data faster than looping)

How do I add multiple blank rows to a DataTable? My situation is like this: I have an SQL query which produces a dataset as a result. It joins 4 tables and has about 100,000 rows to show. While showing the result to the user I need to embed some fields conditionally into the DataTable. So I create a new DataTable with my required fields, loop through the first table, and create new rows in the second. That is what I use as the data source for my grid.
But I can see that SQL runs its query and builds its result set (processing the query and creating rows in RAM) much faster than my simple loop (which only creates rows). Why is that? How can I improve my speed? I suspected the condition checks, so I removed them; I still get the same result.
Sample code
The query is very big and sharing it would be against company policy, so I'll show the parts of the code which explain the situation.
select ts.ItemCode, Sku from tblStock ts join tblItemGroup im on ts.ItemCode = im.ItemCode
is loaded into a dataset dsItems.
Then I create the table dtGridItems like this:
dtGridItems = new DataTable();
dtGridItems.Columns.Add("ImgPath", typeof(String));
dtGridItems.Columns.Add("ItemCode", typeof(String));
dtGridItems.Columns.Add("sku", typeof(long));
Then I fill that DataTable like this:
foreach (DataRow drItem in dsItems.Tables[0].Rows)
{
    DataRow drGridItem = dtGridItems.NewRow();
    if (drItem["ItemCode"].ToString() == "SHIRTS")
    {
        drGridItem["ImgPath"] = shirtsPath;   // placeholder for the shirts image path
    }
    else if (drItem["ItemCode"].ToString() == "Pants")
    {
        drGridItem["ImgPath"] = pantsPath;    // placeholder for the pants image path
    }
    drGridItem["ItemCode"] = drItem["ItemCode"];
    drGridItem["sku"] = drItem["sku"];
    dtGridItems.Rows.Add(drGridItem);
}
This is the way I was using it. Of course the image was not really the column I was using, but to explain the actual code I would need to explain a big part of our software, and only then could I explain where the requirement came from.
Second Edit-----
Since the question is still not clear:
Sorry if my question is not clear. I'm not trying to copy one table to another table; I'm trying to fetch a table into a dataset in .NET. My question is not about the efficiency of SQL Server. My question is how the DataTable fill (in .NET) works faster than my manual filling of the dataset. Is there a way to create 'n' rows in a DataTable in one shot and fill them in a loop? Can I set multiple column values in a single shot in a DataRow?
The quick SQL-only way to insert multiple rows into one table from another is something like:
INSERT INTO Table2 (Field1, Field2, Field3)
SELECT Field1, Field2, Field3
FROM Table1
WHERE ...
Note that the query generating the data can be as complex as you like; you just need to add the INSERT INTO Table (Field, Field, ...) line on top to pipe the results into a table instead of back to the client.
Doing this from .Net using a raw SQL command should take the same length of time to execute (plus the trivial overhead of sending the command to the server and parsing a response).
Of course, manually looping through anything client-side and performing an operation-per-item is going to be considerably slower - now you've got that same overhead for every row.
ORMs are also traditionally quite bad at doing bulk changes to data as they tend to check the state of each object and issue individual update statements (same as in the loop except often with more processing per item).
There are some ways to reduce the overhead (connection pooling being a prime candidate) but you're only mitigating the problem.
In short, if you want SQL to do all the processing in one go, you need to send it all the information it needs to do the processing in one go...
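A minimal sketch of issuing that kind of set-based statement from .NET (not from the original answer; the table and column names are the generic ones used above, and connectionString/someValue are placeholders):
const string sql =
    "INSERT INTO Table2 (Field1, Field2, Field3) " +
    "SELECT Field1, Field2, Field3 FROM Table1 WHERE SomeColumn = @value;";

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(sql, conn))
{
    cmd.Parameters.AddWithValue("@value", someValue);
    conn.Open();
    int rowsCopied = cmd.ExecuteNonQuery();   // one round trip, all rows processed server-side
}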
Edit
I don't know all the optimizations used by the framework but one of them is called Lazy Loading. It populates data when it's used, not when the request is defined. More details.
In short, instead of loading all the data, it populates the object with references to what data should be retrieved.
E.g. instead of rows like...
DataSet = //SELECT Firstname, Lastname, Age from Contacts
Firstname = John, Lastname = Smith, Age = 32
Firstname = Mike, Lastname = Jones, Age = 18
etc...
it populates them with:
Firstname = (SELECT Firstname, Lastname, Age from Contacts)[0].Firstname,
Lastname = (SELECT Firstname, Lastname, Age from Contacts)[0].Lastname,
Age = (SELECT Firstname, Lastname, Age from Contacts)[0].Age
Then, when you actually use the property...
String Name = String.format("{0} {1}", Data[0]['Firstname'], Data[0]['Lastname']);
It retrieves the specific values required. In the example above, it never bothers reading the age from the database.
Have a look at the IQueryable interface and how it's used to implement lazy loading.
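A small sketch of that deferred behaviour, assuming an Entity Framework DbContext (MyDbContext, its Contacts set and the Contact class are hypothetical names, not from the original answer):
using (var db = new MyDbContext())
{
    // No SQL has been executed yet; query is just a description of what to fetch.
    IQueryable<Contact> query = db.Contacts.Where(c => c.Age >= 18);

    // Only when the results are enumerated is the SQL built and run.
    List<Contact> adults = query.ToList();

    // Projecting just the needed columns keeps the query (and the transfer) narrow.
    var names = db.Contacts
                  .Select(c => new { c.Firstname, c.Lastname })
                  .ToList();
}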

Populating multiple tables in typed dataset

I'll try to simplify my problem as much as possible.
Feel free to comment and correct my English. Hope you can understand me.
My main question is:
Is there any simple and "automated" way to fill a table in a dataset with only the rows related to data in another table?
Let's say we have a database with the following schema:
Now I'm trying to do the same thing with the "Orders" table and create a custom method "FillByDate". It works, but there is a small problem:
DataSet1 myDataSetInstance = new DataSet1();
DataSet1TableAdapters.OrdersTableAdapter OrdersTA = new DataSet1TableAdapters.OrdersTableAdapter();
OrdersTA.FillByDate(myDataSetInstance.Orders, new DateTime(2013, 1, 1), DateTime.Now);
foreach (var row in myDataSetInstance.Orders)
{
    MessageBox.Show(row.Comments);          // OK
    MessageBox.Show(row.CustomersRow.Name); // NULL
}
Getting the related row from the Customers table is impossible - first I have to manually fill that table. I can see two ways to do this:
Get the whole content of that table - but that will be A LOT of unneeded data.
Create a custom query in its TableAdapter - something like FillByOrdersByDate(@Date1, @Date2) - which is easy when I have only 2 tables and 1 relation, but with more tables this method will require dozens of custom queries for each TableAdapter.
I really believe that there has to be a "better" way to do this.
A couple of ways to approach this - if you are only going to read the data, you can use a join query to populate the dataset.
Alternatively, you can use a join query to populate the child table. Looking at your example, suppose you wanted to list customers and orders for all customers in a particular city. You have already written a 'FillByCity' query for your Customers TA - you would write a similar FillByCity query for your Orders TA. Yes, you could use a join to the Customers table to do this: SELECT Orders.* FROM Orders INNER JOIN Customers ON Customers.customerid = Orders.customerid WHERE Customers.city = @city
You would then use the DataRelation to link individual customers to their orders, depending on the requirements of your application.
(If perchance you have David Sceppa's 'Programming ADO.Net 2.0' this is dealt with in chapter 7)
"but with more tables this method will require dozens of custom queries for each TableAdapter." Why dozens? I'm not sure where you're going with that.
(PS: your English is fine, apart from mixing up it's and its - but lots of native speakers do that too.)
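To make that join-then-relate approach concrete, here is a hedged sketch with plain DataTables and a DataRelation (the connection string, the @city parameter and the column names are assumptions based on the example above, not code from the original answer):
var ds = new DataSet();

using (var conn = new SqlConnection(connectionString))
{
    // Parent rows: only the customers in the requested city.
    using (var da = new SqlDataAdapter(
        "SELECT customerid, Name, city FROM Customers WHERE city = @city", conn))
    {
        da.SelectCommand.Parameters.AddWithValue("@city", city);
        da.Fill(ds, "Customers");
    }

    // Child rows: only the orders belonging to those customers.
    using (var da = new SqlDataAdapter(
        "SELECT Orders.* FROM Orders INNER JOIN Customers " +
        "ON Customers.customerid = Orders.customerid WHERE Customers.city = @city", conn))
    {
        da.SelectCommand.Parameters.AddWithValue("@city", city);
        da.Fill(ds, "Orders");
    }
}

// Relate the two tables so child rows can be navigated from each parent row.
ds.Relations.Add("CustomerOrders",
    ds.Tables["Customers"].Columns["customerid"],
    ds.Tables["Orders"].Columns["customerid"]);

foreach (DataRow customer in ds.Tables["Customers"].Rows)
{
    DataRow[] orders = customer.GetChildRows("CustomerOrders");
    // use the customer row and its related orders here
}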
There is an obscure... I don't even know what to call it - an ADO.NET SQL extension, or something - a command called SHAPE that describes the relations you're looking for, and ADO.NET uses that "special" SQL to give you a dataset that nicely contains multiple related tables.
SHAPE {select * from customers}
APPEND ({select * from orders} AS rsOrders
RELATE customerid TO customerid)
It works beautifully, but I think it's old and scarcely (un)supported.
MS suggested that the SHAPE provider was to be deprecated and that XML should be used instead (sorry - I lost the link, but that was back in .NET 1.1). I think the FOR XML T-SQL clause does the trick. I haven't done it myself (yet) - using FOR XML to populate a DataSet - but if you follow the link in an answer I left to another similar question, I think it's going to work.

Fastest way to get data from a remote server

I'm creating a Windows application in which I need to get data from one table using ADO.NET (or any other way using C#, if there is one). The database table apparently has around 100,000 records and it takes forever to download.
Is there any faster way to get the data?
I tried a DataReader, but it still isn't fast enough.
The data-reader API is about the most direct you can get. The important thing is: where does the time go?
is it bandwidth in transferring the data?
or is it in the fundamental query?
You can find out by running the query locally on the machine and seeing how long it takes. If bandwidth is your limit, then all you can really try is removing columns you don't actually need (don't do select *), or paying for a fatter pipe between you and the server. In some cases, querying the data locally and returning it in some compressed form might help - but then you're really talking about something like a web service, which has other bandwidth considerations.
More likely, though, the problem is the query itself. Often, things like the following will help:
writing sensible TSQL
adding an appropriate index
avoiding cursors, complex processing, etc.
You might want to implement a need-to-know-basis method: only pull down the first chunk of data that is needed, and then, when the next set is needed, pull those rows.
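One possible shape for that, as a sketch rather than a definitive implementation (it assumes SQL Server 2012+ for OFFSET/FETCH; the table, columns, ordering key and page size are placeholders, not from the original answer):
// Fetch one page of rows at a time instead of the whole table.
DataTable GetPage(string connectionString, int pageIndex, int pageSize)
{
    var page = new DataTable();
    const string sql =
        "SELECT Col1, Col2, Col3 FROM dbo.YourTable " +
        "ORDER BY Id " +
        "OFFSET @offset ROWS FETCH NEXT @pageSize ROWS ONLY;";

    using (var conn = new SqlConnection(connectionString))
    using (var da = new SqlDataAdapter(sql, conn))
    {
        da.SelectCommand.Parameters.AddWithValue("@offset", pageIndex * pageSize);
        da.SelectCommand.Parameters.AddWithValue("@pageSize", pageSize);
        da.Fill(page);
    }
    return page;
}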
It's probably your query that is slow, not the streaming process. You should show us your SQL query; then we could help you to improve it.
Assuming you want to get all 100,000 records from your table, you could use a SqlDataAdapter to fill a DataTable or a SqlDataReader to fill a List<YourCustomClass>.
The DataTable approach (since I don't know your fields, it's difficult to show a class):
var table = new DataTable();
const string sql = "SELECT * FROM dbo.YourTable ORDER BY SomeColumn";
using (var con = new SqlConnection(Properties.Settings.Default.ConnectionString))
using (var da = new SqlDataAdapter(sql, con))
{
    da.Fill(table);
}
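For completeness, a sketch of the SqlDataReader-into-a-list approach mentioned above (the YourCustomClass properties and column names are hypothetical, since the original answer notes the real fields aren't known):
var items = new List<YourCustomClass>();
const string readerSql = "SELECT Id, Name FROM dbo.YourTable ORDER BY SomeColumn";

using (var con = new SqlConnection(Properties.Settings.Default.ConnectionString))
using (var cmd = new SqlCommand(readerSql, con))
{
    con.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            items.Add(new YourCustomClass
            {
                Id = reader.GetInt32(0),    // read by ordinal to avoid repeated name lookups
                Name = reader.GetString(1)
            });
        }
    }
}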
