Datagrid with large number of rows

In my WPF application, I've got a screen with a tab control. Five of these tabs contain datagrids which need to display a large number of rows (at least 5000). The tables are bound to ObservableCollections of Part objects. Each row displays around 20 points of part data. My problem is that, after the user enters the information they require and generates the data, clicking on a tab causes the application to hang for 30-60 seconds. After this the datagrid finally loads and, with the right virtualization settings, the grids perform at an acceptable rate (not exactly fast, but not too slow). If I disable virtualization, the program uses far too much memory, and the loading time isn't really affected.
The worst-offending tables consist of about half a dozen template columns. Each template contains controls inside a StackPanel or a Grid; basically each row is split into two, like a double row. This layout is a requirement, and paging is probably not something that the customer is willing to accept.
This is the most important screen in my application and I'm pretty much at a loss about making this work. Is there anything I can do to speed up this process? Perhaps ObservableCollection is the wrong choice?

Can you please provide more details?
Can you check how much time is spent "generating" the 5 collections of 5000 rows each? (This is what I assume you are describing.)
With virtualization "on", what is the UI loading time "after" the collection is assigned to the ItemsSource?
What happens if you bind "ItemsSource" to the respective datagrid only when the TabItem is actually visible/selected?
Do your datagrids have a default sort member path? Grouping? Filter paths?
These are a few things I would look at first.
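To answer the first question above, a rough way to separate data-generation time from UI time is to wrap the generation in a Stopwatch. This is only a sketch: the Part class and the BuildParts method are hypothetical stand-ins for whatever the real application does to produce its rows.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Diagnostics;

// Hypothetical stand-in for the question's Part class.
public sealed class Part
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class GenerationTiming
{
    // Builds one collection of rowCount parts; replace the body with the real generation code.
    public static ObservableCollection<Part> BuildParts(int rowCount)
    {
        var parts = new ObservableCollection<Part>();
        for (int i = 0; i < rowCount; i++)
        {
            parts.Add(new Part { Id = i, Name = "Part " + i });
        }
        return parts;
    }

    public static void Main()
    {
        // Time only the data generation, with no UI involved.
        var stopwatch = Stopwatch.StartNew();
        for (int tab = 0; tab < 5; tab++)
        {
            BuildParts(5000);
        }
        stopwatch.Stop();
        Console.WriteLine("Generating 5 x 5000 rows took " + stopwatch.ElapsedMilliseconds + " ms");
    }
}
```

If this number is small compared with the 30-60 second hang, the time is presumably going into layout and template generation rather than into building the collections.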

Related

Handle rapidly changing datasource for wpf

I have rapidly incoming data (basically rows of a tabular data set). As I receive these rows, I merge/upsert them into the data cache. The data has to be shown on a WPF control (an items control).
The problem:
The data is not directly bound to the user control. A series of filters and grouping/aggregation steps (done using LINQ) is applied to the data before it is shown to the user. Thus what the user sees on the control can change drastically (say he changes the grouping; then all the rows will change).
This is what I am doing as of now:
As the data is coming in VERY rapidly, a thread picks up the data every 2 seconds, applies the filter, groups the data and binds the resulting data set to the WPF items control.
This is definitely not good, as a new data table is being set as the data source every 2 seconds. The application becomes laggy after some time.
What would be the best approach to solve this problem? Thanks.

I'd say the best way to approach this would be to create a view source (in WPF, a CollectionViewSource) and bind the display element directly to that; that way the grouping and filtering are handled by the built-in machinery.
I'd also then look into virtualisation and optimisation which will help with showing a lot of data.
As a more general note, I'd say explore whether you really need all that data to be displayed and at that frequency; it might be a better user experience (and easier on the machine) to aggregate the data and just surface pertinent details, that the user can then drill-down into further as required.
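As an illustration of the last point, here is a minimal sketch of collapsing raw rows into one summary line per group with LINQ, so the control only has to show a handful of items. The RawRow and GroupSummary types and their fields are invented for the example and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one incoming row.
public sealed class RawRow
{
    public string Group { get; set; }
    public decimal Value { get; set; }
}

// Hypothetical summary line shown to the user instead of the raw rows.
public sealed class GroupSummary
{
    public string Group { get; set; }
    public int Count { get; set; }
    public decimal Total { get; set; }
}

public static class Aggregation
{
    // Collapses the raw rows into one summary per group, ordered by group name.
    public static List<GroupSummary> Summarize(IEnumerable<RawRow> rows)
    {
        return rows
            .GroupBy(r => r.Group)
            .Select(g => new GroupSummary
            {
                Group = g.Key,
                Count = g.Count(),
                Total = g.Sum(r => r.Value)
            })
            .OrderBy(s => s.Group, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        var rows = new List<RawRow>
        {
            new RawRow { Group = "B", Value = 2m },
            new RawRow { Group = "A", Value = 1m },
            new RawRow { Group = "B", Value = 3m }
        };
        foreach (var summary in Summarize(rows))
        {
            Console.WriteLine(summary.Group + ": " + summary.Count + " rows, total " + summary.Total);
        }
    }
}
```

The user could then drill down into a single group on demand, which keeps the number of bound items small no matter how fast the raw rows arrive.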

Data Virtualization and user-concurrency

Does anyone have any pointers or examples on how to resolve the many issues that can arise in multi-user scenarios when using data virtualization? Let's say that we are talking about WPF and the DataGrid. Implementing a virtualized collection which loads on demand is not too difficult. However, without a staging area where temporary results of the original query are stored, we get into concurrency issues like:
Loading a new page could fetch incorrect data (a concurrent user adds and removes some records, leaving the total record count the same, but the fetched page now contains duplicates of entries that are already displayed somewhere above in the grid).
Preserving the user's selection in the grid when scrolling and loading new pages: previously selected items may have expired from the cache, and once they are reloaded we may find out that someone deleted them. We can deselect everything and show a message to the user, but :/ Also, if selecting with Shift-click (multiselect) somewhere close to the end of the list, what should be done when some items "appear" in the middle of the list upon loading some of the middle pages (a concurrent user added items)?

It is useful to keep in mind that nothing on the screen of your user is technically up to date. The moment you show it you are lagging on the master dataset.
1) Yes, of course. But you can always keep track of the first record you are showing on your grid and get your next page-sized set from there. Those which are deleted will drop out of view, of course. Optionally, you might use a library such as ZeroMQ or RabbitMQ to broadcast dataset changes and update your datagrid live if the changed rows are currently shown. It still won't be perfectly in sync, obviously, but you will reduce the window in which it is out of sync.
2) When you have selected items, keep track of their primary keys. I don't know what you want to do with those which have already been deleted from the master set, but you can always act on all the others, right? Even if they aren't shown anymore you can track the PKs and reselect them when loading a page.
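Both suggestions can be sketched without any UI code. The following is an assumption-laden illustration, not a complete solution: it uses int primary keys, and the names SelectionTracker, KeysToReselect and PageFrom are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class SelectionTracker
{
    // Selection is remembered by primary key, not by row object or row index.
    private readonly HashSet<int> _selectedKeys = new HashSet<int>();

    public void Select(int key) { _selectedKeys.Add(key); }
    public void Deselect(int key) { _selectedKeys.Remove(key); }

    // Point 2: given the keys on a freshly loaded page, return those to show as selected.
    // Keys deleted by another user are simply absent from pageKeys and are skipped.
    public List<int> KeysToReselect(IEnumerable<int> pageKeys)
    {
        return pageKeys.Where(k => _selectedKeys.Contains(k)).ToList();
    }

    // Point 1: fetch the next page relative to a known key instead of a row offset,
    // so concurrent inserts and deletes above that key cannot shift the page.
    public static List<int> PageFrom(IEnumerable<int> currentKeys, int firstVisibleKey, int pageSize)
    {
        return currentKeys
            .Where(k => k >= firstVisibleKey)
            .OrderBy(k => k)
            .Take(pageSize)
            .ToList();
    }
}

public static class SelectionDemo
{
    public static void Main()
    {
        var tracker = new SelectionTracker();
        tracker.Select(2);
        tracker.Select(5);

        // Key 5 was deleted by another user; the reloaded page holds keys 1, 2 and 3.
        Console.WriteLine(string.Join(",", tracker.KeysToReselect(new[] { 1, 2, 3 }))); // prints 2

        // Key 3 no longer exists, so the page anchored at 3 starts at the next surviving key.
        Console.WriteLine(string.Join(",", SelectionTracker.PageFrom(new[] { 1, 2, 4, 5, 7 }, 3, 2))); // prints 4,5
    }
}
```

Against a real database, PageFrom would presumably become a query with a WHERE clause on the key column and a row limit rather than an in-memory filter.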

Better solution for 600+ elements in a scroll view in WP7

I have a scroll view that holds about 614 Grid controls (it's used as a book index, with each grid pointing to a certain place in the book). Inside each grid there are about 4 TextBlocks showing information about that choice.
The content inside all the TextBlocks is static. The thing is, when loading all that content, the phone becomes quite unresponsive for a while; it takes time to load that page and to navigate to it from other pages.
I want another solution that shows all those items correctly, and each of the 600 grids also needs its own click event handler so that it can point to the right page in the book.
I have read about some hard ways to do that. I was thinking maybe I could load the index as one very "tall" image with the index written inside it, then detect where the user tapped and calculate the index page from that. Is that efficient, or is there something else?

What is happening is that the scroll view iterates through all 600 items to measure the height of each entry, so that it knows how big to render the scrollbars.
It is better to use a ListBox in this case, because WP7 will then render only the visible items. Even then, I've heard of performance issues when you hit 2000 rows.
If you are interested in how virtualization works, Samuel Jack has written a virtual collection that scales well (albeit not for WP7), and he has detailed write-ups on the decisions he made.
https://github.com/samueldjack/VirtualCollection/tree/master/VirtualCollection/VirtualCollection
See his write ups on:
Data Virtualization and Stealth Paging
Silverlights Virtual Collection
A Virtualizing Wrap Panel

Assume two observable collections, A and B. Bind collection A to your UI. Always fill collection B with the data. Every time the UI is refreshed, clear A. Once the UI is loaded, use an event trigger to start moving items from B to A; because A is an ObservableCollection, and provided you are using INotifyPropertyChanged correctly, the items will start appearing on the UI one by one (lazy loading). You may alter this approach according to your implementation. I am following this approach myself. Hope it helps you too.
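A minimal sketch of that staged fill, under a few assumptions: the staging collection B is modelled as a plain Queue rather than a second ObservableCollection, and MoveBatch is a hypothetical helper name. In a real app each call would presumably be driven from a timer or a Loaded handler on the UI thread rather than from a tight loop.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class StagedLoader
{
    // Moves up to batchSize items from the staging queue (B) into the bound collection (A).
    // Returns true while items remain in staging, so the caller knows to schedule another batch.
    public static bool MoveBatch<T>(Queue<T> staging, ObservableCollection<T> bound, int batchSize)
    {
        for (int i = 0; i < batchSize && staging.Count > 0; i++)
        {
            bound.Add(staging.Dequeue());
        }
        return staging.Count > 0;
    }

    public static void Main()
    {
        var staging = new Queue<string>();
        for (int i = 1; i <= 614; i++)
        {
            staging.Enqueue("Index entry " + i);
        }

        var bound = new ObservableCollection<string>();
        int notifications = 0;
        // A bound UI would receive these same notifications and add one item per event.
        bound.CollectionChanged += (sender, e) => notifications++;

        int batches = 0;
        bool more = true;
        while (more)
        {
            more = MoveBatch(staging, bound, 50);
            batches++;
        }
        Console.WriteLine(bound.Count + " items moved in " + batches + " batches, " + notifications + " notifications");
    }
}
```

Keeping the batch size small is what lets the UI stay responsive between batches; the right size is something to measure on the target device.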

Playing with buttons in C#.net using visual-studio

Well, it's not playing actually.
I have a database with a list of about 200 items in it. I've used a DataTable to fetch all the data in a single connection.
Then I created a Windows Forms button that creates a new button for each of the items.
That works and I was able to do it easily.
But I am stuck on two things.
First, I have limited space on my Windows form, which is why I want to load only 30 buttons at first and then, on the next click, load buttons for the next 30 items, and so on.
Second, even if I manage to solve the first problem, how do I arrange them in proper rows and columns?
Please help.

Grab an ordered list of records, split it to a list of "pages" (which is also a list of records) and use navigation buttons to change the context of current page.
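The splitting step could look roughly like this. It is a sketch only: SplitIntoPages is a hypothetical helper, and plain integers stand in for the records read from the DataTable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Splits an ordered list of records into consecutive pages of at most pageSize records.
    public static List<List<T>> SplitIntoPages<T>(IReadOnlyList<T> records, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        var pages = new List<List<T>>();
        for (int start = 0; start < records.Count; start += pageSize)
        {
            pages.Add(records.Skip(start).Take(pageSize).ToList());
        }
        return pages;
    }

    public static void Main()
    {
        List<int> records = Enumerable.Range(1, 200).ToList();
        List<List<int>> pages = SplitIntoPages(records, 30);

        int currentPage = 0; // the navigation buttons would increment or decrement this
        Console.WriteLine(pages.Count + " pages; page " + (currentPage + 1) + " has " + pages[currentPage].Count + " records");
        Console.WriteLine("last page has " + pages[pages.Count - 1].Count + " records");
    }
}
```

With 200 records and a page size of 30 this gives seven pages, the last one holding the remaining 20 records; the form then only ever needs to create buttons for the current page.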

Why don't you use a DataGridView with a BindingSource and a DataGridViewButtonColumn? With this as a starting point you can simply glue them together by calling:
    myDataGridView.DataSource = myBindingSource;
    myBindingSource.DataSource = myDataTable;
Update
Surely you can try to do the whole visualization yourself by using a TableLayoutPanel. But the DataGridView is a control that is specialized to visualize data in a data grid (hence its name).
The grid view is a very complex control, but it has a lot of nice features which make your results look more professional just by configuring some of its properties. For example, set the AutoSizeColumnsMode property to Fill to avoid horizontal scroll bars, and set the AutoSizeMode of individual columns to e.g. DisplayedCells to control which columns should be wrapped, etc.
There are also a lot of features for data validation, formatting, etc. So I think that even if the initial hurdle is a little higher, you get a much better visualization than by trying to do all this manually with a TableLayoutPanel. Last but not least, there are lots of examples of how to use the specific properties on MSDN. If you get really stuck, just search for the problem here on SO or on the web, and if you don't find a proper solution, ask a question here on SO.

Does the Windows Forms DataGridView implement a true virtual mode?

I have a SQL table containing currently 1 million rows that will grow over time.
There is a specific user requirement to present a sortable grid that displays all rows without paging. The user expects to be able to very quickly jump from row to row and top to bottom by using the scrollbar.
I am familiar with "virtual mode" grids that only present a visible subset of the overall data. They can provide excellent UI performance and minimal memory requirements, (I've even implemented an application using this technique many years ago).
The Windows Forms DataGridView provides a virtual mode that looks like it should be the answer. However, unlike other virtual modes I've encountered, it still allocates memory for every row (confirmed in Process Explorer). Obviously this needlessly increases overall memory usage and, while these rows are being allocated, there is a noticeable delay. Scrolling performance also suffers with 1 million+ rows.
A real virtual mode would have no need to allocate any memory for rows that are not being displayed. You just give it the total row count (e.g. 1,000,000) and all the grid does is scale the scrollbar accordingly. When it is first displayed, the grid simply asks for the data of the first n (say 30) visible rows only, so display is instantaneous.
When the user scrolls the grid, a simple row offset and the number of visible rows are provided and can be used to retrieve data from the data store.
Here's an example of the DataGridView code I'm currently using:
    public void AddVirtualRows(int rowCount)
    {
        dataGridList.ColumnCount = 4;
        dataGridList.AutoSizeColumnsMode = DataGridViewAutoSizeColumnsMode.None;
        dataGridList.AutoSizeRowsMode = DataGridViewAutoSizeRowsMode.None;
        dataGridList.VirtualMode = true;
        dataGridList.RowCount = rowCount;
        dataGridList.CellValueNeeded += new DataGridViewCellValueEventHandler(dataGridList_CellValueNeeded);
    }

    void dataGridList_CellValueNeeded(object sender, DataGridViewCellValueEventArgs e)
    {
        e.Value = e.RowIndex;
    }
Am I missing anything here, or is the "virtual" mode of the DataGridView not really virtual at all?
[Update]
It looks like the good old ListView implements exactly the sort of virtual mode I'm looking for. But unfortunately the ListView does not have the cell formatting capabilities of the DataGridView, so I can't use it.
For others who might be able to, I tested it with a four-column ListView (in Details view), VirtualMode = true and VirtualListSize = 100,000,000 rows.
The list is displayed immediately with the first 30 rows visible. I can then scroll rapidly to the bottom of the list with no delay. The memory usage is constant 10 MB at all times.

We just had a similar requirement to be able to display arbitrary, unindexed 1M+ row tables in our application with "very good" performance, using the stock DataGridView. At first I thought it wasn't possible, but with enough head scratching, we came up with something that works very well after spending days poring over Reflector and .NET Profiler. This was difficult to do, but the results were well worth it.
The way we tackled this problem was by creating a class that implements ITypedList and IBindingList (you can call it LargeTableView, for example) to manage the asynchronous retrieval and caching of information from the database. We also created a single PropertyDescriptor-inheriting class (e.g. LargeTableColumnDescriptor) to retrieve data from each column.
When the DataGridView.DataSource property is set to a class implementing IBindingList, it goes into a pseudo-virtual mode that differs from regular VirtualMode: as each row is painted (such as when the user scrolls), the DataGridView accesses the indexer [] on the IBindingList and the respective GetValue methods on each column's PropertyDescriptor to retrieve the values as needed. The CellValueNeeded event is not raised. In our case, we access the database when the indexer is accessed, and then cache the value, so that subsequent repaints don't hit the database.
I performed similar tests regarding memory usage. The DataGridView does allocate an array that is the size of the list (i.e. 1M rows); however, each item in the array initially references a single DataGridViewRow, so memory usage is acceptable. I am not sure if the behavior is the same when VirtualMode is true. We were able to eliminate scroll lag by immediately returning String.Empty in the GetValue method if the row is not cached, and then performing the database query asynchronously. When the async request is finished, you can raise the IBindingList.ListChanged event to signal to the DataGridView that it should repaint the cells, this time reading from the cache, which is readily available. That way, the UI is never blocked waiting for database calls.
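The "return a placeholder now, load in the background" idea can be sketched independently of the grid. This is not the answerer's actual code: AsyncCellCache, the loader delegate and the RowLoaded event are hypothetical names, and in a real IBindingList implementation the RowLoaded handler would presumably marshal to the UI thread and raise ListChanged there.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class AsyncCellCache
{
    private readonly object _gate = new object();
    private readonly Dictionary<int, string> _cache = new Dictionary<int, string>();
    private readonly HashSet<int> _pending = new HashSet<int>();
    private readonly Func<int, Task<string>> _loadRowAsync; // stand-in for the database query

    // Raised once a row has been cached; a real list would raise ListChanged from here.
    public event Action<int> RowLoaded;

    public AsyncCellCache(Func<int, Task<string>> loadRowAsync)
    {
        _loadRowAsync = loadRowAsync;
    }

    // Never blocks: returns the cached value, or an empty placeholder while the load runs.
    public string GetValue(int rowIndex)
    {
        lock (_gate)
        {
            string value;
            if (_cache.TryGetValue(rowIndex, out value)) return value;
            if (!_pending.Add(rowIndex)) return string.Empty; // a load is already in flight
        }
        _ = LoadAsync(rowIndex); // fire and forget; loader errors are not handled in this sketch
        return string.Empty;
    }

    private async Task LoadAsync(int rowIndex)
    {
        string value = await _loadRowAsync(rowIndex).ConfigureAwait(false);
        lock (_gate)
        {
            _cache[rowIndex] = value;
            _pending.Remove(rowIndex);
        }
        Action<int> handler = RowLoaded;
        if (handler != null) handler(rowIndex);
    }
}

public static class AsyncCellCacheDemo
{
    public static void Main()
    {
        var loaded = new TaskCompletionSource<int>();
        var cache = new AsyncCellCache(async row =>
        {
            await Task.Delay(50); // simulated database latency
            return "row " + row;
        });
        cache.RowLoaded += row => loaded.TrySetResult(row);

        Console.WriteLine("[" + cache.GetValue(7) + "]"); // placeholder while loading
        loaded.Task.Wait();
        Console.WriteLine(cache.GetValue(7)); // now served from the cache
    }
}
```

The pending set is what prevents repeated repaints of the same row from issuing duplicate queries while the first one is still running.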
One thing we noticed is that performance is significantly better if you set the DataSource or number of virtual rows before adding the DataGridView to the form - it cut initialization time in half. Also, make sure that you have both Row and Column autosizing set to None or else you will have additional performance problems.
Side note: the way we accomplished "loading" such a large table in our .NET application was by creating a temporary table on the SQL server that listed the primary keys in the desired sort order along with an IDENTITY (row number), and then persisting the connection for subsequent row requests. This naturally takes time to initialize (approx. 3-5s on a reasonably fast SQL server), but without knowledge of available indexes, we have no better alternative. Then, in our ITypedList implementation, we request rows in pages of 100 rows, where the 50th row is the row that is being painted, so that we limit the number of queries performed each time the indexer is accessed, and that we give the appearance of having all of the data available in our application.
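The paging scheme in the side note (pages of 100 rows, with the painted row as the 50th) can be sketched as a small cache. All names here are assumptions for illustration: PagedRowCache, the fetchRange delegate standing in for the query against the temporary table, and FetchCount, which exists only to make the behaviour observable. The sketch also never evicts anything, which a real implementation would need to do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class PagedRowCache<T>
{
    private const int PageSize = 100;
    private const int RowsBefore = 49; // the requested row becomes the 50th row of its page

    private readonly Dictionary<int, T> _rows = new Dictionary<int, T>();
    private readonly Func<int, int, IList<T>> _fetchRange; // (startRow, count) -> rows
    private readonly int _totalRows;

    // Number of range queries issued so far.
    public int FetchCount { get; private set; }

    public PagedRowCache(int totalRows, Func<int, int, IList<T>> fetchRange)
    {
        _totalRows = totalRows;
        _fetchRange = fetchRange;
    }

    public T this[int rowIndex]
    {
        get
        {
            if (rowIndex < 0 || rowIndex >= _totalRows)
                throw new ArgumentOutOfRangeException(nameof(rowIndex));

            T row;
            if (_rows.TryGetValue(rowIndex, out row)) return row;

            // Cache miss: fetch a window of up to 100 rows positioned around the requested row.
            int start = Math.Max(0, rowIndex - RowsBefore);
            int count = Math.Min(PageSize, _totalRows - start);
            IList<T> page = _fetchRange(start, count);
            FetchCount++;
            for (int i = 0; i < page.Count; i++)
            {
                _rows[start + i] = page[i];
            }
            return _rows[rowIndex];
        }
    }
}

public static class PagedRowCacheDemo
{
    public static void Main()
    {
        // The fake data source returns the row number itself as the row value.
        var cache = new PagedRowCache<int>(1000000, (start, count) => Enumerable.Range(start, count).ToList());

        Console.WriteLine(cache[500]);       // miss: fetches rows 451..550
        Console.WriteLine(cache[550]);       // hit: already in the cached window
        Console.WriteLine(cache.FetchCount); // prints 1
    }
}
```

Centering the window on the painted row means that scrolling a little in either direction is served from the cache without another query.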
Further reading:
http://msdn.microsoft.com/en-us/library/ms404298.aspx
http://msdn.microsoft.com/en-us/library/system.componentmodel.ibindinglist.aspx

The answer is NO
see first comment here
if anyone knows a better way please tell us

I would say yes, as long as you stick to the events triggered by the virtual mode behavior (e.g. CellValueNeeded) and take good care to clear your hand-built buffer. I have already displayed large amounts of data, more than 1M rows, without a fuss.
I'm a little curious about Kevin McCormick's implementation, which uses a DataSource based on ITypedList or another IList-related interface. I guess it is just another abstraction layer that uses an internal, transparent buffer to feed the DataGridView, while the grid still relies internally on the native VirtualMode to display the information loaded into the buffer.
Apart from the various ways of working around the virtual mode, to me the only tough problem remaining with the DataGridView is the RowCount limitation: it is still capped at Int32.MaxValue. That is probably a legacy of the WinForms drawing system, just like images or the Size, Width and Height members of WinForms controls. Why not use a UInt32 type?
I assume nobody has ever seen a control or a picture with negative dimensions, yet the type does not really fit the context of use.
See my answer below; it could help if you are still stuck with this problem, even if, I guess, it was resolved ages ago.
https://stackoverflow.com/a/16373108/1906567
