Data Sync with Sync Framework

Data Sync with Sync Framework - C#

I've been working with the Microsoft Sync Framework in C#, trying to synchronize about 35 tables from a local database to a database on a central server. The main problem is that one of my tables has roughly 1 million records, and even with filters it takes too long to synchronize. I don't know if there's a way to speed this up, or another framework that works a bit faster.
A complete sync takes about 4-6 hours.
Any help would be appreciated. Thanks in advance.

Sync framework is just like any other database app and you can troubleshoot/optimize it similarly.
You can enable Sync Framework tracing to see where it's spending its time: querying for changes, serialising changes, applying changes, locks/concurrency issues, network latency, etc.
Do you have pre-existing data on the destination DB? Have you tried initializing replicas from backup? Have you enabled batching? etc...

I am not sure if you are still looking for a solution, but when using sync, choose to upload/download incremental changes. That way, only new changes are transferred each time you sync. The process will be slow on the first sync only; after that, only changes are uploaded/downloaded, so it should be fairly quick.

If the table has a million records, it will take a lot of time to sync. It may be better to build your own sync architecture. We faced the same issue and created our own framework with REST-based web APIs returning JSON. Only rows updated since the last sync are transferred, based on their last-updated time, so there is no need to compare the whole database.
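The last-updated approach described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the answerer's actual framework: the `Row` type, the `LastUpdatedUtc` column and the `IncrementalSync` helper are all hypothetical names, and a real implementation would run the filter as a database query behind the web API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape: any table with a key and a last-updated timestamp.
public record Row(int Id, string Payload, DateTime LastUpdatedUtc);

public static class IncrementalSync
{
    // Return only the rows modified after the previous sync's watermark.
    public static List<Row> ChangedSince(IEnumerable<Row> source, DateTime lastSyncUtc) =>
        source.Where(r => r.LastUpdatedUtc > lastSyncUtc).ToList();

    // Upsert the changed rows into the target and return the new watermark,
    // which the client stores and sends with its next sync request.
    public static DateTime Apply(IDictionary<int, Row> target, IEnumerable<Row> changes, DateTime lastSyncUtc)
    {
        var watermark = lastSyncUtc;
        foreach (var row in changes)
        {
            target[row.Id] = row;
            if (row.LastUpdatedUtc > watermark) watermark = row.LastUpdatedUtc;
        }
        return watermark;
    }
}
```

Note that a timestamp filter like this does not see deleted rows; those would need something extra, such as a "deleted" flag column that is updated instead of removing the row.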

Related

How to implement a C# Winforms database application with synchronisation?

Background
I am developing a C# WinForms application, currently at about 11,000 LOC. The UI and logic are about 75% done, but there is no persistence yet. There are hundreds of attributes on the forms and 23 entities/data classes.
Requirement
The data needs to be kept in an SQL database. Most of the users operate remotely and we cannot rely on them having a connection so we need a solution that maintains a database locally and keeps it in synch with the central database.
Edit: Most of the remote users will only require a subset of the database in their local copy. This is because if they don't have access permissions (as defined and stored in my application) to view other users' records, they will not receive copies of them during synchronisation.
How can I implement this?
Suggested Solution
I could use the Microsoft Entity Framework to create a database and the link between database and code. This would save a lot of manual work as there are hundreds of attributes. I am new to this technology but have done a "hello world" project in it.
For data synch, each entity would have an integer primary key ID. Additionally it would have a secondary ID column which relates to the central database. This secondary column would contain nulls in the central database but would be populated in the local databases.
For synchronisation, I would write code which copies the records and assigns the IDs accordingly. I would need to handle conflicts.
Can anyone foresee any stumbling blocks to doing this? Would I be better off using one of the recommended solutions for data synchronisation, and if so, would these work with the Entity Framework?

Synching data between relational databases is a pain. Your best course of action probably depends on: How many users will there be? How probable are conflicts (i.e. that users will work offline on the same data)? Also, possibly, what kind of manpower do you have (do you have proper DBAs/SQL Server devs standing by to assist with the SQL part, or are you just .NET devs)?
I don't envy you this task, it smells of trouble. I'd especially be worried about data corruption and spreading that corruption to all clients rapidly. I'd put extreme countermeasures in place before any data in the remote DB gets updated.
If you predict a lot of conflicts (the same chunk of data gets modified many times by multiple users), I'd at least consider creating an additional 'merge' layer to figure out the correct order of operations to perform on the remote db.
One thought (it might be very wrong and crazy, but it is what popped into my mind) would be to use JSON Patch on the entities, be they actual domain objects or some configuration containers. All the changes the user makes are recorded as JSON Patch statements and applied to the local db; when the user is online, they are submitted, with timestamps, to a merge provider. The JSON Patch statements from different clients could be grouped by entity id and sorted by timestamp, and the user could get feedback on what operations from other users are queued and manually amend them. Those grouped statements could even be stored as files in a git repo. Then at some pre-defined interval, or when triggered manually, the update would be performed by a server-side app and saved to the remote db. After this, the users' local copies would be refreshed from the server.
It's just a rough idea, but I think you need something with similar capability. It doesn't have to be JSON Patch + Git; you can do it in probably hundreds of ways. I don't think, though, that you will get away with just going through the local/remote db and making updates/merges. Imagine a scenario where one user updates some data (let's say 20 fields) offline, another makes completely different updates to 20 fields, and 10 of those fields are common to both users. Now, what should the synch process do? Apply the earlier and then the later changes? I'm fairly certain that both users would be furious, because their input was 'atomic': either everything is changed, or nothing is. The later 'commit' must either be rejected, or the users should have an option to amend it in light of the new data. That depends heavily on what your data is and, as I said, on the number and behaviour of users. Heck, even time zones become important here: if your users are all in one time zone, you might get away with predefined times of day when the system synchs, but there is no way you'll convince people with many different business hours that the 'synch session' will happen at e.g. 11 AM, when they are usually giving a presentation to management or something ;)
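To make the overlapping-edits scenario concrete, here is a small sketch of how a merge layer might detect which fields two offline change sets disagree on before deciding whether to reject or amend the later commit. The `ConflictCheck` class and the dictionary-of-strings representation of a change set are assumptions made for illustration, not part of any existing library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConflictCheck
{
    // Each change set maps field name -> new value for one entity.
    // A field conflicts when both users changed it to different values;
    // identical edits to the same field are not treated as conflicts.
    public static List<string> ConflictingFields(
        IReadOnlyDictionary<string, string> first,
        IReadOnlyDictionary<string, string> second) =>
        first.Keys
             .Where(k => second.TryGetValue(k, out var v) && v != first[k])
             .OrderBy(k => k, StringComparer.Ordinal)
             .ToList();
}
```

If the returned list is empty, the two change sets touch disjoint (or identical) fields and could be merged automatically; otherwise the later commit would go back to its author for review, in line with the 'atomic' expectation described above.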

Entity Framework Code-First too slow at startup

I know this has been asked a lot before, but I still haven't found a working fix.
I'm creating a desktop application which will regularly be started and stopped. The database is a MySQL database stored online and I'm using the newest version of EF and the MySQL connector.
I'm also working code-first. For now, I only have 3 small entities, but these will grow a lot in time. The database is generated already at startup, so nothing needs to be created anymore.
Every time the application is started (even when deployed), retrieving the first data from the database (only about 50 records, but I've also tried only 10 and it doesn't make any difference) is slow: around 5 seconds. After that, the next queries are pretty fast (around 1 second).
I've already tried generating views, but it doesn't change anything. I also create only 1 DbContext.
If I use ADO.NET instead, I get the results almost instantly, even on the first query (retrieving all 50 records), so it has nothing to do with a connection issue.
I'm not sure what information I have to give in order for you to help me, so feel free to ask more info.
Any idea what I could try? Is it really supposed to take like 5 seconds before the user can start working with the program?

In EF, the first time a query is run it has to be compiled, even though the program itself is already compiled.
I would suggest reading this http://www.codeproject.com/Articles/38174/How-to-improve-your-LINQ-query-performance-by-X and this https://msdn.microsoft.com/en-us/library/vstudio/bb896297%28v=vs.100%29.aspx and trying again to see if this helps.
Good luck!

What's the best way to compare large amounts of data between two different databases?

I have a desktop application receiving data from a webservice and storing it in a local PostgreSQL database (the webservice retrieves its data from a SQL Server database). At the end of the process there will be a minimum of 2.5 million entries in a table in my local database, but these will be received from the webservice in batches of about 300 rows at a time, over a time frame of about 15 days.
What I need is a way to make sure that my local database has the exact same information the server's database has.
I'm thinking of creating some sort of checksum for each batch received and then, after all batches have been received, another checksum of the entire table, but I don't know if this is the best solution and, if it is, I don't know where to start.
PS: TCP already handles integrity check so I don't even know if this is needed, but it is critical that the data are the same.

I can see how a checksum could possibly be useful, but the amount of transformation you're doing would probably make it impractical. You'd have to derive the checksum on either the original form of the data or on the transformed form; it wouldn't be valid on both.
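As a rough illustration of a per-batch checksum computed over a single agreed form of the data, the sketch below hashes a batch of rows that have already been rendered to one canonical string each. The `BatchChecksum` class and the idea of a `"id|col1|col2"` style canonical row are assumptions for the example; both sides would have to produce exactly the same strings for the hashes to be comparable, and rows are assumed not to contain newline characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class BatchChecksum
{
    // Hash a batch of rows given in one canonical text form.
    // Rows are sorted first so the result does not depend on arrival order.
    public static string Compute(IEnumerable<string> canonicalRows)
    {
        var joined = string.Join("\n", canonicalRows.OrderBy(r => r, StringComparer.Ordinal));
        using var sha = SHA256.Create();
        var hash = sha.ComputeHash(Encoding.UTF8.GetBytes(joined));
        return BitConverter.ToString(hash).Replace("-", "");
    }
}
```

The sender would compute this over each batch before sending, and the receiver would recompute it from what was actually stored, using the same canonical form.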
You have some strange constraints (been there myself), so it's kind of hard to come up with a clear strategy without knowing all the details. Maybe one of the following suggestions would work.
1. A simple count(*) on the SQL Server side and on the PostgreSQL side after the migration is complete.
2. Dump out a list of keys from the SQL Server side and from the PostgreSQL side after the migration is complete, and then sort and compare those files.
3. If 1 and 2 aren't possible because of limited access to SQL Server, maybe dump out the results of the web service calls to a single file location as you go along, and then extract the same data from PostgreSQL at the end, and compare those files.
There are numerous tools available for comparing files if you choose options 2 or 3.
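If you would rather do the key-list comparison in code than with a file-diff tool, a minimal sketch could look like this. The `KeyDiff` class is hypothetical, and it assumes the keys are numeric and that both key lists fit in memory (a few million `long` values should).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyDiff
{
    // Compare the key list dumped from the server with the local one.
    // Returns keys present on the server but missing locally, and keys
    // present locally that the server does not have.
    public static (List<long> MissingLocally, List<long> ExtraLocally) Compare(
        IEnumerable<long> serverKeys, IEnumerable<long> localKeys)
    {
        var server = new HashSet<long>(serverKeys);
        var local = new HashSet<long>(localKeys);
        var missing = server.Except(local).OrderBy(k => k).ToList();
        var extra = local.Except(server).OrderBy(k => k).ToList();
        return (missing, extra);
    }
}
```

Two empty lists mean both sides hold the same set of keys; this says nothing about whether the non-key columns match, which is where a checksum or full file comparison would come in.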

Do you have control over the web service and the SQL Server DB? If you do, SQL Server Change Tracking should do the trick. Change Tracking (documented on MSDN) tracks every change (or just the changes you care about) on a per-table basis. Each time you synchronize, you pass it your last version number and it returns the changeset required to bring you up to date.

Any considerations before jumping into SQLite?

I have a WCF application that at present uses XML-based file storage for the data used to generate reports. Besides this, processing decisions are made based on information stored in these XML files.
I'm now hitting volumes of around 30 000 text files. This is incredibly taxing, and the application at times comes to a grinding halt.
I've always wanted to swap out the XML DAL in favor of an RDBMS, but project managers simply won't allow it. They would, however, be willing to look at a serverless solution, for example SQLite. I am really tempted to just dive right in and start using it as a replacement DAL (Data Access Layer).
I would need no more than around 20 tables in the whole solution, and I would expect to get no more than around 20 000 - 100 000 transactions a day, however this is extreme, the real volumes would be less than this in most cases.
Update
I am not expecting a great deal of simultaneous connections, when I say transactions, I essentially mean 1 or 2 clients that make calls and execute against the database in order. At times there might be a possibility of external clients making quick calls to the DB. But the bulk of DB connections will be done by my WCF service, which is a back end scheduled task, not serving 100's of people across an organization.
Another good point is that I only need to retain data for 90 days, so the DB shouldn't grow too big.
My main concerns are:
How reliable is SQLite? What if the DB file gets corrupted: will I lose all processing data? How easy is the DB to back up? Will it handle my volumes? And lastly, how well does the .NET provider work (located here: http://sourceforge.net/projects/sqlite-dotnet2/)?
If you have any experience with SQLite, please post your experiences so I can make an informed decision on whether to switch.
Thanks in advance...

SQLite is as reliable as your OS and hardware.
Its transactional rate is similar to SQL Server's, and often faster because it's all in process.
The .NET ADO provider works great.
To back up the DB, stop the service and copy the file. If the journal file is present copy it too.
EDIT: SQLite uses UTF-8 by default, so with the ADO.NET provider you should be able to avoid losing accents (as long as you follow the typical XML-in-string rules).
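The backup advice above (stop the service, copy the file, and copy the journal if present) could be scripted along these lines. This is only a sketch: the `SqliteFileBackup` class is a made-up name, and it assumes the default rollback-journal naming, where the journal sits next to the database as `<database file>-journal`, and that nothing has the database open while the copy runs.

```csharp
using System;
using System.IO;

public static class SqliteFileBackup
{
    // Copy the database file and, if present, its rollback journal.
    // Assumes no process has the database open while this runs.
    public static void Backup(string dbPath, string backupDir)
    {
        Directory.CreateDirectory(backupDir);
        File.Copy(dbPath, Path.Combine(backupDir, Path.GetFileName(dbPath)), overwrite: true);

        var journal = dbPath + "-journal";
        if (File.Exists(journal))
            File.Copy(journal, Path.Combine(backupDir, Path.GetFileName(journal)), overwrite: true);
    }
}
```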

You could consider Microsoft's SQL Server Compact Edition.
It's like SQLite in being a single-file embedded database, but has better integration with the .NET Framework :)
SQLite seems reliable, but even with Microsoft's one, don't expect to receive much support in case of a corrupted database.

Given your transaction volume I'd say the fact that the DB itself is a single monolithic file with only file system locking available could be a problem.
There is no row based locking as far as I know.

I used SQLite with the .NET provider without problems in a single-user environment, except for one concern: accents, which didn't display correctly. Backup is quite simple: the SQLite database is a single file. Simply copy it.

I use Sqlite for storing XML config data and have had no problems with it. I use the System.Data.Sqlite provider: http://sqlite.phxsoftware.com/. It's solid and has a good support forum. It also includes a LINQ provider. It also integrates with VS 2008 so you can use Server Explorer to query tables. The examples and documentation also show how to use parameterized commands and transactions for increased performance.
The release candidate for LinqPad now supports Sqlite: http://www.linqpad.net/Beta.aspx.
Sqlite stores everything in a single file, which can be backed up like any other binary file.
Sqlite only supports file-level locking, but that shouldn't present a performance problem, since it doesn't sound like you'll have a large number of simultaneous transactions.
Unicode shouldn't be a problem. This link in the forum addresses an area where someone was trying to read unicode characters with an incompatible utility http://sqlite.phxsoftware.com/forums/t/954.aspx.
This site shows how to do case-insensitive UTF8 comparisons using System.Data.Sqlite via a custom collator, with Russian characters as an example: http://www.codeproject.com/KB/database/SQLiteUTF8CIComparison.aspx.

Is it possible to sync not whole table data in MS Sync Framework?

I have a mobile application, so I don't want to send/receive all the changes in the tables, just the data that meets some filter criteria. Is it possible to achieve this with Sync Framework? If it is, please point me to some resources to read about it, because I have found almost nothing.
Thank You.

Yes, it's possible. For example, you might only want to sync the records relating to a specific store, rather than all the changes in the store table.
You do this by adding a parameter to the SyncParameters collection, e.g.
m_SyncAgent.Configuration.SyncParameters.Add("@ParamName", paramValue)
This passes the parameter to the server side of the sync process, where you can use it to select only the data you want to include.

It's definitely possible with SQL Server Replication. You can select which tables and fields to publish, and even apply filters to the publication. I'm not familiar with Sync Framework, but replication subscriptions appear in the Sync Center, so my assumption is that Sync Framework builds on replication.
