http://rockingtechnology.blogspot.co.uk/2011/06/oracle-backup-and-restore-code-in-cnet.html
As per the proposed code in the above article, more specifically:
// This snippet requires: using System.Diagnostics;
ProcessStartInfo psi = new ProcessStartInfo();
psi.FileName = "C:/oracle/product/10.2.0/db_1/BIN/exp.exe";
Process process = Process.Start(psi);
process.WaitForExit();   // blocks until exp.exe finishes
process.Close();
How should I expect the database to be affected, with regard to interruption of CRUD operations from elsewhere, once Process.Start(psi) is called and exp.exe is executed?
Using Oracle's exp.exe process - will the sessions of all users currently writing to the db in question be killed, for example? I'd imagine (or at least hope) not, but I haven't been able to find documentation to confirm this.
EXP and IMP are not proper backup and recover tools. They are intended for exchanging data and data structures between Oracle databases. This is also true for their replacement, Data Pump (EXPDP and IMPDP).
Export unloads to a file, so it won't affect any users on the system. However, if you want a consistent set of data, you need to use the CONSISTENT=Y parameter when there are other users connected to the system.
Interestingly Data Pump does not have a CONSISTENT parameter. It unloads tables (or table partitions) as single transactions but the only way to guarantee consistency across all database objects is to use the FLASHBACK_SCN parameter (or kick all your users off the system).
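For illustration, here is how the snippet from the question might pass CONSISTENT=Y to exp.exe; the credentials, schema owner and dump file paths below are placeholders, and the exact parameter list depends on what you are exporting:
// Sketch only: run exp.exe with a read-consistent export.
// The credentials, OWNER and FILE values are placeholders.
using System.Diagnostics;
ProcessStartInfo psi = new ProcessStartInfo();
psi.FileName = "C:/oracle/product/10.2.0/db_1/BIN/exp.exe";
// CONSISTENT=Y asks exp to export all tables as of a single point in time.
psi.Arguments = "scott/tiger OWNER=scott FILE=C:/backups/scott.dmp LOG=C:/backups/scott.log CONSISTENT=Y";
psi.UseShellExecute = false;
using (Process process = Process.Start(psi))
{
    process.WaitForExit();
}
// Data Pump (expdp) has no CONSISTENT parameter; there you would pass
// FLASHBACK_SCN or FLASHBACK_TIME instead.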
"It is all in aid of DR."
As a DR solution this will work, with the following provisos.
The users will lose all data since the last export (obvious)
You will need to ensure the export is consistent across all objects
Imports take time. A lot of time if you have many tables or a lot of data. Plus indexes, etc.
Also remember to export the statistics as well as the data.
You're really asking what effects the (old) Oracle export tool (exp) has on the database. It's a logical backup so you can think of the effects generally the same way you would think of running multiple SELECT queries against your database. That is, other sessions don't get killed but normal locking mechanisms may prevent them from accessing data until exp is done with it and this could, potentially, lead to timeouts.
EXP is the original export utility. It is discontinued and not supported in the most recent version (11g).
You can use EXPDP instead, although the export files are written on the server instead of the client machine.
Both utilities issue standard SELECT commands to the database, and since readers don't interfere with concurrency in Oracle (writers don't block readers, readers don't block writers), this will not block your other DB operations.
Since it issues statements, however, it may increase resource usage, especially I/O, which could impact the performance of concurrent activity.
Whatever tool you use, you should spend some time learning about the options (also since you may want to use it as a logical copy, make sure you test the respective import tools IMP and IMPDP). Also a word of warning: these tools are not backup tools. You should not rely on them for backup.
I want to use SQLite as the database backend for WCF service logging. Everything looks good, but how can I extract the database log file from the live system without getting the database locked while inspecting/analyzing the logs? The system is under very heavy load, and every time I try to take the database log file it turns out to be locked.
SQLite permits multiple processes/applications to have the same database file open for reading and writing (there is some locking involved when writing, but it is not typical to have big problems with it).
You should be able to have the logging process continuously log new rows into your database, and at the same time have an offloading/extracting process copy older rows to some other location.
You should not, however, copy the database as a file using a standard copy function while it is open, because that will most likely corrupt it (on Windows, it may even be impossible due to strict locking).
Instead, have the offloading process connect to the database using the standard SQLite API (or some scripting language with SQLite support), read rows using that API and create a copy of that data elsewhere, for example in another SQLite database or in a big "real" SQL database like MySQL, Postgres or MSSQL (or in a text file if so inclined).
Use SQLite's online backup API to extract the data without having to lock the DB.
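With the System.Data.SQLite provider, that API is exposed as SQLiteConnection.BackupDatabase; the following is only a sketch, with placeholder file paths:
// Sketch: snapshot a live SQLite log database using the online backup API,
// so the copy is consistent and the writer is not blocked for the whole copy.
// Requires the System.Data.SQLite package; the paths are placeholders.
using System.Data.SQLite;
using (var source = new SQLiteConnection(@"Data Source=C:\logs\service-log.db"))
using (var destination = new SQLiteConnection(@"Data Source=C:\exports\log-snapshot.db"))
{
    source.Open();
    destination.Open();
    // "main" is the default database name on both sides; -1 copies all
    // pages in one pass, and no progress callback is used.
    source.BackupDatabase(destination, "main", "main", -1, null, 0);
}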
You can now use SQLite's Write-Ahead Logging mode to enable a SQLite database to be read and written to concurrently.
There are advantages and disadvantages to using WAL instead of a rollback journal. Advantages include:
WAL is significantly faster in most scenarios.
WAL provides more concurrency as readers do not block writers and a writer does not block readers. Reading and writing can proceed concurrently.
Disk I/O operations tends to be more sequential using WAL.
WAL uses many fewer fsync() operations and is thus less vulnerable to problems on systems where the fsync() system call is broken.
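If it helps, turning WAL on from C# with System.Data.SQLite is a single PRAGMA; a minimal sketch, with a placeholder path:
// Sketch: switch the log database to write-ahead logging so the logging
// writer and an extraction reader can run concurrently. Path is a placeholder.
using System.Data.SQLite;
using (var conn = new SQLiteConnection(@"Data Source=C:\logs\service-log.db"))
{
    conn.Open();
    using (var cmd = conn.CreateCommand())
    {
        // journal_mode is persistent; once set to WAL it stays in effect
        // for later connections until it is changed back.
        cmd.CommandText = "PRAGMA journal_mode=WAL;";
        var mode = (string)cmd.ExecuteScalar();   // returns "wal" on success
    }
}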
Scenario: I want to develop an application. The application should be able to connect to my remote server and download data to the local disk; while downloading it should check for new files, download only the new ones, and create the required (new) folders as it goes.
Problem: I have no idea how to compare the files on the server with the ones on the local disk. How do I download only the new files from the server to the local disk?
What am I thinking?: I want to sync the files on the local machine with the ones on the server. I am planning to use rsync for syncing, but I have no idea how to use it with ASP.NET.
Kindly let me know if my approach is wrong or is there any other better way to accomplish this.
First you can compare the file names, then the file sizes, and when both match, you can compare the hashes of the files.
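A rough sketch of that comparison, assuming both sides are reachable as file paths (with a real remote server you would compare against a listing or hash manifest instead):
// Sketch: decide whether a remote file needs downloading by comparing
// name, then size, then a SHA-256 hash. Paths and layout are placeholders.
using System.IO;
using System.Linq;
using System.Security.Cryptography;
static class FileSync
{
    public static bool NeedsDownload(string relativePath, string remoteRoot, string localRoot)
    {
        string remoteFile = Path.Combine(remoteRoot, relativePath);
        string localFile = Path.Combine(localRoot, relativePath);
        if (!File.Exists(localFile))
            return true;                                   // new file
        if (new FileInfo(localFile).Length != new FileInfo(remoteFile).Length)
            return true;                                   // size differs, content differs
        // Same name and size: compare hashes to be sure.
        using (var sha = SHA256.Create())
        using (var local = File.OpenRead(localFile))
        using (var remote = File.OpenRead(remoteFile))
        {
            return !sha.ComputeHash(local).SequenceEqual(sha.ComputeHash(remote));
        }
    }
}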
I call this kind of a problem a "data mastering" problem. I synchronize our databases with a Fortune 100 company throughout the week and have handled a number of business process issues.
The first rule of handling production data is not to do your users' data entry. They must be responsible for putting any business process into motion which touches production. They must understand the process and have access to logs showing what data was changed, otherwise they cannot handle issues. If you're doing this for them, then you are assuming these responsibilities. They will expect you to fix everything when problems happen, which you cannot feasibly do because IT cannot interpret business data or its relevance. For example, I handle delivery records but had to be taught that a duplicate key indicated a carrier change.
I inherited several mismanaged scenarios where IT simply dumped "newer" data into production without any further concern. Sometimes I get junk data, where I have to manually exclude incoming records from the mastering process because they have invalid negative quantities. Some of my on-hand records are more complete than incoming data, and so I have to skip synchronizing specific columns. When one application's import process simply failed, I had to put an end to complaints by creating a working update script. These are issues you need to think ahead about, because they will encourage you to organize control of each step of the synchronization process.
Synchronization steps:
1. Log what is there before you update
2. Download and compare local vs remote copies for differences; you cannot compare the two without a) having them both in the same physical location or b) controlling the other system
3. Log what you're updating with, and timestamp when you're updating it
4. Save and close the logs
Only when 1-4 are done should you post an update to production
Now as far as organizing a "mastering" process goes, which is what I call comparing the data and producing the lists of what's different, I have more experience to share. For one application, I had to restructure (decentralize) tables and reports before I could reliably compare both sources. This implies a need to understand the business data and know it is in proper form. You don't say if you're comparing PDFs, spreadsheets or images. For data, you must write a separate mastering process for each table (or worksheet), because the mastering process's comparison step may be specially shaped by business needs. Do not write one process which masters everything. Make each process controllable.
Not all information is compared the same way when imported. We get in PO and delivery data and therefore compare tens of thousands of records to determine which data points have changed, but some invoice information is simply imported without any future checks or synchronization. Business needs can even override updates and keep stale data on your end.
Each mastering process's comparer module can then be customized as needed. You'll want specific APIs when comparing file types like PDFs and spreadsheets. I use EPPlus for workbooks. Anything you cannot open has to be binary compared, of course.
A mastering process should not clean or transform the data, especially financial data. Those steps need to occur prior to mastering so that these issues are caught before mastering is begun.
My tools organize the data in 3 tabs -- Creates, Updates and Deletes -- each with DataGridViews showing the relevant records. Then I can log, review and commit changes or hand the responsibility to someone willing.
Mastering process steps:
(Clean / transform data externally)
Load data sources
Compare external to local data
Hydrate datasets indicating Creates, Updates and Deletes
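As a rough illustration of the last step, here is how a comparer might hydrate those three buckets; the Record type, its key and the comparison logic are invented for the example, since in practice each table gets its own comparer shaped by business rules:
// Illustrative only: split incoming records into Creates, Updates and
// Deletes by primary key. The Record type and compare logic are placeholders.
using System.Collections.Generic;
using System.Linq;
class Record
{
    public string Key { get; set; }
    public string Payload { get; set; }
}
static class MasteringComparer
{
    public static (List<Record> Creates, List<Record> Updates, List<Record> Deletes)
        Compare(IEnumerable<Record> external, IEnumerable<Record> local)
    {
        var creates = new List<Record>();
        var updates = new List<Record>();
        var localByKey = local.ToDictionary(r => r.Key);
        var seen = new HashSet<string>();
        foreach (var incoming in external)
        {
            seen.Add(incoming.Key);
            if (!localByKey.TryGetValue(incoming.Key, out var existing))
                creates.Add(incoming);                    // not on our side yet
            else if (existing.Payload != incoming.Payload)
                updates.Add(incoming);                    // present but changed
        }
        // Present locally but missing from the external source.
        var deletes = localByKey.Values.Where(r => !seen.Contains(r.Key)).ToList();
        return (creates, updates, deletes);
    }
}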
Is it a good idea to use MongoDB in .NET desktop application?
Mongo is meant to be run on a server with replication. It isn't really intended as a database for desktop applications (unless they're connecting to a database on a central server). There's a blog post on durability on the MongoDB blog; it's a common question.
When a write occurs and the write command returns, we can not be 100% sure that from that moment in time on, all other processes will see the updated data only.
In every driver, there should be an option to do a "safe" insert or update, which waits for a database response. I don't know which driver you're planning on using (there are a few for .NET; http://github.com/samus/mongodb-csharp is the most officially supported), but if the driver doesn't offer a safe option, you can run the getLastError command to synchronize things manually.
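For what it's worth, with today's official MongoDB.Driver package the equivalent of a "safe" write is an acknowledged write concern rather than a manual getLastError call; a sketch with placeholder names:
// Sketch: an acknowledged insert with the current official .NET driver
// (MongoDB.Driver). Connection string, database and collection names
// are placeholders.
using System;
using MongoDB.Bson;
using MongoDB.Driver;
static class SafeWrites
{
    public static void SafeInsert()
    {
        var client = new MongoClient("mongodb://localhost:27017");
        var database = client.GetDatabase("appdb");
        // WMajority waits for a majority of the replica set to acknowledge the
        // write before InsertOne returns; W1 would wait for the primary only.
        var events = database.GetCollection<BsonDocument>("events")
                             .WithWriteConcern(WriteConcern.WMajority);
        events.InsertOne(new BsonDocument { { "type", "login" }, { "at", DateTime.UtcNow } });
    }
}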
MongoDB won't make sure your data is on the hard drive immediately. As a result, you can lose data that you thought was already written if your server goes down in the period between writing and actual storing to the hard drive.
There is an fsync command, which you can run after every operation if you really want. Again, Mongo goes with the "safety in numbers" philosophy and encourages anyone running in production to have at least one slave for backup.
It depends on what you want to store in a database.
According to Wikipedia:
MongoDB is designed for problems without heavy transactional requirements that aren't easily solved by traditional RDBMSs, including problems which require the database to span many servers.
There is a .NET driver available, and here is some information to help you get started.
But you should first ask yourself: what do you want to store, and what are the further requirements (support for stored procedures, triggers, expected size, etc.)?
I am working on an exe to export SQL data to Access. We do not want to use DTS, as we have multiple clients each exporting different views and the overhead to set up and maintain the DTS packages is too much.
*Edit: This process is automated for many clients every night, so the whole process has to be kicked off and controlled within a cursor in a stored procedure. This is because the data has to be filtered per project for the export.
I have tried many ways to get data out of SQL into Access and the most promising has been using Access interop and running a
doCmd.TransferDatabase(Access.AcDataTransferType.acImport...
I have hit a problem when importing from views: running the import manually, it seems the view does not start returning data fast enough, so Access pops up a MessageBox dialog to say it has timed out.
I think this is happening in interop as well, but because it is hidden the method never returns!
Is there any way for me to prevent this message from popping up, or increasing the timeout of the import command?
My current plan of attack is to flatten the view into a table, then import from that table, then drop the flattened table.
Happy for any suggestions on how to tackle this problem.
Edit:
Further info on what I am doing:
We have multiple clients which each have a standard data model. One of the 'modules' is an Access exporter (sproc). It reads the views to export from a parameter table and then exports. The views are filtered by project, and an Access file is created for each project (every view has a project field).
We are running SQL 2000 and are not moving to SQL 2005 quickly; we will probably jump to 2008 in quite a few months.
We then have a module execution job which executes the configured module on each database. There are many imports/exports/other jobs that run in this module execution, and the Access exporter must be able to fit into this framework. So I need a generic SQL -> Access exporter which can be configured through our parameter framework.
Currently the sproc calls an exe I have written, and my exe opens Access via interop. I know this is bad for a server, BUT the module execution is written so only a single module is executing at a time, so the procedure will never be running more than one instance at a time.
Have you tried using VBA? You have more options configuring connections, and I'm sure I've used a timeout adjustment in that context in the past.
Also, I've generally found it simplest just to query a view directly (as long as you can either connect with a nolock, or tolerate however long it takes to transfer); this might be a good reason to create the intermediate temp table.
There might also be a benefit to opening Access explicitly in single-user mode for this stuff.
We've done this using ADO to connect to both source and destination data. You can set connection and command timeout values as required and read/append to each recordset.
Not particularly quick, but we were able to leave it running overnight.
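In ADO.NET terms the same approach might look like the sketch below, reading the view from SQL Server with a generous CommandTimeout and appending into the Access file over OLE DB; the connection strings, view, table and column names are all placeholders:
// Sketch: copy rows from a SQL Server view into an existing Access table
// with an explicit command timeout, so a slow view cannot trigger the
// interop timeout dialog. All names and connection strings are placeholders.
using System.Data.OleDb;
using System.Data.SqlClient;
static class AccessExport
{
    public static void ExportViewToAccess()
    {
        using (var source = new SqlConnection("Server=.;Database=ClientDb;Integrated Security=true"))
        using (var target = new OleDbConnection(
            @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\exports\Project1.accdb"))
        {
            source.Open();
            target.Open();
            var select = new SqlCommand("SELECT Id, Name FROM dbo.vw_ProjectExport", source);
            select.CommandTimeout = 600;   // seconds; the default is 30 and views can be slow to start
            using (var reader = select.ExecuteReader())
            using (var insert = new OleDbCommand(
                "INSERT INTO ProjectExport (Id, Name) VALUES (?, ?)", target))
            {
                insert.Parameters.Add("p1", OleDbType.Integer);
                insert.Parameters.Add("p2", OleDbType.VarWChar);
                while (reader.Read())
                {
                    insert.Parameters[0].Value = reader.GetInt32(0);
                    insert.Parameters[1].Value = reader.GetString(1);
                    insert.ExecuteNonQuery();   // row-by-row append; not fast, but controllable
                }
            }
        }
    }
}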
I have settled on a way to do this.
http://support.microsoft.com/kb/317114 describes the basic steps to start the access process.
I have made the Process a class variable instead of a local variable of the ShellGetApp method. This way, when I call the Quit function for Access, if it doesn't close for whatever reason I can kill the process explicitly.
app.Quit(Access.AcQuitOption.acQuitSaveAll);
if (!accessProcess.HasExited)
{
    Console.WriteLine("Access did not exit after being asked nicely, killing process manually");
    accessProcess.Kill();
}
I then used a method timeout function to give the Access call a timeout. If it times out I can kill the Access process as well (the timeout could be due to a dialog window popping up, and I do not want the process to hang forever). I got the timeout method here:
Implement C# Generic Timeout
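For reference, a minimal sketch of the same idea using Task.Wait instead of the generic timeout helper; RunAccessImport() is a placeholder for the doCmd.TransferDatabase call shown earlier, this method lives in the same class as the accessProcess field, and the timeout value is arbitrary:
// Sketch: bound the interop call with a timeout, then kill the hidden
// msaccess.exe if it never returns (for example because a dialog popped up).
// RunAccessImport() stands in for the interop import call; accessProcess is
// the class-level Process field described above.
using System;
using System.Threading.Tasks;
void ImportWithTimeout()
{
    Task import = Task.Run(() => RunAccessImport());
    if (!import.Wait(TimeSpan.FromMinutes(10)))
    {
        Console.WriteLine("Import timed out, killing Access process");
        if (!accessProcess.HasExited)
        {
            accessProcess.Kill();
        }
    }
}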
I'm glad you have a solution that works for you. For the benefit of others reading this, I'll mention that SSIS would have been a possible solution to this problem. Note that the difference between SSIS and DTS is pretty much night and day.
It is not difficult to parameterize the export process, such that for each client, you could export a different set of views. You could loop over the lines of a text file containing the view names, or use a query against a configuration database to get the list of views. Other parameters could come from the same configuration database, on a per-client and/or per-view basis.
If necessary, there would also be the option of performing per-client pre- and post-processing, by executing a child process, or package, if such is configured.
I have an importer process which is running as a Windows service (debug mode as an application) and it processes various xml documents and csv's and imports into an SQL database. All has been well until I have had to process a large amount of data (120k rows) from another table (as I do the xml documents).
I am now finding that the SQL server's memory usage is hitting a point where it just hangs. My application never receives a time out from the server and everything just goes STOP.
I am still able to make calls to the database server separately but that application thread is just stuck with no obvious thread in SQL Activity Monitor and no activity in Profiler.
Any ideas on where to begin solving this problem would be greatly appreciated as we have been struggling with it for over a week now.
The basic architecture is C# 2.0 using NHibernate as an ORM; data is being pulled into the actual C# logic and processed, then spat back into the same database, along with logs into other tables.
The only other problem which sometimes happens instead is that, for some reason, a cursor is being opened on this massive table, which I can only assume is being generated from ADO.NET; a statement like exec sp_cursorfetch 180153005,16,113602,100 is being called thousands of times according to Profiler.
When are you COMMITting the data? Are there any locks or deadlocks (sp_who)? If 120,000 rows is considered large, how much RAM is SQL Server using? When the application hangs, is there anything about the point where it hangs (is it an INSERT, a lookup SELECT, or what?)?
It seems to me that that commit size is way too small. Usually in SSIS ETL tasks, I will use a batch size of 100,000 for narrow rows with sources over 1,000,000 in cardinality, but I never go below 10,000 even for very wide rows.
I would not use an ORM for large ETL, unless the transformations are extremely complex with a lot of business rules. Even still, with a large number of relatively simple business transforms, I would consider loading the data into simple staging tables and using T-SQL to do all the inserts, lookups etc.
Are you running this into SQL using BCP? If not, the transaction logs may not be able to keep up with your input. On a test machine, try turning the recovery mode to Simple (minimally logged), or use the BCP methods to get data in (they bypass transaction logging).
Adding on to StingyJack's answer ...
If you're unable to use straight BCP due to processing requirements, have you considered performing the import against a separate SQL Server (separate box), using your tool, then running BCP?
The key to making this work would be keeping the staging machine clean -- that is, no data except the current working set. This should keep the RAM usage down enough to make the imports work, as you're not hitting tables with -- I presume -- millions of records. The end result would be a single view or table in this second database that could be easily BCP'ed over to the real one when all the processing is complete.
The downside is, of course, having another box ... And a much more complicated architecture. And it's all dependent on your schema, and whether or not that sort of thing could be supported easily ...
I've had to do this with some extremely large and complex imports of my own, and it's worked well in the past. Expensive, but effective.
I found out that it was NHibernate creating the cursor on the large table. I am yet to understand why, but in the meantime I have replaced the large-table data access model with straightforward ADO.NET calls.
Since you are rewriting it anyway, you may not be aware that you can call BCP directly from .NET via the System.Data.SqlClient.SqlBulkCopy class. See this article for some interesting performance info.
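A hedged sketch of that approach; the connection string, staging table and column names are placeholders:
// Sketch: bulk-load rows into SQL Server with SqlBulkCopy instead of
// row-by-row ORM inserts. Table and column names are placeholders.
using System.Data;
using System.Data.SqlClient;
static class BulkLoader
{
    public static void BulkLoad(DataTable rows, string connectionString)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.ImportStaging";
            bulk.BatchSize = 10000;       // commit in large batches rather than per row
            bulk.BulkCopyTimeout = 0;     // no timeout for a long-running load
            bulk.ColumnMappings.Add("Id", "Id");
            bulk.ColumnMappings.Add("Payload", "Payload");
            bulk.WriteToServer(rows);     // also accepts an IDataReader for streaming
        }
    }
}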