Using SQLite for logging - C#

I want to use SQLite as the database backend for WCF service logging. Everything looks good, but how can I extract the database log file from the live system for inspecting/analyzing the logs without running into database locks? The system is under heavy load, and every time I try to grab the log database file it turns out to be locked.

SQLite permits multiple processes/applications to have the same database file open for reading and writing (there is some locking involved when writing, but it is not typical to have big problems with it).
You should be able to have the logging process continuously log new rows into your database, and at the same time have an offloading/extracting process copy older rows to some other location.
You should not, however, copy the database as a file with a standard copy function while it is open, because that will most likely corrupt it (on Windows, it may even be impossible due to strict locking).
Instead, have the offloading process connect to the database using the standard SQLite API (or some scripting language with SQLite support), read rows through that API, and create a copy of that data elsewhere, for example in another SQLite database or in a big "real" SQL database like MySQL, Postgres or MSSQL (or in a text file if so inclined).
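A minimal sketch of such an offloading process using the System.Data.SQLite provider (the Logs table, its columns, and the one-day cutoff are assumptions made up for the example):

using System;
using System.Data.SQLite;

class LogOffloader
{
    static void Main()
    {
        // Hypothetical schema: Logs(Id INTEGER PRIMARY KEY, LoggedAt TEXT, Message TEXT)
        using (var source = new SQLiteConnection("Data Source=service-log.db"))
        using (var archive = new SQLiteConnection("Data Source=log-archive.db"))
        {
            source.Open();
            archive.Open();

            using (var create = new SQLiteCommand(
                "CREATE TABLE IF NOT EXISTS Logs (Id INTEGER PRIMARY KEY, LoggedAt TEXT, Message TEXT)", archive))
            {
                create.ExecuteNonQuery();
            }

            // Read older rows through the normal SQLite API while the service keeps logging.
            using (var select = new SQLiteCommand(
                "SELECT Id, LoggedAt, Message FROM Logs WHERE LoggedAt < @cutoff", source))
            {
                select.Parameters.AddWithValue("@cutoff", DateTime.UtcNow.AddDays(-1).ToString("o"));

                using (var reader = select.ExecuteReader())
                using (var tx = archive.BeginTransaction())
                using (var insert = new SQLiteCommand(
                    "INSERT INTO Logs (Id, LoggedAt, Message) VALUES (@id, @at, @msg)", archive, tx))
                {
                    while (reader.Read())
                    {
                        insert.Parameters.Clear();
                        insert.Parameters.AddWithValue("@id", reader.GetInt64(0));
                        insert.Parameters.AddWithValue("@at", reader.GetString(1));
                        insert.Parameters.AddWithValue("@msg", reader.GetString(2));
                        insert.ExecuteNonQuery();
                    }
                    tx.Commit();
                }
            }
        }
    }
}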

Use SQLite's online backup API to extract the data without having to lock the DB.
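In C#, this API is exposed by the System.Data.SQLite wrapper as SQLiteConnection.BackupDatabase, so a snapshot tool could look roughly like the sketch below (connection strings and file names are placeholders):

using System.Data.SQLite;

// Take a consistent snapshot of the live log database without stopping the logger.
using (var source = new SQLiteConnection("Data Source=service-log.db"))
using (var destination = new SQLiteConnection("Data Source=log-snapshot.db"))
{
    source.Open();
    destination.Open();

    // -1 copies all pages; readers and the logging writer can keep using the source.
    source.BackupDatabase(destination, "main", "main", -1, null, 0);
}

The resulting log-snapshot.db file can then be moved off the machine and inspected at leisure.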

You can now use SQLite's Write-Ahead Logging mode to enable a SQLite database to be read and written to concurrently.
There are advantages and disadvantages to using WAL instead of a rollback journal. Advantages include:
WAL is significantly faster in most scenarios.
WAL provides more concurrency as readers do not block writers and a writer does not block readers. Reading and writing can proceed concurrently.
Disk I/O operations tends to be more sequential using WAL.
WAL uses many fewer fsync() operations and is thus less vulnerable to problems on systems where the fsync() system call is broken.
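Enabling WAL is a single pragma against the database; with System.Data.SQLite it might look like the sketch below (the mode is persistent, so it only needs to be set once per database file):

using System.Data.SQLite;

using (var connection = new SQLiteConnection("Data Source=service-log.db"))
{
    connection.Open();
    using (var command = new SQLiteCommand("PRAGMA journal_mode=WAL;", connection))
    {
        // Returns the journal mode now in effect ("wal" on success).
        var mode = command.ExecuteScalar();
    }
}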


Will this process affect database availability?

http://rockingtechnology.blogspot.co.uk/2011/06/oracle-backup-and-restore-code-in-cnet.html
As per the proposed code in the above article, more specifically:
ProcessStartInfo psi = new ProcessStartInfo();
psi.FileName = "C:/oracle/product/10.2.0/db_1/BIN/exp.exe";
Process process = Process.Start(psi);
process.WaitForExit();
process.Close();
How can I expect the database to be affected, with regard to interruption of CRUD operations from elsewhere, once I call Process.Start(psi) and thereby execute exp.exe?
Using Oracle's exp.exe process - will the sessions of all users currently writing to the db in question be killed, for example? I'd imagine (or at least hope) not, but I haven't been able to find documentation to confirm this.
EXP and IMP are not proper backup and recover tools. They are intended for exchanging data and data structures between Oracle databases. This is also true for their replacement, Data Pump (EXPDP and IMPDP).
Export unloads to a file, so it won't affect any users on the system. However, if you want a consistent set of data, you need to use the CONSISTENT=Y parameter if there are any other users connected to the system.
Interestingly Data Pump does not have a CONSISTENT parameter. It unloads tables (or table partitions) as single transactions but the only way to guarantee consistency across all database objects is to use the FLASHBACK_SCN parameter (or kick all your users off the system).
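If you stick with the ProcessStartInfo approach from the question, CONSISTENT=Y is just another command-line argument to exp.exe; a hedged sketch (the connect string, paths and dump file name are placeholders):

using System.Diagnostics;

ProcessStartInfo psi = new ProcessStartInfo();
psi.FileName = "C:/oracle/product/10.2.0/db_1/BIN/exp.exe";
// CONSISTENT=Y asks exp for a read-consistent export while other sessions keep working.
psi.Arguments = "userid=scott/tiger@orcl file=C:/backups/nightly.dmp full=y consistent=y";
psi.UseShellExecute = false;

using (Process process = Process.Start(psi))
{
    process.WaitForExit();
}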
"It is all in aid of DR."
As a DR solution this will work, with the following provisos.
The users will lose all data since the last export (obvious)
You will need to ensure the export is consistent across all objects
Imports take time. A lot of time if you have many tables or a lot of data. Plus indexes, etc.
Also remember to export the statistics as well as the data.
You're really asking what effects the (old) Oracle export tool (exp) has on the database. It's a logical backup so you can think of the effects generally the same way you would think of running multiple SELECT queries against your database. That is, other sessions don't get killed but normal locking mechanisms may prevent them from accessing data until exp is done with it and this could, potentially, lead to timeouts.
EXP is the original export utility. It is discontinued and not supported in the most recent version (11g).
You can use EXPDP instead, although the export files are written on the server instead of the client machine.
Both utilities issue standard SELECT commands to the database, and since readers don't interfere with concurrency in Oracle (writers don't block readers, readers don't block writers), this will not block your other DB operations.
Since it issues statements, however, it may increase resource usage, especially I/O, which could impact performance for concurrent activity.
Whatever tool you use, you should spend some time learning about the options (also since you may want to use it as a logical copy, make sure you test the respective import tools IMP and IMPDP). Also a word of warning: these tools are not backup tools. You should not rely on them for backup.

Loading on-disk SQLite database into in-memory database and syncing back

I have a SQLite database that has pretty intensive repeated reads and occasional writes. However, the writes (because of indexing) tend to block the reads. I would like to read the on-disk database into an in-memory database and then have a way of syncing back to the on-disk copy when the machine is completely idle for maybe 5-10 seconds. I was briefly tempted to copy the tables from an attached on-disk database to an in-memory database, but it seems there should be a superior way. I also considered transactions which are committed when the machine is idle (but would this block the intensive reads?). The reads include the tables to be updated (or inserted into), but the writes are not time-sensitive.
You should upgrade to SQLite 3.7.0 or later, which includes Write-Ahead Logging. This new method of locking allows reads while writing.
http://www.sqlite.org/draft/wal.html
To copy between an in-memory database and an on-disk database, you can use the backup API but it's not exposed through the .NET wrapper yet.
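Newer releases of System.Data.SQLite do expose the backup API as SQLiteConnection.BackupDatabase; assuming such a version, the on-disk/in-memory round trip could be sketched roughly like this (file names are placeholders):

using System.Data.SQLite;

using (var disk = new SQLiteConnection("Data Source=app.db"))
using (var memory = new SQLiteConnection("Data Source=:memory:"))
{
    disk.Open();
    memory.Open();

    // Load the on-disk database into the in-memory one for fast repeated reads.
    disk.BackupDatabase(memory, "main", "main", -1, null, 0);

    // ... serve the intensive reads (and occasional writes) from the in-memory copy ...

    // When the machine has been idle for a few seconds, sync the in-memory copy back to disk.
    memory.BackupDatabase(disk, "main", "main", -1, null, 0);
}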
Also, by increasing your cache-size you can get the same performance from an on-disk database as an in-memory database--the whole thing can be cached in memory.
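For example, the page cache can be enlarged per connection with a pragma (the size here is an arbitrary illustration):

using System.Data.SQLite;

using (var connection = new SQLiteConnection("Data Source=app.db"))
{
    connection.Open();
    // A negative value sets the cache size in KiB, so this asks for roughly 200 MB of page cache.
    using (var command = new SQLiteCommand("PRAGMA cache_size=-200000;", connection))
    {
        command.ExecuteNonQuery();
    }
}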
Another option is using Oracle's new version of BerkeleyDB, which has a SQLite front end, including the same .NET wrapper, and is a drop-in replacement for the official SQLite releases. They changed the locking mechanism to support page-level locks instead of database-level locks, which greatly improves concurrency and therefore multi-connection performance. I haven't used it myself, but I've read good things.
http://www.oracle.com/technetwork/database/berkeleydb/overview/index.html
If a commercial library is an option, see http://www.devart.com/dotconnect/sqlite/
It comes (among other things) with support for in-memory DBs and has a SQLiteDump component which basically lets you do what you describe... it also comes with ADO.NET DataSet/DataTable support, LINQ, PLINQ, EF etc. and supports the latest SQLite versions...

Store Documents/Video in database or as separate files?

Is it a better practice to store media files (documents, video, images, and eventually executables) in the database itself, or should I just put a link to them in the database and store them as individual files?
Read this white paper by MS Research (To BLOB or Not To BLOB) - it goes into the question in depth.
Executive summary - if you have lots of small (150kb and less) files, you might as well store them in the DB. Of course, this is right for the databases they were testing with and using their test procedures. I suggest reading the article in full to at least gain a good understanding of the trade-offs.
That is an interesting paper that Oded has linked to - if you are using Sql Server 2008 with its FileStream feature the conclusion is similar. I have quoted a couple of salient points from the linked FileStream whitepaper:
"FILESTREAM storage is not appropriate in all cases. Based on prior research and FILESTREAM feature behavior, BLOB data of size 1 MB and larger that will not be accessed through Transact-SQL is best suited to storing as FILESTREAM data."
"Consideration must also be given to the update workload, as any partial update to a FILESTREAM file will generate a complete copy of the file. With a particularly heavy update workload, the performance may be such that FILESTREAM is not appropriate"
Two requirements drive the answer to your question:
Is there more than one application server reading binaries from the database server?
Do you have a database connection that can stream binaries for write and read?
Multiple application servers pulling binaries from one database server really hinders your ability to scale. Consider that database connections usually - necessarily - come from a smaller pool than the application servers' request-servicing pool, and consider the data volume the binaries will consume being sent from database server to application server over the pipe. The database server will likely queue requests because its pool of connections will be consumed delivering binaries.
Streaming is important so that a file is not held completely in server memory on read or write (it looks like @Andrew's answer about SQL Server 2008 FILESTREAM may speak to this). Imagine a file several gigabytes in size: read completely into memory, it would be enough to crash many application servers, which just don't have the physical memory to accommodate it. If you don't have streaming database connections, storing in the database is really not viable, unless you constrain file size such that your application server software is allocated at least as much memory as the max file size * number of request-servicing connections + some additional overhead.
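To make the contrast concrete, a small sketch of streaming a stored file to any output stream (an HTTP response body, another file, etc.) in fixed-size chunks, so the whole file is never held in memory at once; the path and destination are whatever the caller supplies:

using System.IO;

// Copies the file in 80 KB chunks; memory use stays constant regardless of file size.
static void StreamFile(string path, Stream output)
{
    using (var source = File.OpenRead(path))
    {
        source.CopyTo(output, 81920);
    }
}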
Now let's say you don't put the files in the database. Most operating systems are very good at caching frequently accessed files, so right off the bat you get an added benefit. Plus, if you're running web servers, they are pretty good at sending back the right response headers, such as mime type, content length, e-tags, etc., which you otherwise end up coding yourself. The real issues are replication between servers (though most application servers are pretty good at doing this via HTTP, streaming the read and write) and, as another answerer pointed out, keeping the database and file system in sync for backups.
Storing BLOB data in the database is not considered the right way to go unless the files are very small. Instead, storing their path is more appropriate; it will greatly improve database query and retrieval performance.
Here is a detailed comparison I have made:
http://akashkava.com/blog/127/huge-file-storage-in-database-instead-of-file-system/

SQLite & C#: How can I control the number of people editing a db file?

I'm writing a simple customer-information management application with SQLite.
One exe file, one db file, some dll files. - That's it :)
2~4 people may run this exe file simultaneously and access the database.
They will not only read it but also edit it frequently.
Yeahhh, now here comes one of the most famous problems... "Synchronization".
I was trying to create/remove a temporary empty file whenever someone is trying
to edit the database (the file acts as a 'key' for access),
but there must be a better way for it :(
What would be the best way of preventing this problem?
Well, SQLite already locks the database file for each use, the idea being that multiple applications can share the same database.
However, the documentation for SQLite explicitly warns about using this over the network:
SQLite will work over a network filesystem, but because of the latency associated with most network filesystems, performance will not be great. Also, the file locking logic of many network filesystems implementation contains bugs (on both Unix and Windows). If file locking does not work like it should, it might be possible for two or more client programs to modify the same part of the same database at the same time, resulting in database corruption. Because this problem results from bugs in the underlying filesystem implementation, there is nothing SQLite can do to prevent it.
A good rule of thumb is that you should avoid using SQLite in situations where the same database will be accessed simultaneously from many computers over a network filesystem.
So assuming your "2-4 people" are on different computers, using a network file share, I'd recommend that you don't use SQLite. Use a traditional client/server RDBMS instead, which is designed for multiple concurrent connections from multiple hosts.
Your app will still need to consider concurrency issues (unless it speculatively acquires locks on whatever the user is currently looking at, which is generally a nasty idea) but at least you won't have to deal with network file system locking issues as well.
You are looking at some classic problems in dealing with multiple users accessing a database: the Lost Update.
See this tutorial on concurrency:
http://www.brainbell.com/tutors/php/php_mysql/Transactions_and_Concurrency.html
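One common way to handle the lost-update problem at the application level, whatever database you end up on, is optimistic concurrency with a version column; a rough sketch (the Customers table, its Version column, and the method name are made up for the example):

using System.Data.SQLite;

// Optimistic concurrency: the UPDATE only succeeds if nobody else
// changed the row since it was read.
static bool TryUpdateCustomerName(SQLiteConnection connection, long id, string newName, long versionReadEarlier)
{
    using (var command = new SQLiteCommand(
        "UPDATE Customers SET Name = @name, Version = Version + 1 " +
        "WHERE Id = @id AND Version = @version", connection))
    {
        command.Parameters.AddWithValue("@name", newName);
        command.Parameters.AddWithValue("@id", id);
        command.Parameters.AddWithValue("@version", versionReadEarlier);

        // 0 rows affected means another user updated the record first;
        // the caller should reload the row and let the user retry.
        return command.ExecuteNonQuery() == 1;
    }
}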
At least you won't have to worry about the db file itself getting corrupted by this, because SQLite locks the whole file when it's being written. That being said, SQLite doesn't recommend using it if you expect your app to be accessed simultaneously by multiple clients.

MongoDB in desktop application

Is it a good idea to use MongoDB in a .NET desktop application?
Mongo is meant to be run on a server with replication. It isn't really intended as a database for desktop applications (unless they're connecting to a database on a central server). There's a blog post on durability on the MongoDB blog; it's a common question.
When a write occurs and the write command returns, we can not be 100% sure that from that moment in time on, all other processes will see the updated data only.
In every driver, there should be an option to do a "safe" insert or update, which waits for a database response. I don't know which driver you're planning on using (there are a few for .NET; http://github.com/samus/mongodb-csharp is the most officially supported), but if the driver doesn't offer a safe option, you can run the getLastError command to synchronize things manually.
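The driver landscape has changed a lot since this was written; with the current official MongoDB.Driver package, the equivalent of a "safe" write is an acknowledged WriteConcern rather than a manual getLastError call. A rough sketch (database, collection and field names are placeholders):

using MongoDB.Bson;
using MongoDB.Driver;

var client = new MongoClient("mongodb://localhost:27017");
var database = client.GetDatabase("desktopApp");

// An acknowledged write concern makes the insert wait for the server's response,
// which plays the role the old "safe mode" / getLastError check used to.
var collection = database
    .GetCollection<BsonDocument>("customers")
    .WithWriteConcern(WriteConcern.WMajority);

collection.InsertOne(new BsonDocument { { "name", "example" } });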
MongoDB won't make sure your data is on the hard drive immediately. As a result, you can lose data that you thought was already written if your server goes down in the period between writing and actual storing to the hard drive.
There is an fsync command, which you can run after every operation if you really want. Again, Mongo goes with the "safety in numbers" philosophy and encourages anyone running in production to have at least one slave for backup.
It depends on what you want to store in a database.
According to Wikipedia;
MongoDB is designed for problems without heavy transactional requirements that aren't easily solved by traditional RDBMSs, including problems which require the database to span many servers.
There is a .NET driver available, and here is some information to help you get started.
But you should first ask yourself: what do you want to store, and what are the further requirements (support for stored procedures, triggers, expected size, etc.)?
