The program that I am working on saves a snapshot of the current state to an XML file. I would like to store this in a database (as a blob) instead of an XML file.
Firstly, I think XML files are quite space-consuming and redundant, so we would like to compress the string in some way before storing it in the database. In addition, we would also like to introduce simple cryptography so that people won't be able to figure out what it means without at least a simple key/password.
Note that I want to store it in the database as blob, so zipping it and then encrypting the zip file won't do, I guess.
How can I go about doing this?
Compress the XML data with DeflateStream and write its output to a MemoryStream. Then call the .ToArray() method to obtain your blob data. You can do encryption with .NET in a similar way as well (after compression, of course). If you believe Deflate does not save enough space, then try this library: XWRT.
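A minimal sketch of that pipeline, assuming AES for the encryption step (the method name and the key/IV handling are illustrative, not from any particular library):

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;
using System.Text;

static byte[] CompressAndEncrypt(string xml, byte[] key, byte[] iv)
{
    // 1. Deflate the XML into an in-memory buffer
    byte[] compressed;
    using (var buffer = new MemoryStream())
    {
        using (var deflate = new DeflateStream(buffer, CompressionMode.Compress))
        {
            byte[] raw = Encoding.UTF8.GetBytes(xml);
            deflate.Write(raw, 0, raw.Length);
        } // disposing the DeflateStream flushes the remaining bytes
        compressed = buffer.ToArray();
    }

    // 2. Encrypt the compressed bytes with AES (key: 16/24/32 bytes, IV: 16 bytes)
    using (var aes = Aes.Create())
    using (var encryptor = aes.CreateEncryptor(key, iv))
    using (var output = new MemoryStream())
    {
        using (var crypto = new CryptoStream(output, encryptor, CryptoStreamMode.Write))
        {
            crypto.Write(compressed, 0, compressed.Length);
        }
        return output.ToArray(); // store this byte[] as the blob
    }
}
```

Decryption followed by a DeflateStream in Decompress mode reverses the process.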
Firstly, have a look at your serialization mechanism. The whole point of XML is that it's human-readable. If that's no longer an important goal for you, then it might be time to look at other serialization technologies that are better suited to database storage (compressing XML into binary completely defeats the point of it :)
As an alternative format, BSON could be a good choice.
I'm writing a project mostly for fun, just to play around with .NET. I'm building a XAML project (or what you'd call a Windows 8 app) in C#.
In this case I will have a bigger object with some lists of other objects and such. Is there any smart way of saving these objects to disk and loading them later? Something like GetMyOldSavedObjectWithName("MyObject");
From what I've read, local storage is primarily meant for saving smaller things, such as settings. How much data is it acceptable to save there? Can it handle a bigger object, and what are the pros/cons? Should I save the objects to files instead, and if so, how do I do that? Is there any smart way of telling .NET to "save this object to MyObjectName.xml" or something?
In C#, you can use serialization to save your objects as files, commonly in XML format. Other libraries, such as JSON.net, can serialize into JSON.
You could also roll your own saving/loading format, which will probably run faster and store data more compactly, but will take much more time on your part. This can be done with BinaryReader and BinaryWriter.
Take a look at this StackOverflow answer if you wish to go the serialization route.
In most cases the data will be so compact that it will not use much space at all. Based on your comment, that "large" amount of data would really only take a few KB.
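A sketch of the XmlSerializer route, using a hypothetical MyObject type and desktop-style file APIs (a Windows 8 / WinRT app would go through StorageFile and streams from the application data folder instead, but the serializer usage is the same):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type standing in for the "bigger object" from the question
public class MyObject
{
    public string Name { get; set; }
    public List<string> Items { get; set; } = new List<string>();
}

public static class MyObjectStore
{
    // "Save this object to MyObjectName.xml"
    public static void Save(MyObject obj, string path)
    {
        var serializer = new XmlSerializer(typeof(MyObject));
        using (var stream = File.Create(path))
            serializer.Serialize(stream, obj);
    }

    // Load it back later
    public static MyObject Load(string path)
    {
        var serializer = new XmlSerializer(typeof(MyObject));
        using (var stream = File.OpenRead(path))
            return (MyObject)serializer.Deserialize(stream);
    }
}
```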
I need to log raw data from sensors, with features such as creating a new log file every 15 minutes, or creating a new file once the current one reaches a certain size.
I'd like to leverage an existing framework such as log4net, but there doesn't appear to be much out there on how to add a custom logger for binary data, or whether that is even supported. Has anyone done this, or come across an implementation of something similar that matches the needs described in this post?
I should add that we are looking at ~300 GB of data a day here. We are saving this data to allow post-analysis and algorithm tweaking.
You could leverage log4net or any other text-logging tool by taking your byte[] data and converting it to plain text using Convert.ToBase64String. You can convert it back later using Convert.FromBase64String.
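The round trip is just two framework calls (the sample bytes here are made up):

```csharp
using System;

byte[] sample = { 0x01, 0xFF, 0x10, 0x80 };   // pretend sensor bytes

// Binary -> text: the result is safe to hand to any text logger
string line = Convert.ToBase64String(sample);

// Text -> binary: recover the raw bytes during post-analysis
byte[] restored = Convert.FromBase64String(line);
```

One caveat: base64 inflates the data by roughly 33%, which is significant at the ~300 GB/day mentioned in the question.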
.NET has BinaryReader and BinaryWriter classes built in. They do exactly what you'd expect: they deal with raw bytes to/from a file (or any Stream, for that matter). So all you have to do is define a simple file format for yourself, then write and read your data in that format.
You can, of course, convert the binary data to other formats (like strings) and then use any serialization scheme you like (JSON, XML, etc.). But since you're dealing with binary data, converting it to other formats may not be the most elegant solution.
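For example, a minimal size-limited raw logger along those lines; the class name, file naming, and the [ticks][length][payload] record layout are all invented for illustration:

```csharp
using System;
using System.IO;

public class RawLogger : IDisposable
{
    private const long MaxFileBytes = 1L * 1024 * 1024 * 1024; // roll at ~1 GB
    private BinaryWriter _writer;
    private int _fileIndex;

    public void Write(byte[] payload)
    {
        // Start a new file on first write, or when the size limit is hit
        if (_writer == null || _writer.BaseStream.Length >= MaxFileBytes)
            RollFile();

        _writer.Write(DateTime.UtcNow.Ticks); // when the sample arrived
        _writer.Write(payload.Length);        // length prefix for the record
        _writer.Write(payload);               // the raw sensor bytes
    }

    private void RollFile()
    {
        _writer?.Dispose();
        _writer = new BinaryWriter(File.Create($"sensors_{_fileIndex++:D6}.bin"));
    }

    public void Dispose() => _writer?.Dispose();
}
```

A reader is the mirror image: BinaryReader.ReadInt64, ReadInt32, then ReadBytes of the prefixed length. Time-based rollover (every 15 minutes) would just add a timestamp check to the rollover condition.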
I have serialized all the dictionaries in my application to a file. When I open this file, I can see lots of information about my class names, etc., which has been saved with it.
So is this safe? Will everybody be able to just open a saved file created by my application and see what classes I've used? Here is the method I've used to serialize my objects:
Serialization of two Dictionaries at once
What alternatives do I have for saving my application's objects to a file?
Yes, they will be able to see the structure of the serialized object (serializing to a binary file makes it a bit more difficult, but that does not help much, though).
However, anyone can see your source code anyway; just think about .NET Reflector or ildasm. Personally, I wouldn't worry about it; I don't see any problem with this.
You can encrypt the file to hide its contents. To read an encrypted file, you read it into memory, decrypt it, and then pass it to the deserialization formatter.
In my opinion, you shouldn't be afraid of it, but it depends on your needs.
If you decide that it is important to you, I would recommend storing the data in some other place (remote storage).
You have three alternatives for hiding the content:
Encrypt the object, then serialize it (best for local storage and transfer): http://msdn.microsoft.com/en-us/library/as0w18af(v=vs.110).aspx
Serialize the object, then encrypt the file (worse, since you will have to handle deleting the unencrypted file): http://support.microsoft.com/kb/307010
Plain binary serialization: the worst option; it doesn't really hide anything, since you can open the file in a text editor and figure out what's going on.
So, if this is an important concern in your program, I think the first method is the best.
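A sketch of the first option, serializing straight into an encrypting stream so the plaintext never hits the disk (the method name and key/IV handling are illustrative; BinaryFormatter matches the era of the question but is obsolete and unsafe in modern .NET):

```csharp
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using System.Security.Cryptography;

static void SaveEncrypted(object data, string path, byte[] key, byte[] iv)
{
    using (var aes = Aes.Create())
    using (var file = File.Create(path))
    using (var crypto = new CryptoStream(file, aes.CreateEncryptor(key, iv), CryptoStreamMode.Write))
    {
        // The serialized bytes are encrypted as they are written,
        // so there is never an unencrypted copy to delete afterwards
        new BinaryFormatter().Serialize(crypto, data);
    }
}
```

Loading reverses the chain: wrap the FileStream in a CryptoStream created with aes.CreateDecryptor and pass that to the formatter's Deserialize.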
I need to compress a very large xml file to the smallest possible size.
I work in C#, and I would prefer an open-source library or application that I can access through my code, but I can handle an algorithm as well.
Thank you!
It may not be the "smallest size possible", but you could use System.IO.Compression to compress it. Zipping tends to provide very good compression for text.
// Hypothetical file names, filling in the placeholders for illustration
using (var inputStream = File.OpenRead("data.xml"))
using (var fileStream = File.OpenWrite("data.xml.gz"))
using (var zipStream = new GZipStream(fileStream, CompressionMode.Compress))
{
    inputStream.CopyTo(zipStream); // Stream.CopyTo requires .NET 4.0+
}
As other answers have noted, Efficient XML Interchange (EXI) achieves the best available XML compression pretty consistently. Even without schemas, it is not uncommon for EXI to be 2-5 times smaller than zip. With schemas, you'll do even better.
If you're not opposed to a commercial implementation, you can use the .NET version of Efficient XML and call it directly from your C# code using standard .NET APIs. You can download a free trial copy from http://www.agiledelta.com/efx_download.html.
Have a look at XML Compression Tools. You can also compress it using SharpZipLib.
If you have a schema available for the XML file, you could try EXIficient. It is an implementation of the Efficient XML Interchange (EXI) format that is pretty much the best available general-purpose XML compression method. If you don't have a schema, EXI is still better than regular zip (the deflate algorithm, that is), but not very much, especially for large files.
EXIficient is only Java but you can probably make it into an application that you can call. I'm not aware of any open-source implementations of EXI in C#.
File size is not the only advantage of EXI (or any binary scheme). The processing time and memory overhead are also greatly reduced when reading/writing it. Imagine a program that copies floating point numbers to disk by simply copying the bytes. Now imagine another program converts the floating point numbers to formatted text, and pastes them into a text stream, and then feeds that stream through an expensive compression algorithm. Because of this ridiculous overhead, XML is basically unusable for very large files that could have been effortlessly processed with a binary representation.
Binary XML promises to address this longstanding weakness of XML. It would be very easy to make a utility that converts between binary/text representations (without knowing the XML schema), which means you can still edit the files easily when you want to.
XML is highly compressible. You can use DotNetZip to produce compressed zip files from your XML.
If you require the maximum compression level, I would recommend LZMA. There is an SDK (including C#) that is part of the open-source 7-Zip project, available here.
If you are looking for the smallest possible size then try Fast Infoset as binary XML encoding and then compress using BZIP2 or LZMA. You will probably get better results than compressing text XML or using EXI. FastInfoset.NET includes implementations of the Fast Infoset standard and several compression formats to choose from but it's commercial.
I am developing a little app that retrieves an XML file, located on a remote server (http://example.com/myfile.xml)
This file is relatively big, and it contains a big list of geolocations with other information that I need to use for my app.
So I read this file remotely once and insert it into a little SqlCE file (database.sdf)
So if I need to access geolocation #1, I'll just run a SELECT statement against this database instead of loading the whole XML file every time.
But I would like to know if it's possible to do this without using .sdf files?
What is the most efficient way (fastest)?
Saving the big XML file locally once and loading it into a data set every time I start my app? This would make the app take a while to load every time.
Saving the big XML file once locally and reading the nodes one by one to look for geolocation #1?
Or is it possible to retrieve geolocation #1 from the remote XML directly(http://example.com/myfile.xml) without reading the whole file?
Load the big XML file, convert it into an appropriate different data structure, save it to a file in an efficient format. (XML really isn't terribly efficient.)
I believe Marc Gravell's Protocol Buffers implementation works on the Compact Framework...
(None of the protobuf implementations are deemed production-ready yet, but a couple are close. We need testers!)
Re protobuf-net, there isn't a separate download for the CF version at the moment, but there is a csproj in the source for both CF 2.0 and CF 3.5.
To clarify your question: actually, protobuf-net doesn't even use a .proto file (at the moment); a .proto file just describes what the data is. protobuf-net simply looks at your classes and infers the schema from that (similar to how XmlSerializer / DataContractSerializer etc. work). So there is no .proto, just the classes that look like your data.
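For illustration, a minimal protobuf-net sketch along those lines (the GeoLocation class and file names are invented; the [ProtoContract]/[ProtoMember] attributes and the Serializer calls are protobuf-net's actual API):

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf; // protobuf-net

// Hypothetical class shaped like the question's geolocation records;
// protobuf-net infers the wire format from the attributes, no .proto needed
[ProtoContract]
public class GeoLocation
{
    [ProtoMember(1)] public double Latitude { get; set; }
    [ProtoMember(2)] public double Longitude { get; set; }
    [ProtoMember(3)] public string Name { get; set; }
}

public static class GeoStore
{
    // Write the list of locations in a compact binary form
    public static void Save(List<GeoLocation> locations, string path)
    {
        using (var file = File.Create(path))
            Serializer.Serialize(file, locations);
    }

    // Read it back later
    public static List<GeoLocation> Load(string path)
    {
        using (var file = File.OpenRead(path))
            return Serializer.Deserialize<List<GeoLocation>>(file);
    }
}
```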
However, before you embark on creating classes that look like your data, I wonder if you couldn't simply use GZIP or [PK]ZIP to compress the data and transfer it "as is". XML generally compresses very well. Of course, finding a GZIP (etc.) implementation for CF then becomes the issue.
Of course, if you want to use protobuf-net here, I'll happily advise etc if you get issues...
The other option is for your CF app to call into a web-service that has the data locally...
Why would you pull the entire file down to the CE device for this? It's a waste of bandwidth, and doing the lookup on an embedded processor is certainly going to be far slower than on the server, regardless of storage format. You should have a service (Web, WCF, or whatever) that lets you ask for the single geolocation you want.