Saving file system metadata - C#

I want to save all the metadata connected to a file system, but not the "useful" data. The metadata should be available for viewing even when the original files aren't.
I first thought I could accomplish this by serializing, for example, a DirectoryInfo object, but I now understand that the object doesn't actually store the data; it merely stores the path and accesses the file system when its members are called. Serialization would therefore be worthless, since the deserialized object would look for the file instead of "remembering" the metadata.
So: is there some kind of built-in framework class for doing this, or should I just implement it myself?

This is an object hierarchy, so it could get a bit tricky to serialize. You might try creating a simple object to model the data you want to save. You could then use AutoMapper to copy the data over into that DTO-like object and serialize it. This way you could persist the entire tree of data without writing much code.
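A minimal sketch of the DTO idea, under a few assumptions: the names FileSystemEntry and MetadataSnapshot are hypothetical, the values are copied by hand rather than with AutoMapper, and System.Text.Json (.NET Core 3.0 or later) stands in for whichever serializer you prefer. It does not handle access-denied errors or symbolic link loops.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Plain data object: holds copies of the values, not a live handle to the file system.
public class FileSystemEntry
{
    public string Name { get; set; } = "";
    public bool IsDirectory { get; set; }
    public long Length { get; set; }                 // 0 for directories
    public DateTime CreationTimeUtc { get; set; }
    public DateTime LastWriteTimeUtc { get; set; }
    public FileAttributes Attributes { get; set; }
    public List<FileSystemEntry> Children { get; set; } = new List<FileSystemEntry>();
}

public static class MetadataSnapshot
{
    // Recursively copies metadata out of the live FileSystemInfo objects.
    public static FileSystemEntry Capture(FileSystemInfo info)
    {
        var entry = new FileSystemEntry
        {
            Name = info.Name,
            IsDirectory = info is DirectoryInfo,
            Length = info is FileInfo file ? file.Length : 0,
            CreationTimeUtc = info.CreationTimeUtc,
            LastWriteTimeUtc = info.LastWriteTimeUtc,
            Attributes = info.Attributes
        };

        if (info is DirectoryInfo dir)
        {
            foreach (FileSystemInfo child in dir.EnumerateFileSystemInfos())
                entry.Children.Add(Capture(child));
        }
        return entry;
    }

    // Writes the whole tree as JSON; the original files are not needed afterwards.
    public static void Save(FileSystemEntry root, string path) =>
        File.WriteAllText(path, JsonSerializer.Serialize(root));

    public static FileSystemEntry Load(string path) =>
        JsonSerializer.Deserialize<FileSystemEntry>(File.ReadAllText(path));
}
```

Calling MetadataSnapshot.Capture(new DirectoryInfo(somePath)) followed by Save gives a file that can be loaded and browsed later, even after the original directory is gone.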

Related

Store a very complex object with circular references to avoid needing a Singleton in my web app

Currently I generate a singleton object for my site which is created from a flat file. The object is never changed by the application; it is used purely as a reference by some of the functions on my site. It effectively describes the schema for another file type we use, similar to XML and XSD.
The generated singleton is fairly large (5000+ child objects, with up to 500 properties each) and it contains circular references, since child objects can reference their parent object through two-way references.
This all works fine currently, but the first time the app loads it takes over a minute to generate the singleton. That is the reason I am using a singleton here: I don't want to regenerate it on every request. But it also means that every time the app pool restarts, the first request takes well over a minute to load. Every subsequent request is fast once the object is in memory.
Since the flat file rarely changes, I would like to find a good way to generate the object once and store it in a way that lets me retrieve it quickly when needed.
I tried serializing the object to JSON and storing it in the database, but because of the circular references Json.NET either fails or, if I configure it to ignore the references when serializing, I end up losing information.
Are there any better ways of handling such an object, or am I stuck with the singleton for now?

Given the static nature of this object, serialization would be one option.
The circular reference issue you mention with Json.NET can be remedied with appropriate JsonSerializerSettings, for example by setting PreserveReferencesHandling (refer to this answer).
If speed is of the essence, then you may want to investigate other serialization options (NetSerializer claims to be one of the fastest).
Ultimately though, you should look to put the file / object structure into some sort of cache that sits outside of the app pool (Redis perhaps), or even load the flat file's data into a well-designed database schema (i.e. parent-child relationships etc.).
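To illustrate reference preservation, here is a small sketch. It deliberately swaps in the built-in System.Text.Json with ReferenceHandler.Preserve (available from .NET 5) so the example has no package dependency; in Json.NET the corresponding knob is PreserveReferencesHandling.Objects on JsonSerializerSettings. SchemaNode and SchemaStore are hypothetical names, not the asker's types.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class SchemaNode
{
    public string Name { get; set; } = "";
    public SchemaNode Parent { get; set; }            // back-reference: this creates the cycle
    public List<SchemaNode> Children { get; set; } = new List<SchemaNode>();
}

public static class SchemaStore
{
    // Preserve writes $id/$ref markers instead of following the cycle forever.
    private static readonly JsonSerializerOptions Options = new JsonSerializerOptions
    {
        ReferenceHandler = ReferenceHandler.Preserve
    };

    public static string ToJson(SchemaNode root) =>
        JsonSerializer.Serialize(root, Options);

    // The same options must be used when reading, so $ref markers are resolved.
    public static SchemaNode FromJson(string json) =>
        JsonSerializer.Deserialize<SchemaNode>(json, Options);
}
```

After a round trip, a child's Parent should point at the very same deserialized parent instance rather than at a copy, so no information is dropped.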
Creating a massive object graph is rather inefficient and will potentially create memory issues.

Is saving data into a DB considered serialization?

Serialization -> converting an object to a binary representation that can then be written to disk or to a file.
The above is the basic definition of serialization that I know. But what does this really mean? I have a class in my application that I use to get data from the user and store it in a database. Does this mean I am using serialization here? Storing the data is, after all, much like saving the state of the object: I can get the data back and form the same object once again.
Can anyone enlighten me as to what real serialization is? If serialization is not used, what will the result be? What's the difference between saving the data in a file and using serialization to save the data in a file?

I doubt storing data in a database should be considered serialization. Even when you're storing data coming from your object-oriented programming layer, you're actually translating objects into the relational world and vice versa. This is called data mapping.
Perhaps you may argue that performing an INSERT stores data in an interoperable format. Not necessarily: SQL is a domain-specific language for managing relational data, and you don't know how the data is actually stored in memory or on disk. SQL itself isn't a serialization format.
Since most databases live on disk, you could argue that the database engine itself serializes when it persists records to disk so they can be retrieved or altered later, using RAM to optimize reads and writes without loading the entire database into memory.
On the other hand, serialization can be done in binary or non-binary formats. For example, you can serialize an object to JSON, and JSON isn't a binary format. XML has also been used as a serialization format for years, and it's not binary either.
A good definition of serialization may be: an in-memory object is turned into an interoperable representation that can be stored on disk or transmitted over the wire, and later turned back into an in-memory object on any platform and in any language capable of understanding the serialization format.
Examples:
A REST API sending a list of users as data-transfer objects serialized to JSON.
An application lets the user visually edit its configuration and settings. When the UI needs to show the current values, it deserializes the configuration back into objects to bind them to the UI, and once the user presses Save, the configuration gets serialized to disk again.
An application provides its own backup. The backup can be the entire object graph serialized as JSON.

Load config file data automatically into class properties

I want to load values from my configuration file into the properties of specific classes automatically.
I have thought about it for many hours but can't find a good solution on my own.
Create a base class with a default constructor, so that the default constructor can search for property names that also appear in the config file. This only makes sense for entities (which only hold data). If I want to use this approach for normal classes, I cannot inherit from anything else.
Create a factory which fills the properties. Also possible, but I don't want to use the factory every time. This is not automatic enough.
Class attributes? Can I access the object from within the attribute if I use a class attribute?
How do you do it in your applications? Which approach is best for filling properties automatically (or do you know better or other ways)?
Edit
I will try to explain it a little more. I have an application with a lot of configuration data that I store in an XML file: for example camera-specific data, image processing options, which SPS type is used, and so on.
If I want to get this data into the right class, I have to pass it through again and again. I also have to write the same code (assign value to property) over and over.
So I want a solution which does this "magically" by itself.

Maybe you are looking for serialization. From Wikipedia:
In computer science, in the context of data storage, serialization is the process of
translating data structures or object state into a format that can be stored (for
example, in a file or memory buffer, or transmitted across a network connection link) and
reconstructed later in the same or another computer environment.
http://en.wikipedia.org/wiki/Serialization
So, you could just create a configuration class that has various properties for the different configuration values. You then serialize that class to a file ('save the configuration') and deserialize the file ('load the configuration').
This way you do not need to worry about mapping class properties to file properties.
A lot of languages already have built-in helpers to serialize an object. For example:
http://msdn.microsoft.com/en-us/library/4abbf6k0%28v=vs.110%29.aspx
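As a rough sketch of that idea in C#, assuming the configuration lives in an XML file and using the framework's XmlSerializer: AppConfig, its properties, and ConfigFile are hypothetical names chosen for illustration.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical settings class. XmlSerializer needs a public type
// with a public parameterless constructor and public read/write members.
public class AppConfig
{
    public string CameraName { get; set; } = "default";
    public int ExposureMs { get; set; } = 10;
    public bool DenoiseEnabled { get; set; }
}

public static class ConfigFile
{
    // 'Save the configuration': element names are derived from the property names.
    public static void Save(AppConfig config, string path)
    {
        var serializer = new XmlSerializer(typeof(AppConfig));
        using (FileStream stream = File.Create(path))
            serializer.Serialize(stream, config);
    }

    // 'Load the configuration': properties are filled without any hand-written mapping.
    public static AppConfig Load(string path)
    {
        var serializer = new XmlSerializer(typeof(AppConfig));
        using (FileStream stream = File.OpenRead(path))
            return (AppConfig)serializer.Deserialize(stream);
    }
}
```

Each class that needs settings can then receive the loaded AppConfig (or a nested part of it) instead of parsing the file itself.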

C#: Save serialized object to a text field in the DB; how to maintain object format versions in the future?

I have a complex object (nested properties, collections, etc.) in my ASP.NET MVC C# application. I don't need to save it into multiple tables in the DB; serializing the whole object and storing it as a whole is OK.
I plan to serialize the whole object (into something human-readable like JSON/XML) and store it in a text field in the DB.
Later I need to load this object from the DB and render it using a strongly-typed view.
Here comes the question: in the future the class of the object can change (I can add or remove fields, etc.), but serialized versions saved in the DB earlier will not reflect the change.
How do I deal with this?

You should write some sort of conversion utility every time you significantly change structured, serialized messages, and run it as part of an upgrade process. Adding or removing fields that are nullable isn't likely to be a problem, but larger structural changes will be.
You could do something like implement IXmlSerializable, peek at the message and figure out what version the message is and convert it appropriately, but this will quickly become a mess if you have to do this a lot and your application has a long lifecycle. So, you're better off doing it up front in an upgrade process, and outside of your application.
If you are worried about running the conversion on lots of records as part of an upgrade, you could come up with some ways to make it more efficient (for example, add a column to the table that contains the message schema version, so you can efficiently target messages that are out of date).
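A conversion step driven by such a version column might look like the sketch below. Everything specific here is an assumption for illustration: the messages are taken to be JSON, the v1 to v2 change is a made-up rename of "Name" to "Title", and MessageUpgrader is a hypothetical helper using System.Text.Json.Nodes (.NET 6 or later).

```csharp
using System.Text.Json.Nodes;

public static class MessageUpgrader
{
    // Version written alongside every newly stored message.
    public const int CurrentVersion = 2;

    // storedVersion comes from the schema-version column of the row being upgraded.
    public static string Upgrade(string json, int storedVersion)
    {
        JsonObject doc = JsonNode.Parse(json).AsObject();

        // v1 -> v2 (hypothetical change): the field "Name" was renamed to "Title".
        if (storedVersion < 2 && doc.ContainsKey("Name"))
        {
            string title = doc["Name"].GetValue<string>();
            doc.Remove("Name");
            doc["Title"] = title;
        }

        // Further steps (v2 -> v3, ...) would be chained here in order.
        return doc.ToJsonString();
    }
}
```

The upgrade process would select only rows whose version is below CurrentVersion, run Upgrade on each, and write back the new text together with the new version number.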

As long as you're using JSON or XML, added fields shouldn't be a problem (as long as no version-specific schemas are enforced). The default .NET XML serializer, for instance, doesn't include fields that have their default value (which can be set with the System.ComponentModel.DefaultValue attribute). So the new fields will be treated the same as omitted fields while deserializing and get their default values (the class's own defaults, that is; the DefaultValue attribute only applies to serialization/designer behaviour).
How removed fields behave depends on your deserialization implementation, but it can be set up so that they are ignored. Personally I tend to keep the properties but mark them as obsolete, with a message saying what they were once for. That way you'll know not to use them when coding, but they can still be filled for backwards compatibility (and they shouldn't be serialized when marked obsolete). Where possible, you could implement logic in the obsolete property that fills the new data structure.
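The obsolete-property idea could be sketched as follows. This is only an illustration under assumptions: Customer and its fields are hypothetical, and System.Text.Json (.NET 5 or later) is used, where an [Obsolete] member is not skipped automatically, so the sketch keeps it out of the output with a getter that always returns null plus JsonIgnoreCondition.WhenWritingNull.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public class Customer
{
    // New structure.
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // Old field, kept so that records written before the split can still be read.
    [Obsolete("Replaced by FirstName and LastName; kept only to read old records.")]
    [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    public string FullName
    {
        get => null;   // always null, so it is never written back out
        set
        {
            if (string.IsNullOrEmpty(value)) return;
            // Fill the new structure from the old value.
            string[] parts = value.Split(new[] { ' ' }, 2);
            FirstName = parts[0];
            LastName = parts.Length > 1 ? parts[1] : "";
        }
    }
}
```

Deserializing an old record that only contains FullName populates FirstName and LastName, and serializing the object again emits only the new fields.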

Most efficient way to store an object via serialization

This is more of a design question, I'm looking for a good approach:
I have an Object which consists of a few properties (some Integers and a byte[] array).
I'm using BinaryFormatter to serialize my objects - I'm holding a List<T> of all objects at any given time.
When the application starts up, I deserialize the file to which the objects were serialized.
When the application closes, I serialize the whole List<T> and save everything to the file.
My problem is: in case of a system failure, the objects I hold in my List<T> will obviously be lost, since I serialize the List<T> only when the application shuts down normally.
I don't want to deserialize and serialize the whole list each time I insert an object into it, since that would be very expensive.
The solution I thought of is to keep a local database with a BLOB column into which the objects are serialized, but I'm not too sure about this approach.
Any thoughts would be appreciated!!

Only deserialize when the application is started.
You need to serialize every time a new item is added (or use the UnhandledException event) if you want to make sure that no items are lost if the application crashes.
If not, I would use a background thread to serialize the list when new items are added and a serialization from the main thread when the application is exited.
"The solution I thought of is to hold a local database with a BLOB column to which the Objects will be serialized to, but I'm not too sure of this approach."
I don't see any benefits of using a database. It will actually be slower than serializing everything to a file.

I would add a simple ORM on top of, let's say, a SQLite database and avoid standard binary serialization. Personally, I don't like it much, because changes to a type that was serialized earlier (adding or removing properties, changing types, function parameters or their order) will lead to deserialization failures. In other words, it's not scalable, IMO.

My approach would be binlogging:
While the application runs, if you delete an object from the list, just write a notice of that to a file. If you add an object to the list, serialize only that one object and write it out. (Use sequential file names throughout, such as change-00000000001.del, change-00000000002.add, etc.)
When the application shuts down, after the final serialization, delete all change-* files.
On startup, after deserializing the old state, check for change-* files. If any exist, the last run crashed and they have to be replayed; then do a new serialization and delete the change files.
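The file handling for this scheme could be sketched roughly as below. ChangeLog and its method names are hypothetical, the payload format is left to the caller, and the sketch assumes the change files live in a directory that contains no other files matching change-*. It does not force writes to disk, so a power loss could still lose the most recent change.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ChangeLog
{
    // Writes one small file per change. The sequence number is zero-padded to
    // 11 digits so that sorting by name equals sorting by sequence.
    // kind is "add" or "del"; payload is the serialized object or its identifier.
    public static void Record(string dir, long sequence, string kind, byte[] payload)
    {
        string path = Path.Combine(dir, $"change-{sequence:D11}.{kind}");
        File.WriteAllBytes(path, payload);
    }

    // Change files left over from a crash, in the order they must be replayed.
    public static string[] Pending(string dir) =>
        Directory.GetFiles(dir, "change-*")
                 .OrderBy(p => p, StringComparer.Ordinal)
                 .ToArray();

    // Called after a full serialization of the list has succeeded.
    public static void Clear(string dir)
    {
        foreach (string path in Pending(dir))
            File.Delete(path);
    }
}
```

On startup the application would load the last full snapshot, apply each file returned by Pending according to its extension, write a fresh snapshot, and then call Clear.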
