Fastest Way of reading XML - c#

I have an XML file in which I store data about a list of persons and another one in which I store a list of objects like this.
people.xml
<People>
  <Person>
    <Name>itsName</Name>
    <Age> itsAge </Age>
    <RecentAcquisitions>
      <Acquisition>
        <name>Apple</name>
        <quantity>5</quantity>
      </Acquisition>
    </RecentAcquisitions>
  </Person>
</People>
objects.xml
<Objects>
  <Object>
    <Name>Apple</Name>
    <Description>Fresh Apple</Description>
    <Price>10</Price>
    <etc>..lots of attributes..</etc>
  </Object>
</Objects>
What is the most efficient way of extracting information from objects.xml based on the person's Acquisition list at runtime? (In the example, the person should have 5 objects of type "Apple".)
Currently I use a solution which consists of storing each object from objects.xml in a list, and when I'm loading a person I search for the respective object based on Acquisition->Name and add it to person.AquisitionList.
Is there another way of doing this?
Maybe I misunderstood the XML role but it feels wrong to store the information from an XML file in a list or array at runtime.

To my knowledge, keeping the data in runtime memory instead of doing constant read/write operations is the best way to do it, so what you're doing is the right way.
XML can be seen as two things:
1 - A way to store information, much like a database, until it needs to be retrieved for processing at runtime.
This is what you are doing now: you store the objects list on disk using XML, and then you retrieve it for processing (load it into memory) at runtime.
2 - A standardized way of passing information around, regardless of technology.
XML can be read in a multitude of languages, and any language that can read a string can technically read and extract the data from an XML document.
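The in-memory approach can be made cheap to query by indexing the objects by name once, instead of scanning a list for every acquisition. A minimal sketch, assuming the element names from the question's two files; the `Catalog` class and its method names are hypothetical and not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class Catalog
{
    // Index every <Object> by its <Name> so each lookup is a dictionary hit.
    // Assumes every object has a unique, non-missing <Name>; ToDictionary throws otherwise.
    public static Dictionary<string, XElement> LoadObjects(XDocument objectsDoc)
    {
        return objectsDoc.Root
            .Elements("Object")
            .ToDictionary(o => (string)o.Element("Name"), o => o);
    }

    // Pair each <Acquisition> of one <Person> with its catalog entry and quantity.
    // Acquisitions whose name is not in the catalog are skipped.
    public static List<KeyValuePair<XElement, int>> Resolve(
        XElement person, Dictionary<string, XElement> catalog)
    {
        var result = new List<KeyValuePair<XElement, int>>();
        foreach (XElement acquisition in person.Descendants("Acquisition"))
        {
            string name = (string)acquisition.Element("name");
            XElement definition;
            if (name != null && catalog.TryGetValue(name, out definition))
            {
                // The explicit cast parses the element text as an integer
                // and throws if <quantity> is missing or not numeric.
                int quantity = (int)acquisition.Element("quantity");
                result.Add(new KeyValuePair<XElement, int>(definition, quantity));
            }
        }
        return result;
    }
}
```

The catalog would typically be built once at startup (for example from `XDocument.Load` on objects.xml) and reused for every person that is loaded.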

Related

Deserialization with options

I am currently serializing a collection to a file. The results are as I expect:
<Persons>
  <Person>
    <Identity>1234</Identity>
    <Name>asd</Name>
  </Person>
  <Person>
    <Identity>12345</Identity>
    <Name>asdd</Name>
  </Person>
</Persons>
Now, I don't want to deserialize the whole collection but I want to deserialize an object from the file with some specific options. For example,
object GetPersonWithIdentity(int identity)
{
    // what to do here
}
object asd = GetPersonWithIdentity(1234);
// expected Person with Identity "1234" and Name "asd"
Is it reasonable to deserialize the whole collection, find the specific object and return it, or is there another solution for this?
XML is not seekable so you at least have to read forward till you find the first match. The framework does not support that automatically so you have to do it manually using an XmlReader which is laborious.
If the file is small and/or performance is not an issue, just deserialize everything and be done with it.
If your dataset is big I'd consider moving to some more scalable format like an embedded SQL database. SQL databases have this capability inherently.
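For reference, the manual forward-only read could look roughly like the sketch below. It combines `XmlReader` with `XNode.ReadFrom` so that only one `Person` element is materialized at a time and reading stops at the first match. The `PersonLookup` class and `FindPerson` method are hypothetical names, and the element names are taken from the question's sample file:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class PersonLookup
{
    // Scans the input forward and returns the first <Person> whose <Identity>
    // matches, or null if there is none. Only one Person is held in memory at a time.
    public static XElement FindPerson(TextReader input, int identity)
    {
        string wanted = identity.ToString();
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Person")
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed here.
                    var person = (XElement)XNode.ReadFrom(reader);
                    if ((string)person.Element("Identity") == wanted)
                    {
                        return person;
                    }
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return null;
    }
}
```

The returned `XElement` can then be mapped to a `Person` object by hand or passed to a serializer. The whole file still has to be read when the match is near the end or absent.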
You will have to read the XML file from the start because, as usr mentioned, XML is forward-only. XmlReader and XmlWriter work on top of a stream or a TextReader/TextWriter and do file I/O to read or write an XML file.
The reason for my separate answer is that I would do the following using an XDocument object:
object GetPersonWithIdentity(int identity)
{
    // myXDocumentVariable is an XDocument loaded beforehand, e.g. with XDocument.Load
    return myXDocumentVariable.Descendants("Person")
        .First(person => person.Element("Identity").Value == identity.ToString());
}

Serialize only a part of xml file and save it

I have a problem with respect to XML Serialization. I shall try to explain it with the following example xml file
<AutoExpo>
  <Details>
    <Venue>XYZ</Venue>
    <StartTime>09:00</StartTime>
    <EndTime>21:00</EndTime>
  </Details>
  <Cars>
    <Car>
      <Company>Chevrolet</Company>
      <Model>Cruz</Model>
      <Color>Red</Color>
    </Car>
    <Car>
      <Company>Ford</Company>
      <Model>Fiesta</Model>
      <Color>Blue</Color>
    </Car>
  </Cars>
</AutoExpo>
Now, when I read this XML file, I deserialize the cars into objects. The car list can be huge. My code uses these objects and can change the properties of some cars. What if I want to serialize only those car objects whose properties have changed back to the XML file and save it, so that the next time my code starts it gets the latest state information?
It would be quite difficult to jump around in the XML file changing properties here and there, wherever they have changed. You should just read the whole file into memory, and when you save, write out the whole thing, overwriting the old file.
XML isn't a terrible way of doing this, but as far as I can tell from the question, a SQL Server (or other RDBMS) database would be much more appropriate. You won't have to worry about issues like this, as the DB engine will do that for you.
Although it may not be the best solution, a potentially viable option would be to serialize the edited list to a separate file and, in code, compare the two files. If there haven't been any changes to the information, the two text files should be identical. If not, you can replace the old file with the new file. Rather than serializing to a file and reading it back, it may be easier to serialize to a stream and compare that with the existing file.
When you serialize an object, it generates the entire XML document. So, if you save that to a file, it will overwrite the previous content of the file. Therefore, if you want the resulting file to contain all the cars, including, but not limited to, the modified ones, then you need to serialize the whole thing. If you only serialized the ones that changed, the file would lose all the cars that did not change. If you really do only want to serialize the changed cars, I would suggest creating a new instance of the AutoExpo object and only insert into it the cars that you want to save, then serialize that object with only the partial list.
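The partial-list idea could be sketched as follows. The class shapes are assumptions inferred from the sample XML, and the `IsChanged` flag is a hypothetical way of tracking modified cars; the original post does not say how changes are detected:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

public class Details
{
    public string Venue { get; set; }
    public string StartTime { get; set; }
    public string EndTime { get; set; }
}

public class Car
{
    public string Company { get; set; }
    public string Model { get; set; }
    public string Color { get; set; }

    // Hypothetical change-tracking flag; excluded from the XML output.
    [XmlIgnore]
    public bool IsChanged { get; set; }
}

public class AutoExpo
{
    public Details Details { get; set; }
    public List<Car> Cars { get; set; }

    public AutoExpo()
    {
        Cars = new List<Car>();
    }
}

public static class PartialSave
{
    // Builds a second AutoExpo holding only the changed cars and serializes that.
    // The result is a complete XML document that simply omits unchanged cars.
    public static string SerializeChanged(AutoExpo full)
    {
        var partial = new AutoExpo
        {
            Details = full.Details,
            Cars = full.Cars.Where(c => c.IsChanged).ToList()
        };
        var serializer = new XmlSerializer(typeof(AutoExpo));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, partial);
            return writer.ToString();
        }
    }
}
```

Writing this output over the original file would drop the unchanged cars, so it is only useful as a separate "changes" file that is merged back later.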
If you need to just modify a single element in the XML without touching the rest of it because the data is too big, XML is not a good choice. I would suggest a relational database instead. Alternatively, you could store each car as its own XML file and only load and save each one individually as necessary.
You cannot do that with XML. Consider using a relational database. Relational databases have a built-in file space management mechanism allowing doing exactly what you need. You can update single records, add and delete records.
A Jet .mdb database (Access) is a good candidate for replacing an XML file. You can access it via OLEDB, with the restriction that the application must be compiled for 32 bit. Access does not need to be installed.
First of all, your entities must have unique identifiers.
<AutoExpo>
  <Details>
    <Venue>XYZ</Venue>
    <StartTime>09:00</StartTime>
    <EndTime>21:00</EndTime>
  </Details>
  <Cars>
    <Car id="1">
      <Company>Chevrolet</Company>
      <Model>Cruz</Model>
      <Color>Red</Color>
    </Car>
    <Car id="2">
      <Company>Ford</Company>
      <Model>Fiesta</Model>
      <Color>Blue</Color>
    </Car>
  </Cars>
</AutoExpo>
Now you could use XPath to select the nodes that require updates and change their content:
load the document into an XDocument
find a car: document.XPathSelectElement("//Car[@id='2']") (an extension method from System.Xml.XPath)
set the new value: element.Element("Color").Value = "Black"
However, the downside of file-based storage remains. You still have to load the whole file into memory and write it back to the hard drive when you're done updating, but you do not have to serialize all Car objects.
I can't think of an easy way to stream the file from hard drive and manipulate it in one go.
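Put together, the load, select and update steps might look like the sketch below. `CarEditor` and `SetColor` are hypothetical names, and the id is assumed not to contain quote characters because it is concatenated into the XPath expression:

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class CarEditor
{
    // Finds the <Car> with the given id attribute and replaces its <Color>.
    // Returns false when no such car exists.
    // Assumes carId contains no quote characters (it is embedded in the XPath).
    public static bool SetColor(XDocument document, string carId, string newColor)
    {
        XElement car = document.XPathSelectElement("//Car[@id='" + carId + "']");
        if (car == null)
        {
            return false;
        }
        // SetElementValue updates the existing <Color> child or adds one if absent.
        car.SetElementValue("Color", newColor);
        return true;
    }
}
```

A caller would load the file with `XDocument.Load(path)`, call `SetColor`, and write everything back with `document.Save(path)`, so the whole document still passes through memory.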

Nesting XML and using DataSets etc

I have a nested XML document and I am looking at the feasibility of using DataSets to parse it.
<?xml version="1.0" standalone="yes"?>
<Workbench>
  <Overrides>
    <Override name='firstoverride' value='overridevalue'/>
  </Overrides>
  <DataSets>
    <BASIC>
      <MEMBNO>1</MEMBNO>
      <PERSONNO>0</PERSONNO>
    </BASIC>
  </DataSets>
</Workbench>
What I want to be able to do is essentially access the contents of Overrides and DataSets as if they were an actual DataSet.
So to validate, I check that the root element is Workbench.
Then I check to see if there are any overrides; I then want to be able to iterate over the Override items.
Following that, and this is the hard part, I want to support arbitrary but well-formed XML that will be inserted into a database, but the parsing code can make no assumptions about the data, as I want it to be generic.
I can do this if I make DataSets the root element and iterate over it, but it doesn't seem to work if nested?
Help!

How do I use LINQ to parse XML with child elements of the same name?

Background Information:
In the past, I have been picking up a collection of XML files and iterating through each XML file, parsing it, passing string data to a data transfer object and passing the object into a database.
Before, my XML looked like this.
<messages>
  <message>
    <title>Red Wall</title>
    <summary>This is a good article</summary>
    <ISBN>13546846545464</ISBN>
  </message>
</messages>
Here, I only have one element. So, I would parse the XML by using LINQ and retrieve the subsequent elements(title, summary, isbn). Then I would initialize/instantiate an object, assign its properties to the values I retrieved, and send it along.
Now my XML looks like this:
<messages>
  <message>
    <title>Red Wall</title>
    <summary>This is a good article</summary>
    <ISBN>13546846545464</ISBN>
  </message>
  <message>
    <title>Blue Wall</title>
    <summary>This is not a good article</summary>
    <ISBN>15648465416</ISBN>
  </message>
</messages>
I now have two (or more) elements in my XML file, and for each one I need to 1) identify that there are multiple elements and 2) for each create a separate DTO to hold the data that I parse.
My question is: How do I parse XML with multiple tags and identify each one I encounter as being separate from the other?
Final Note: While parsing, I need to be able to instantiate a DTO to capture the information I get returned back.
Thanks for helping!
Just grab the element you want and use your select to populate the DTO from the child items. Something like this (not tested):
XElement ele = loaded.Element("messages");
var dtos = from item in ele.Descendants("message")
           select new DTO() { title = item.Element("title").Value, ... };
The select statement above is going to return an IEnumerable<DTO>, which is a sequence of DTO objects. For every message node it finds in the XML, it will create a DTO object and add it to the sequence returned. If your goal is just to iterate over all the DTOs, you're already there. If you actually need a List<DTO>, there is a constructor on the generic List object that takes an IEnumerable<T>, so you could pass in the "dtos" you received from the select statement and have a List.
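A self-contained version of that query, including the conversion to a `List<T>`, might look like this. `MessageDto` and `MessageParser` are hypothetical names standing in for the poster's own DTO type:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class MessageDto
{
    public string Title { get; set; }
    public string Summary { get; set; }
    public string Isbn { get; set; }
}

public static class MessageParser
{
    // Creates one DTO per <message> element under the root <messages> element.
    public static List<MessageDto> Parse(XDocument loaded)
    {
        IEnumerable<MessageDto> dtos =
            from item in loaded.Root.Elements("message")
            select new MessageDto
            {
                // The string cast yields null instead of throwing when a child is missing.
                Title = (string)item.Element("title"),
                Summary = (string)item.Element("summary"),
                Isbn = (string)item.Element("ISBN")
            };

        // The query is lazy; the List<T> constructor enumerates it once.
        return new List<MessageDto>(dtos);
    }
}
```

Each `message` element yields its own DTO, so a file with one message and a file with many are handled by the same code.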

Class Design related clarification c#

I have a design related question. So please guide me on how to do this?
Right now, I have a xml structure which will be in this format.
<Sheet id="abc">
  <Elements>
    <Element id="xyz">
      <value>23</value>
    </Element>
    <Element id="sdz">
      <value>46</value>
    </Element>
    ...
  </Elements>
</Sheet>
So, we have a class like this to store each & every element of the sheet.
class Sheet
{
    public string SheetId { get; set; }

    // all Elements will be stored in the below collection
    public IList<IDictionary<string, string>> Elements { get; set; }
}
But now for few sheets the format has been changed to the below structure.
<abc> <!-- SheetId -->
  <Elements>
    <Record>
      <Element1/>
      <Element2/>
      <Element3/>
      <Element4/>
    </Record>
    <Record>
      <Element1/>
      <Element2/>
      <Element3/>
      <Element4/>
    </Record>
    ...
  </Elements>
</abc>
So, we need to create a Generic class to hold the above xml formats and we don't want to have different object to store these.
I mean in future if we have some other xml format also we need to accommodate the same without any change in the Sheet class.
So, can you please advise me on how to design the Sheet Class.?
OK, let me explain how my app works.
We have around 200 sheets (in other words, measures).
1) The user uploads the sheet data in XML format (all sheets in one XML file) and edits it if they want, or enters the data on the screen (a dynamic screen generated from the XML template) if they don't want to upload.
2) The data is then stored in the Sheet object, goes through a lot of validation, and is finally converted to XML again and stored in the DB.
You may ask why we want to store this as XML. The reason is that we don't want to create 200 aspx pages for the same kind of data, which is why we generate the sheet pages dynamically from the XML templates. Also, we will be adding, updating or deleting sheets frequently.
I think you now have some idea about this issue.
To be more clear, each element in my XML file will be displayed as a field on the aspx page. It may be a textbox, a dropdown, etc.
I would recommend designing your class based on what the information actually represents and how your software plans to utilize the data, not the XML format being used.
You should always be able to do the transposition from the data format into the structure which best represents how your program will use this data.
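One way to read this advice is to keep a single `Sheet` class that models what the program needs (an id plus a list of records) and to put the format knowledge into separate reader functions. The sketch below is only an illustration under that assumption: `Records`, `SheetReader` and both method names are hypothetical, and treating the old format as a single record is a guess about its meaning:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Sheet
{
    public string SheetId { get; set; }

    // One dictionary per record: field name -> field value.
    public IList<IDictionary<string, string>> Records { get; set; }
}

public static class SheetReader
{
    // Old format: <Sheet id="abc"><Elements><Element id="xyz"><value>23</value></Element>...
    // Assumption: all elements of the sheet form one record keyed by element id.
    public static Sheet FromElementFormat(XElement root)
    {
        IDictionary<string, string> record = root
            .Element("Elements")
            .Elements("Element")
            .ToDictionary(e => (string)e.Attribute("id"), e => (string)e.Element("value"));

        return new Sheet
        {
            SheetId = (string)root.Attribute("id"),
            Records = new List<IDictionary<string, string>> { record }
        };
    }

    // New format: <abc><Elements><Record><Element1/>...</Record>...</Elements></abc>
    // The root element name is the sheet id; each <Record> becomes one dictionary.
    public static Sheet FromRecordFormat(XElement root)
    {
        List<IDictionary<string, string>> records = root
            .Element("Elements")
            .Elements("Record")
            .Select(r => (IDictionary<string, string>)r.Elements()
                .ToDictionary(e => e.Name.LocalName, e => e.Value))
            .ToList();

        return new Sheet { SheetId = root.Name.LocalName, Records = records };
    }
}
```

With this split, a future XML format would need a new reader method but no change to `Sheet` or to the validation code that consumes it.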
