How can I improve the speed of loading xml documents - c#

I need to load two types of XML documents: one has 50 sub-children per child, and the other has the same 50 plus 800 additional ones. Performance is great with the smaller doc and acceptable with the larger doc until the number of children increases: 20k children * 50 sub-children performs well, while 20k children * 850 sub-children is slow. How can I skip looking for the extra descendants when they do not exist? My initial attempts lead me to think I need separate classes, methods, viewmodels, and views for the small and large docs. Below is a condensed look at my code.
public class MyItem
{
    private string layout;
    private string column;
    private string columnSpan;
    private string row;
    private string rowSpan;
    private string background;
    public MyItem(string layout, string column, string columnSpan, string row, string rowSpan, string background)
    {
        Layout = layout;
        Column = column;
        ColumnSpan = columnSpan;
        Row = row;
        RowSpan = rowSpan;
        Background = background;
    }
    public string Layout
    {
        get { return this.layout; }
        set { this.layout = value; }
    }
(Not Shown - Column, ColumnSpan, Row, RowSpan, and Background which are handled the same way as Layout)
For this example, the code below shows only 6 sub-children. I am looking for a way to load XML docs with only the first 2 sub-children, so that I can use whichever load method the small or the large XML doc requires.
internal class myDataSource
{
    // Loads the (MyList) XML file
    public static List<MyItem> Load(string MyListFilename)
    {
        var myfiles = XDocument.Load(MyListFilename).Descendants("item").Select(
            x => new MyItem(
                (string)x.Element("layout"),
                (string)x.Element("column"),
                (string)x.Element("columnSpan"),
                (string)x.Element("row"),
                (string)x.Element("rowSpan"),
                (string)x.Element("background")));
        return myfiles.ToList();
    }
}
public class MainViewModel : ViewModelBase
{
    public void LoadMyList()
    {
        this.myfiles = new ObservableCollection<MyItemViewModel>();
        List<MyItem> mybaseList = myDataSource.Load(MyListFilename);
        foreach (MyItem myitem in mybaseList)
        {
            this.myfiles.Add(new MyItemViewModel(myitem));
        }
        this.mycollectionView = (ICollectionView)CollectionViewSource.GetDefaultView(myfiles);
        if (this.mycollectionView == null)
            throw new NullReferenceException("mycollectionView");
    }
}
public class MyItemViewModel : ViewModelBase
{
    private Models.MyItem myitem;
    public MyItemViewModel(MyItem myitem)
    {
        if (myitem == null)
            throw new NullReferenceException("myitem");
        this.myitem = myitem;
    }
    public string Layout
    {
        get
        {
            return this.myitem.Layout;
        }
        set
        {
            this.myitem.Layout = value;
            OnPropertyChanged("Layout");
        }
    }
(Not Shown - Column, ColumnSpan, Row, RowSpan, and Background which are handled the same way as Layout)

Instead of using Descendants, can you follow the direct path (i.e. use Elements)? That's the only way you'll keep from scanning nodes you know don't have items.
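A minimal sketch of that idea, assuming the file has a single root element whose direct children are the item elements (the question does not show the real layout, so adjust the path). DirectPathLoader and LoadLayoutAndColumn are hypothetical names used only for illustration. Descendants("item") visits every node under the document, including all the sub-children of each item, while Elements("item") looks only at the direct children of the root.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DirectPathLoader
{
    // Assumption: <item> elements are direct children of the root element.
    // Only the first two sub-children (layout, column) are read here.
    public static List<string[]> LoadLayoutAndColumn(XDocument doc)
    {
        return doc.Root
            .Elements("item")   // direct children only, no deep scan
            .Select(x => new[]
            {
                (string)x.Element("layout"),
                (string)x.Element("column")
            })
            .ToList();
    }
}
```

In the question's Load method this would correspond to replacing XDocument.Load(...).Descendants("item") with XDocument.Load(...).Root.Elements("item"), if the assumed layout holds.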

I think one thing you can do is drop the ToList() after the Select and keep the query lazy, returning an IEnumerable<MyItem> (which is what Select returns). I don't have a Windows box to test this on right now, but when you run the foreach you will then iterate over the items only once instead of twice.

XDocument is handy, but if the problem is simply that the files are large and you only have to scan once through, XmlReader might be the better choice. It doesn't read the entire file, it reads one node at a time. You can manually skip through parts you are not interested in.
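A rough sketch of the XmlReader approach, assuming each record is an element named item whose wanted sub-children contain plain text. ItemReader, ReadItems, and the wanted parameter are hypothetical names, not part of the question's code. The reader moves forward one node at a time, reads the sub-children it was asked for, and calls Skip() on every other sub-child so its contents are never materialized.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ItemReader
{
    // Streams <item> elements and keeps only the sub-children named in 'wanted'.
    public static List<Dictionary<string, string>> ReadItems(TextReader input, ISet<string> wanted)
    {
        var result = new List<Dictionary<string, string>>();
        using (var reader = XmlReader.Create(input))
        {
            while (reader.ReadToFollowing("item"))
            {
                var fields = new Dictionary<string, string>();
                if (!reader.IsEmptyElement)
                {
                    int itemDepth = reader.Depth;
                    reader.Read(); // step inside <item>
                    // Loop until the matching </item> is reached.
                    while (!(reader.NodeType == XmlNodeType.EndElement && reader.Depth == itemDepth))
                    {
                        if (reader.NodeType != XmlNodeType.Element)
                        {
                            reader.Read(); // whitespace, comments, etc.
                        }
                        else if (wanted.Contains(reader.Name))
                        {
                            // Reads the text and advances past the end tag.
                            fields[reader.Name] = reader.ReadElementContentAsString();
                        }
                        else
                        {
                            reader.Skip(); // jump over the whole unwanted sub-child
                        }
                    }
                }
                result.Add(fields);
            }
        }
        return result;
    }
}
```

For a file on disk, a StreamReader over the file could be passed in as the input. The dictionary is only a stand-in for the question's MyItem class.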

Related

How can show my list and make it searchable? C#

OK, I'm very new to this, and this is for a school project.
The project is to write a program where a person can store, update, and search information.
In my program I make lists that store clothing information (brand, type, color, size). I think my information gets stored, but I don't know how to access it or write a search function for it.
Is my code correct? Should I use another strategy?
This is where my list is defined (I think):
public class klädDATALIST
{
    public string märke;   // brand
    public string typ;     // type
    public string färg;    // color
    public string storlek; // size
    public klädDATALIST(string _märke, string _typ, string _färg, string _storlek)
    {
        this.märke = _märke;
        this.typ = _typ;
        this.färg = _färg;
        this.storlek = _storlek;
    }
}
This is where the string variables are filled through a couple of ReadLine() calls.
For example:
string _färg = Console.ReadLine().ToUpper();
Then, after I've saved it, I make a new list, I think:
List<klädDATALIST> newklädDataList = new List<klädDATALIST>();
newklädDataList.Add(new klädDATALIST(_märke, _typ, _färg, _storlek));
I hope you can help me, thank you!

Elements can be accessed by iterating through the collection/List.
foreach (var item in newklädDataList)
{
    // Access or read item members.
    Console.WriteLine(item.märke);
}
When you want to find an element in the List, you can either use LINQ
var item = newklädDataList.FirstOrDefault(e => e.märke == "searchstring"); // Any key to identify the list item.
if (item != null)
{
    Console.WriteLine(item.märke);
}
Or use Find
var item = newklädDataList.Find(e => e.märke == "searchstring");
Hope this helps!
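FirstOrDefault and Find both return only the first match. For a search function that should list every matching garment, List<T>.FindAll is one option. The sketch below is an illustration under assumptions: Garment is a hypothetical English-named stand-in for klädDATALIST, and the search compares the term against all four fields while ignoring case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for klädDATALIST (brand, type, color, size).
public class Garment
{
    public string Brand;
    public string Type;
    public string Color;
    public string Size;

    public Garment(string brand, string type, string color, string size)
    {
        Brand = brand;
        Type = type;
        Color = color;
        Size = size;
    }
}

public static class GarmentSearch
{
    // Returns every garment with at least one field equal to 'term', ignoring case.
    public static List<Garment> Search(List<Garment> garments, string term)
    {
        return garments.FindAll(g =>
            string.Equals(g.Brand, term, StringComparison.OrdinalIgnoreCase) ||
            string.Equals(g.Type, term, StringComparison.OrdinalIgnoreCase) ||
            string.Equals(g.Color, term, StringComparison.OrdinalIgnoreCase) ||
            string.Equals(g.Size, term, StringComparison.OrdinalIgnoreCase));
    }
}
```

The result can be printed with the same foreach loop shown above. Note that the list has to be created once and kept around; creating a new list for every garment leaves each list with a single entry.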

dynamic content control mapping for MS word c#

I am using code like this:
public void BindControlsToCustomXmlPart()
{
    wordApp = (Word.Application)System.Runtime.InteropServices.Marshal.GetActiveObject("Word.Application");
    foreach (Word.ContentControl contentControl in wordApp.ActiveDocument.ContentControls)
    {
        if (contentControl.Tag == "FieldName")
        {
            string xPathFieldName = "ns:records/ns:record/ns:FieldName";
            contentControl.XMLMapping.SetMapping(xPathFieldName,
                prefix, currentWordDocumentXMLPart);
        }
    }
}
What ends up happening is that for every new field I want to add, I have to repeat this redundant code:
if (contentControl.Tag == "FieldName2")
{
    string xPathFieldName2 = "ns:records/ns:record/ns:FieldName2";
    contentControl.XMLMapping.SetMapping(xPathFieldName2,
        prefix, currentWordDocumentXMLPart);
}
Is there a way to write this code once and have the "FieldName" portion updated for each field dynamically? That is, some type of loop that steps through each XML node in an XML file: in this case it would map the XML node FieldName to the content control with a tag of FieldName, then map the XML node FieldName2 to the content control with a tag of FieldName2, and so on.

A good start would be to move the mapping into a function and reuse that function, as follows:
public void BindControlsOperation(Word.ContentControl control, string pFieldName)
{
    if (control.Tag == pFieldName)
    {
        string xPathFieldName = String.Format("ns:records/ns:record/ns:{0}", pFieldName);
        control.XMLMapping.SetMapping(xPathFieldName, prefix, currentWordDocumentXMLPart);
    }
}
You could then use it in the following fashion:
foreach (Word.ContentControl contentControl in wordApp.ActiveDocument.ContentControls)
{
    BindControlsOperation(contentControl, "FieldName");
}
The next step would be to keep a list of the field names you want to use and feed each one to the function in a loop:
....
List<string> names = new List<string> { "x", "y", "z" };
foreach (Word.ContentControl contentControl in wordApp.ActiveDocument.ContentControls)
{
    foreach (string name in names)
    {
        BindControlsOperation(contentControl, name);
    }
}
Hope this helps

Need to populate several labels from XML file, is there a faster way?

I have an XML file that I load and break down into an IEnumerable, then I put each element into a label on a WinForm. So far I have the following code, which works:
public void PopulateGameBoard()
{
    XDocument gameFiles = XDocument.Parse(Properties.Resources.Jeopardy);
    IEnumerable<string> categories =
        from category in gameFiles.Descendants("category")
        select (string)category.Attribute("name");
    string first = categories.ElementAt(0);
    cat1HeaderLabel.Text = first;
    string second = categories.ElementAt(1);
    cat2HeaderLabel.Text = second;
    string third = categories.ElementAt(2);
    cat3Label.Text = third;
    string fourth = categories.ElementAt(3);
    cat4Label.Text = fourth;
    string fifth = categories.ElementAt(4);
    cat5Label.Text = fifth;
}
The final product is a Jeopardy game board where the categories and questions are pulled from an XML file.
This is the first of 5 rows that I will need to do this with (5 lists going into 5 rows). I am wondering if there is a better way to code this, so that I don't end up with 25 statements assigning a variable from ElementAt() and then 25 assignments of that variable.

Here I create the labels dynamically and assign values to them. This code was written by hand, so there is no guarantee it will compile; make any necessary changes.
public void PopulateGameBoard()
{
    XDocument gameFiles = XDocument.Parse(Properties.Resources.Jeopardy);
    IEnumerable<string> categories =
        from category in gameFiles.Descendants("category")
        select (string)category.Attribute("name");
    var headerLabels = new List<Label>();
    foreach (string name in categories)
    {
        // The position is arbitrary; adjust it to your layout.
        var label = new Label { Text = name, AutoSize = true, Left = 10 + headerLabels.Count * 150, Top = 10 };
        headerLabels.Add(label);
        // Assumes this method lives in the Form class.
        this.Controls.Add(label);
    }
}

Efficient algorithm for converting an array of depths into a Tree

I have a stored procedure that returns a flat list of names that are organized in a Tree. To communicate who is the parent of whom there is a Depth value, so a result of 5 records (going up to 3 levels) looks like this:
Depth|Name
----------
0|Ford
1|Compact Cars
2|Pinto
1|Trucks
2|H-Series
I am trying to construct a Tree out of this array by reading the depth values. Is there some obvious algorithm for constructing a tree out of a sequence of data like this? I'm adding the C# tag because I'm open to LINQy solutions to this problem though a generic Computer Science answer would be extremely helpful.
Here is my current attempt:
class Record
{
    public string Name { get; set; }
    public List<Record> children { get; set; }
}
var previousLevel = 0;
var records = new List<Record>();
foreach (var thing in TreeFactory.fetch(dao))
{
    if (thing.Depth == 0) {
        // Root node
    } else if (thing.Depth > previousLevel) {
        // A child of the last added node
    } else if (thing.Depth < previousLevel) {
        // A cousin of the last added node
    } else {
        // A sibling of the last added node
    }
    previousLevel = thing.Depth;
}
By "efficient" I'm talking about list sizes up to 200,000 elements and trees that extend up to 100 levels, so really I'm just looking for something that is easier to reason about.

Recursion is unneeded here. I believe the fastest way would be this:
public static TreeView TreeFromArray(Item[] arr)
{
    var tv = new TreeView();
    // parents[d] is the collection that receives nodes of depth d.
    // One extra slot, because an item of depth d writes to parents[d + 1].
    var parents = new TreeNodeCollection[arr.Length + 1];
    parents[0] = tv.Nodes;
    foreach (var item in arr)
    {
        parents[item.Depth + 1] = parents[item.Depth].Add(item.Name).Nodes;
    }
    return tv;
}
Item is anything that has the Depth and Name information:
public class Item
{
    public int Depth;
    public string Name;
}
When I used my own implementation of TreeNode, to simplify the procedure and strip out unneeded functionality that slows the whole thing down, and altered the method a little to suit those changes, I came up with this:
Classes:
public class Node
{
    public string Name;
    public List<Node> Childs = new List<Node>();
}
public class Item
{
    public int Depth;
    public string Name;
}
Implementation:
public static Node TreeFromArray(Item[] arr)
{
    var tree = new Node();
    // parents[d] is the node that receives children of depth d.
    // One extra slot, because an item of depth d writes to parents[d + 1].
    var parents = new Node[arr.Length + 1];
    parents[0] = tree;
    foreach (var item in arr)
    {
        var curr = parents[item.Depth + 1] = new Node { Name = item.Name };
        parents[item.Depth].Childs.Add(curr);
    }
    return tree;
}
Results:
With the given data: 1,000,000 times in 900 milliseconds

public void AddNode(Tree tree, Node nodeToAdd, int depth)
{
    // You might need to add a special case to handle adding the root node.
    Node iterator = tree.RootNode;
    for (int i = 0; i < depth; i++)
    {
        iterator = iterator.GetLastChild(); // I assume this method won't exist, but you'll know what to put here.
    }
    iterator.AddChild(nodeToAdd);
}
It's kinda pseudocode-y. It doesn't add error handling, and I pretend methods exist for sections of code I imagine you could figure out on your own.

This array looks like a left-to-right "flattening" of an original tree structure. If it is safe to assume that, then the method is simple:
For each element in the array:
    If the depth of the element is less than or equal to the depth of the "current node",
        traverse upwards to the parent until current depth = element depth - 1
    Create a child node of the current node
    Traverse to that node as the new "current" node

Ingredients:
A class that allows for child nodes.
A stack of this class type.
An array or list of this class type.
An int to store the current depth.
A local to hold our current item of interest.
Method.
Take your first item from the list. Put it in the stack. Store 0 as the current depth.
Take each remaining item from the list in turn. Look at its depth. If the depth is equal to or less than the current depth, pop (currentdepth - depth + 1) items off the stack, and then peek at the stack to get the new current item. Make our new item a child of the "current item", push it so that it becomes the current item, and make its depth the current depth.
Bake in a compiler at 200°C or Gas-mark 6 for 300 milliseconds, or until golden brown.
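The recipe above can be sketched in C# roughly as follows. DepthNode, DepthTree, and Build are hypothetical names, and the input is assumed to be a sequence of (Depth, Name) pairs in the order the stored procedure returns them. Because the stack always holds the path from the root to the current item, popping (currentdepth - depth + 1) items is the same as popping until the stack holds exactly depth items. Each node is pushed and popped at most once, so the whole pass is linear in the number of rows.

```csharp
using System;
using System.Collections.Generic;

public sealed class DepthNode
{
    public string Name;
    public List<DepthNode> Children = new List<DepthNode>();
}

public static class DepthTree
{
    // Builds a forest from rows listed in depth-first order.
    // Returns the depth-0 nodes; the sample data has exactly one.
    public static List<DepthNode> Build(IEnumerable<(int Depth, string Name)> rows)
    {
        var roots = new List<DepthNode>();
        var path = new Stack<DepthNode>(); // root ... current item

        foreach (var (depth, name) in rows)
        {
            // A row may go at most one level deeper than the previous row.
            if (depth < 0 || depth > path.Count)
                throw new ArgumentException("Row '" + name + "' skips a level.");

            // Climb back up until the top of the stack is the new row's parent.
            while (path.Count > depth)
                path.Pop();

            var node = new DepthNode { Name = name };
            if (path.Count == 0)
                roots.Add(node);
            else
                path.Peek().Children.Add(node);

            path.Push(node); // the new row becomes the current item
        }
        return roots;
    }
}
```

With the five sample rows, Build returns one root (Ford) with two children (Compact Cars and Trucks), each of which has one child.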

XML data file not opening and working properly

I developed a WPF application that uses an XML file as its database. Yesterday, the program stopped working. After some checking, I saw that there was a problem with the Transaction.xml file. I tried opening it in IE, but got this error:
The XML page cannot be displayed
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.
An invalid character was found in text content. Error processing resource 'file:///C:/RegisterMaintenance/Transaction.xml
Then I tried opening the file in Notepad, and it showed weird characters (screenshot below).
Toward the end, the file displays the correct XML structure. Please tell me what has gone wrong and why the XML is not showing correctly. How can I get it back to a normal state? I am really worried, as this is my only data file. Any help or suggestion would be great.
This is one of the methods that edits this file; there are other, similar code files that use Transaction.xml.
public string Add()
{
    XDocument doc1 = XDocument.Load(@"Ledgers.xml");
    XElement elem = (from r in doc1.Descendants("Ledger")
                     where r.Element("Name").Value == this.Buyer
                     select r).First();
    this.TinNo = (string)elem.Element("TinNo");
    this.PhoneNo = (string)elem.Element("PhoneNo");
    this.CommissionAmount = (this.CommissionRate * this.Amount) / 100;
    this.CommissionAmount = Math.Round((decimal)this.CommissionAmount);
    this.VatAmount = (this.CommissionAmount + this.Amount) * this.VatRate / 100;
    this.VatAmount = Math.Round((decimal)this.VatAmount);
    this.InvoiceAmount = this.Amount + this.CommissionAmount + this.VatAmount;
    XDocument doc2 = XDocument.Load(@"Transactions.xml");
    var record = from r in doc2.Descendants("Transaction")
                 where (int)r.Element("Serial") == Serial
                 select r;
    foreach (XElement r in record)
    {
        r.Element("Invoice").Add(new XElement("InvoiceNo", this.InvoiceNo), new XElement("InvoiceDate", this.InvoiceDate),
            new XElement("TinNo", this.TinNo), new XElement("PhoneNo", this.PhoneNo), new XElement("TruckNo", this.TruckNo), new XElement("Source", this.Source),
            new XElement("Destination", this.Destination), new XElement("InvoiceAmount", this.InvoiceAmount),
            new XElement("CommissionRate", this.CommissionRate), new XElement("CommissionAmount", this.CommissionAmount),
            new XElement("VatRate", this.VatRate), new XElement("VatAmount", this.VatAmount));
    }
    doc2.Save(@"Transactions.xml");
    return "Invoice Created Successfully";
}

C# is an object-oriented programming (OOP) language; perhaps you should use some objects! How can you possibly test your code for accuracy?
You should separate out responsibilities. An example:
public class Vat
{
    XElement self;
    public Vat(XElement parent)
    {
        self = parent.Element("Vat");
        if (null == self)
        {
            parent.Add(self = new XElement("Vat"));
            // Initialize values
            Amount = 0;
            Rate = 0;
        }
    }
    public XElement Element { get { return self; } }
    public decimal Amount
    {
        get { return (decimal)self.Attribute("Amount"); }
        set
        {
            XAttribute a = self.Attribute("Amount");
            if (null == a)
                self.Add(new XAttribute("Amount", value));
            else
                a.SetValue(value); // culture-independent, unlike value.ToString()
        }
    }
    public decimal Rate
    {
        get { return (decimal)self.Attribute("Rate"); }
        set
        {
            XAttribute a = self.Attribute("Rate");
            if (null == a)
                self.Add(new XAttribute("Rate", value));
            else
                a.SetValue(value); // culture-independent, unlike value.ToString()
        }
    }
}
All the Vat data will be in one node, and all the accessing of it will be in one testable class.
Your foreach above would then look more like:
foreach (XElement r in record)
{
    XElement invoice = r.Element("Invoice");
    ...
    Vat vat = new Vat(invoice);
    vat.Amount = this.VatAmount;
    vat.Rate = this.VatRate;
}
That is readable! At a glance, from your code, I cannot even tell if invoice is the parent of Vat, but I can now!
Note: This isn't to say your code is at fault; it could be a hard-drive error, as that is what it looks like to me. But if you want people to peruse your code, make it readable and testable! Years from now, if you or someone else has to change your code and it isn't readable, it is useless.
Perhaps from this incident you learned two things:
1. Readability and testability.
2. Backups! (All my valuable XML files are in SVN (TortoiseSVN), so I can compare what has changed as well as keep good backups. The SVN repository is backed up to online storage.)
An ideal next step is to take the code in the property setters and refactor it into a static extension method that is both testable and reproducible:
public static class XAttributeExtensions
{
    public static XAttribute SetAttribute(this XElement self, string name, object value)
    {
        // Test for correct arguments
        if (null == self)
            throw new ArgumentNullException("self", "XElement to SetAttribute method cannot be null!");
        if (string.IsNullOrEmpty(name))
            throw new ArgumentException("Attribute name cannot be null or empty to SetAttribute method!", "name");
        if (null == value) // how to handle?
            value = ""; // or throw an exception like one of the above
        // Now to the good stuff
        XAttribute a = self.Attribute(name);
        if (null == a)
            self.Add(a = new XAttribute(name, value));
        else
            a.SetValue(value); // formats the value the same way the constructor does
        return a;
    }
}
That is easily testable, very readable and the best is it can be used over and over again getting the same results!
Example, the Amount property can be greatly simplified with:
public decimal Amount
{
    get { return (decimal)self.Attribute("Amount"); }
    set { self.SetAttribute("Amount", value); }
}
I know this is a lot of boiler-plate code, but I find it readable, extendable and best of all test-able. If I want to add another value to Vat, I can just modify the class and not have to worry about have I added it in the right place. If Vat had children, I'd make another class that Vat had a property for.
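For comparison, LINQ to XML already ships XElement.SetAttributeValue, which adds the attribute when it is missing and updates it when it exists. One difference from the SetAttribute extension above: passing null removes the attribute instead of storing an empty string. The sketch below uses a hypothetical VatNode wrapper (not the Vat class from this answer) to show the property built on that call.

```csharp
using System.Xml.Linq;

// Hypothetical, trimmed-down wrapper around a <Vat> element.
public class VatNode
{
    private readonly XElement self;

    public VatNode(XElement element)
    {
        self = element;
    }

    public decimal Amount
    {
        // The explicit conversion throws if the attribute is missing.
        get { return (decimal)self.Attribute("Amount"); }
        // Adds or updates the attribute, formatted independently of the current culture.
        set { self.SetAttributeValue("Amount", value); }
    }
}
```

Whether the custom extension or the built-in method fits better depends on how a null value should be handled.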

The .xml file is clearly malformed. No browser or other program that reads XML files will be able to do anything with it. It doesn't matter that the XML becomes correct again after some lines.
So the error is almost certainly in whatever creates and/or edits your XML file. You should have a look there. Maybe the encoding is wrong. The most commonly used encoding is UTF-8.
Also, as a side note, XML is not really the best format for large databases (too much overhead), so switching to a binary format would be best. Even switching to JSON would bring a benefit.
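To find where the file goes wrong, one option is to let the parser report the position of the first error. This is a minimal sketch; XmlCheck and FindFirstError are hypothetical names. XmlException carries the LineNumber and LinePosition at which parsing failed, which helps narrow down which write produced the bad bytes.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlCheck
{
    // Returns null when the XML parses, otherwise a message with the error position.
    public static string FindFirstError(TextReader input)
    {
        try
        {
            XDocument.Load(input);
            return null;
        }
        catch (XmlException ex)
        {
            return "Line " + ex.LineNumber + ", position " + ex.LinePosition + ": " + ex.Message;
        }
    }
}
```

For the file in the question, a StreamReader opened on Transaction.xml could be passed in. The check only reports the first problem, so it may need to be repeated after each repair.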
