Extracting data from the properties of an xml file

I am attempting to extract data from an XML file generated by a save function. Here is what the XML looks like once the data has been serialized:
<Data>
<ParentID>00000000-0000-0000-0000-000000000000</ParentID>
<Content>&lt;ContentControl xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"&gt; &lt;Grid&gt;&lt;Image Source=".//Resources/Images/start.png" Tag="Start" ToolTip="Start" IsHitTestVisible="False" /&gt;&lt;/Grid&gt;&lt;/ContentControl&gt; </Content>
</Data>
I can read the data between the <> signs using an XElement object and extract its value using Element("Child").Value, for example the ParentID. What I do not know is how to extract the property (attribute) data from within the Content tags, such as programmatically reading the Tag property of the Image, in this case Tag='Start'.
Can someone please help me resolve this?

If the problem you are running into is that the data in the Content node is a malformed fragment, then this is a way to extract that, fix the malformation and get at the data.
string asReadXml = @"<Data>
<ParentID>00000000-0000-0000-0000-000000000000</ParentID>
<Content>&lt;ContentControl xmlns=""http://schemas.microsoft.com/winfx/2006/xaml/presentation""&gt; &lt;Grid&gt;&lt;Image Source="".//Resources/Images/start.png"" Tag=""Start"" ToolTip=""Start"" IsHitTestVisible=""False"" /&gt;&lt;/Grid&gt;&lt;/ContentControl&gt; </Content>
</Data>";
// Pull out everything between <Content> and </Content> as raw text.
var fragment = Regex.Match(asReadXml, @"(?:\<Content\>)(?<Xml>.+)(?:\</Content\>)", RegexOptions.ExplicitCapture).Groups["Xml"].Value;
// Turn the escaped angle brackets back into real markup.
var validFragment = fragment.Replace("&lt;", "<").Replace("&gt;", ">");
var xDoc = XDocument.Parse("<Root>" + validFragment + "</Root>");
/* XDoc looks like this:
<Root>
<ContentControl xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation">
<Grid>
<Image Source=".//Resources/Images/start.png" Tag="Start" ToolTip="Start" IsHitTestVisible="False" />
</Grid>
</ContentControl>
</Root>
*/
var Image =
xDoc.Root
.Descendants()
.Where (p => p.Name.LocalName == "Image")
.First ();
Console.WriteLine ( Image.Attribute("Tag").Value );
// Outputs
// Start

var data = @"<Data>" +
"<ParentID>00000000-0000-0000-0000-000000000000</ParentID>" +
"<Content>&lt;ContentControl xmlns=\"http://schemas.microsoft.com/winfx/2006/xaml/presentation\"&gt;"+
"&lt;Grid&gt;&lt;Image Source=\".//Resources/Images/start.png\" Tag=\"Start\" ToolTip=\"Start\" IsHitTestVisible=\"False\" /&gt;&lt;/Grid&gt;&lt;/ContentControl&gt;" +
"</Content>" +
"</Data>";
var root = XElement.Parse(data);
var contentValue = root.Element("Content").Value; // unescaped text of <Content>
var contentXml = XElement.Parse(contentValue); // parse that text as XML in its own right
var ns = contentXml.Name.Namespace; // retrieve the namespace
var imageTagValue = contentXml.Element(ns+"Grid").Element(ns+"Image").Attribute("Tag").Value; // "Start"

Assuming that element is an XElement object that represents the <Content> element (you already have a way to get it), you can get the Tag attribute value of the Image element as follows:
XElement element = ....;
var content = XElement.Parse((string)element);
var ns = content.Name.Namespace;
var image = content.Descendants(ns + "Image").FirstOrDefault();
var tag = "";
if(image != null)
{
tag = (string)image.Attribute("Tag");
}
We check whether image is null before looking for its attribute. That way you won't get an exception if there is any <Content> element that doesn't have an <Image> element; the tag variable will simply contain an empty string in that case.
This also handles the case where the <Image> element sits on a different path inside <Content> (not under a <Grid> element).
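For reference, the same approach as a complete program. This is only a sketch: it assumes the markup inside <Content> is stored as escaped text (&lt; and &gt;), and the trimmed sample string and the Program class are illustrative rather than taken from the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Assumed input: the inner markup is escaped text inside <Content>.
        string data =
            "<Data>" +
            "<ParentID>00000000-0000-0000-0000-000000000000</ParentID>" +
            "<Content>&lt;ContentControl xmlns=\"http://schemas.microsoft.com/winfx/2006/xaml/presentation\"&gt;" +
            "&lt;Grid&gt;&lt;Image Tag=\"Start\" ToolTip=\"Start\" /&gt;&lt;/Grid&gt;" +
            "&lt;/ContentControl&gt;</Content>" +
            "</Data>";

        XElement element = XElement.Parse(data).Element("Content");

        // The string cast yields the unescaped text, which is itself XML.
        XElement content = XElement.Parse((string)element);
        XNamespace ns = content.Name.Namespace;

        // Descendants finds <Image> at any depth; FirstOrDefault avoids an exception when none exists.
        XElement image = content.Descendants(ns + "Image").FirstOrDefault();
        string tag = image != null ? (string)image.Attribute("Tag") : "";

        Console.WriteLine(tag); // prints "Start"
    }
}
```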

Personally, I would recommend getting the whole content as a string and then parsing it as HTML using the http://htmlagilitypack.codeplex.com/ library. That way you offload all the parsing to a specialized library.

Related

Retrieve values from XML File

I have Multiple XML Files that look like below
<?xml version="1.0" encoding="UTF-8"?>
<schema>
<sp_transaction_id name="sp_transaction_id" value="1" />
<sp_year name="sp_year" value="2015" />
<sp_first_name name="sp_first_name" value="James" />
<sp_gender name="sp_gender" value="Male" />
<sp_date_of_birth name="sp_date_of_birth" value="06-06-1999" />
</schema>
I think the XML format is key-value pairs.
I want to extract these values and store them in a database (SQL Server 2012) table, with the name (e.g. sp_year) as the column name and the value (e.g. 2015) as the column value, using ASP.NET C#.
I think I can upload the file and read it like this:
string fileName = Path.GetFileName(FileUpload1.PostedFile.FileName);
string filePath = Server.MapPath("~/Uploads/") + fileName;
FileUpload1.SaveAs(filePath);
string xml = File.ReadAllText(filePath);
But that's pretty much it (sorry, I'm a beginner). Please guide me. Thanks

You can use the following code to get the key value pairs
XDocument doc = XDocument.Load(filePath);
var schemaElement = doc.Element("schema");
foreach (var xElement in schemaElement.Elements())
{
Console.WriteLine(xElement.Attribute("name").Value + ":" + xElement.Attribute("value").Value);
}
The Elements method returns all child elements of the schema element.
However, I suggest changing the XML file to this format, if possible:
<?xml version="1.0" encoding="UTF-8"?>
<schema>
<KeyValuePair name="sp_transaction_id" value="1" />
<KeyValuePair name="sp_year" value="2015" />
<KeyValuePair name="sp_first_name" value="James" />
<KeyValuePair name="sp_gender" value="Male" />
<KeyValuePair name="sp_date_of_birth" value="06-06-1999" />
</schema>
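With that layout every child has the same element name, so a single query can collect all the pairs. A minimal sketch, assuming the XML is available as a string (the inline sample and the Program class are illustrative; with a file you would use XDocument.Load(filePath) instead):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Sample data in the suggested KeyValuePair layout.
        string xml =
            "<schema>" +
            "<KeyValuePair name=\"sp_year\" value=\"2015\" />" +
            "<KeyValuePair name=\"sp_first_name\" value=\"James\" />" +
            "</schema>";

        XDocument doc = XDocument.Parse(xml);

        // Map each name attribute to its value attribute.
        Dictionary<string, string> pairs = doc.Root
            .Elements("KeyValuePair")
            .ToDictionary(e => (string)e.Attribute("name"),
                          e => (string)e.Attribute("value"));

        Console.WriteLine(pairs["sp_year"]); // prints "2015"
    }
}
```

ToDictionary throws if two entries share the same name, so this assumes names are unique within one file.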

To read data from an XML file you don't need to upload it. You can pass the path of the file and read from it, for example with the following method:
public static XmlDocument LoadXmlDocument(string xmlPath)
{
// Return null when no usable path was given or the file does not exist.
if (string.IsNullOrEmpty(xmlPath) || !File.Exists(xmlPath))
return null;
// XmlDocument.Load opens, reads and closes the file itself.
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(xmlPath);
return xmlDoc;
}
For reading data from xml you can use
XmlDocument xmlDoc = LoadXmlDocument(xmlPath);
XmlNodeList nodes = xmlDoc.SelectNodes("//*");
foreach (XmlElement node in nodes)
{
.
.
.
}
Inside the foreach loop you can read the values you need, e.g. sp_year.
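As a rough illustration of what the loop body might look like, the sketch below reads the name and value attributes of each element. The inline XML and the Program class are assumptions made to keep the example self-contained. It also uses the XPath /schema/* rather than //*, because //* would include the <schema> root element itself, which has no attributes.

```csharp
using System;
using System.Xml;

class Program
{
    static void Main()
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(
            "<schema>" +
            "<sp_year name=\"sp_year\" value=\"2015\" />" +
            "<sp_gender name=\"sp_gender\" value=\"Male\" />" +
            "</schema>");

        // Select only the direct children of <schema>.
        XmlNodeList nodes = xmlDoc.SelectNodes("/schema/*");
        foreach (XmlElement node in nodes)
        {
            // GetAttribute returns an empty string when the attribute is missing.
            string name = node.GetAttribute("name");
            string value = node.GetAttribute("value");

            if (name == "sp_year")
            {
                Console.WriteLine(value); // prints "2015"
            }
        }
    }
}
```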

The answer below shows how to create an XmlDocument from it.
I would suggest deserializing it into a user-defined class if you know the XML schema is not going to change.
First, create a POCO class that matches the XML file's schema, using the XML serialization attributes.
Second, given the path of the file:
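XmlSerializer serializer = new XmlSerializer(typeof(definedclass));
definedclass xmlObject;
// File.Open needs a FileMode argument; File.OpenRead opens the file read-only.
using (FileStream fs = File.OpenRead(pathtofile))
using (XmlReader reader = XmlReader.Create(fs))
{
xmlObject = (definedclass)serializer.Deserialize(reader);
}
xmlObject is now an instance of your user-defined class, populated with the values from the XML.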
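The POCO itself is not shown above. One possible shape for it, given the sample file in the question, is sketched here. The class names Schema and Field and the property names are hypothetical, only three of the five elements are mapped, and the XML is read from a string to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// One <sp_...> element: both pieces of data live in attributes.
public class Field
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlAttribute("value")]
    public string Value { get; set; }
}

// The <schema> root: each child element maps to one property.
[XmlRoot("schema")]
public class Schema
{
    [XmlElement("sp_transaction_id")]
    public Field TransactionId { get; set; }

    [XmlElement("sp_year")]
    public Field Year { get; set; }

    [XmlElement("sp_first_name")]
    public Field FirstName { get; set; }
}

public class Program
{
    public static void Main()
    {
        string xml =
            "<schema>" +
            "<sp_transaction_id name=\"sp_transaction_id\" value=\"1\" />" +
            "<sp_year name=\"sp_year\" value=\"2015\" />" +
            "<sp_first_name name=\"sp_first_name\" value=\"James\" />" +
            "</schema>";

        var serializer = new XmlSerializer(typeof(Schema));
        using (var reader = new StringReader(xml))
        {
            var schema = (Schema)serializer.Deserialize(reader);
            Console.WriteLine(schema.Year.Value); // prints "2015"
        }
    }
}
```

The trade-off is that every new element in the file needs a matching property, which is why this approach suits a schema that stays fixed.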
Regards,
Rafal

You can load the files into an XDocument and then use LINQ to XML to extract the required information. The example code below loads all name/value pairs into an array of objects of this class:
class MyXMLClass
{
public String FieldName { get; set; }
public String Value { get; set; }
}
The code gets all "schema" descendants (just one, as it is the top-level element), then selects all elements inside it and creates a new class object for each one, extracting the name and value.
XDocument xd = XDocument.Load("test.xml");
MyXMLClass[] xe =
xd.Descendants("schema")
.Elements()
.Select(n => new MyXMLClass {FieldName = n.Attribute("name").Value, Value = n.Attribute("value").Value})
.ToArray();

Get nodes of XDocument

I have this Xml
<Content xmlns="uuid:28a55566-8657-4c56-9c44-">
<Image xlink:type="simple" xlink:href="/images/" xlink:title="albums_4" xmlns:xlink="http://www.w3.org/1999/xlink"></Image>
<Title>Europe</Title>
</Content>
And I want to get each node data. The result should be for image node for example:
<Image xlink:type="simple" xlink:href="/images/" xlink:title="albums_4" xmlns:xlink="http://www.w3.org/1999/xlink"></Image>
and <Title>Europe</Title> for Title node.
My C# code:
XDocument xDoc = XDocument.Parse(Xml);
XNamespace ns = xDoc.Root.GetDefaultNamespace();
var image = xDoc.Descendants(ns + "Image").Single().Value; //it returns ""

The Value property, used that way, returns the text between the Image tags, which is why you got an empty string. To get the XML markup of the Image node, call .ToString() instead:
var image = xDoc.Descendants(ns + "Image").Single().ToString();

Why can I not read XML [duplicate]

Possible Duplicate:
Why XDocument can’t get element out of this wellform XML text?
I'm trying to read an XML document using LINQ to XML, and I guess I'm misunderstanding something.
This is the start of the XML (it's long, so I'm not posting it all):
<?xml version="1.0" encoding="utf-8"?>
<Report xmlns="http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition" xmlns:rd="http://schemas.microsoft.com/SQLServer/reporting/reportdesigner">
<Body>
<ReportItems>
<Tablix Name="Tablix12">
......
......
</Tablix>
This XML could have several "Tablix" elements, just one, or none. For each one I want to read what's inside the tag, and I'm having difficulty getting started.
I have tried a few ways to get the "Tablix" elements, or any other element.
In this code I get a result only for "var root"; the rest are always null (or empty) and I don't understand what I'm doing wrong.
public ReadTablixResponse ReadTablixAdvanced(string rdl)
{
XDocument xml = XDocument.Parse(rdl);
var root = xml.Root;
var Body = xml.Root.Element("Body");
var report = xml.Root.Element("Report");
var aa = xml.Element("Report");
var bb = xml.Element("Body");
var test = xml.Elements("Tablix");

One thing I noticed is that you used the Element("name") method, which only ever returns the first (in document order) direct child element with the specified XName. That is probably why you got null.
If you want to return deeper elements (relative to where you are looking), you need to use the Descendants("name") method, which returns a collection of all descendant elements, no matter how deep they are (relative to your chosen anchor).
For example:
XNamespace xNameSpace = "http://schemas.micro.....";
// ...
var tablixes= xml.Descendants(xNameSpace + "Tablix");
which you can then walk through:
foreach (var tablix in tablixes)
{
var name=(string)tablix.Attribute("Name");
var age=(int)tablix.Element("age");
...
}
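Note that the namespace matters for Element as well as for Descendants: because the Report root declares a default namespace, an unqualified name such as "Body" never matches, even for a direct child. A small sketch showing the difference, using a trimmed, made-up stand-in for the report (the Program class and the shortened XML are illustrative):

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Trimmed stand-in for the report definition in the question.
        string rdl =
            "<Report xmlns=\"http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition\">" +
            "<Body><ReportItems><Tablix Name=\"Tablix12\" /></ReportItems></Body>" +
            "</Report>";

        XDocument xml = XDocument.Parse(rdl);
        XNamespace ns = xml.Root.GetDefaultNamespace();

        // Unqualified name: no match, because <Body> lives in the default namespace.
        Console.WriteLine(xml.Root.Element("Body") == null);      // prints "True"

        // Namespace-qualified name: the direct child is found.
        Console.WriteLine(xml.Root.Element(ns + "Body") != null); // prints "True"

        // Descendants reaches <Tablix> at any depth.
        foreach (XElement tablix in xml.Descendants(ns + "Tablix"))
        {
            Console.WriteLine((string)tablix.Attribute("Name")); // prints "Tablix12"
        }
    }
}
```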

XDocument xDocument = XDocument.Parse(rdl);
XNamespace xNameSpace = "http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition";
var tablixes= from o in xDocument.Descendants(xNameSpace + "Tablix")
select o.Value;

How to query XElement with two namespaces

I'm trying to find the inner text value of an element using LINQ-to-XML (an XElement object). I make my service call and get an XML response back that I've successfully loaded into an XElement object. I want to extract the inner text of one of the elements - however, every time I try to do this, I get a null result.
I feel like I'm missing something super-simple, but I'm fairly new to LINQ-to-XML. Any help is appreciated.
I'm trying to get the inner text value of the StatusInfo/Status element. Here's my XML document that's returned:
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom">
<title type="text">My Response</title>
<id>tag:foo.com,2012:/bar/06468dfc-32f7-4650-b765-608f2b852f22</id>
<author>
<name>My Web Services</name>
</author>
<link rel="self" type="application/atom+xml" href="http://myServer/service.svc/myPath" />
<generator uri="http://myServer" version="1">My Web Services</generator>
<entry>
<id>tag:foo.com,2012:/my-web-services</id>
<title type="text" />
<updated>2012-06-27T14:22:42Z</updated>
<category term="tag:foo.com,2008/my/schemas#system" scheme="tag:foo.com,2008/my/schemas#type" />
<content type="application/vnd.my.webservices+xml">
<StatusInfo xmlns="tag:foo.com,2008:/my/data">
<Status>Available</Status> <!-- I want the inner text -->
</StatusInfo>
</content>
</entry>
</feed>
Here's a snippet of code that I'm using to extract the value (which doesn't work):
XElement root = XElement.Load(responseReader);
XNamespace tag = "tag:foo.com,2008:/my/data";
var status = (from s in root.Elements(tag + "Status")
select s).FirstOrDefault();
My status variable is always null. I've tried several variations on this, but to no avail. The part that's confusing me is the namespace -- tag and 2008 are defined. I don't know if I'm handling this correctly or if there's a better way to deal with this.
Also, I don't have control over the XML schema or the structure of the XML. The service I'm using is out of my control.
Thanks for any help!

Try Descendants() instead of Elements():
XElement x = XElement.Load(responseReader);
XNamespace ns = "tag:foo.com,2008:/my/data";
var status = x.Descendants(ns + "Status").FirstOrDefault().Value;

There are 2 Namespaces in the feed:
the Atom namespace
the tag namespace
The outer xml needs to use the Atom namespace, while a portion of the inner xml needs to use the tag namespace. i.e.,
var doc = XDocument.Load(responseReader);
XNamespace nsAtom = "http://www.w3.org/2005/Atom";
XNamespace nsTag = "tag:foo.com,2008:/my/data";
// get all entry nodes / use the atom namespace
var entry = doc.Root.Elements(nsAtom + "entry");
// get all StatusInfo elements / use the tag namespace
var statusInfo = entry.Descendants(nsTag + "StatusInfo");
// get all Status / use the tag namespace
var status = statusInfo.Elements(nsTag + "Status");
// get value of all Status
var values = status.Select(x => x.Value).ToList();

Direct access and edit to an xml node, using properties

El Padrino showed a solution:
How to change XML Attribute
where an xml element can be loaded directly (no for each..), edited and saved!
My xml is:
<?xml version="1.0" encoding="ISO-8859-8"?>
<g>
<page no="1" href="page1.xml" title="נושא 1">
<row>
<pic pos="1" src="D:\RuthSiteFiles\webSiteGalleryClone\ruthCompPics\C_WebBigPictures\100CANON\IMG_0418.jpg" width="150" height="120">1</pic>
</row>
</page>
</g>
and I need to select a node by two attributes ("no" in the page tag and "pos" in the pic tag).
I've found :
How to access a xml node with attributes and namespace using selectsinglenode()
where direct access is possible, but besides the fact that I don't understand the solution, I think it uses the XPath object, which can't be modified to save changes.
What's the best way to
access directly an xml node (I'm responsible that the node will be unique)
edit that node
save changes to the xml
Thanks
Asaf

You can use the same pattern as the first answer you linked to, but you will need to include the conditions on the attributes in the XPath. Your basic XPath would be g/page/row/pic. Since you want the no attribute of page to be 1, you add [@no='1'] as a predicate on page. So, the full XPath query is something like g/page[@no='1']/row/pic[@pos='1']. SelectSingleNode will return a mutable XmlNode object, so you can modify that object and save the original document to save changes.
Putting the XPath together with El Padrino's answer:
//Here is the variable with which you assign a new value to the attribute
string newValue = string.Empty;
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(xmlFile);
XmlNode node = xmlDoc.SelectSingleNode("g/page[@no='1']/row/pic[@pos='1']");
node.Attributes["src"].Value = newValue;
xmlDoc.Save(xmlFile);
//xmlFile is the path of your file to be modified

Use the new, well-designed XDocument/XElement instead of the old XmlDocument API.
In your example,
XDocument doc = XDocument.Load(filename);
var pages = doc.Root.Elements("page").Where(page => (int?) page.Attribute("no") == 1);
var rows = pages.SelectMany(page => page.Elements("row"));
var pics = rows.SelectMany(row => row.Elements("pic").Where(pic => (int?) pic.Attribute("pos") == 1));
foreach (var pic in pics)
{
// outputs <pic pos="1" src="D:\RuthSiteFiles\webSiteGalleryClone\ruthCompPics\C_WebBigPictures\100CANON\IMG_0418.jpg" width="150" height="120">1</pic>
Console.WriteLine(pic);
// outputs 1
Console.WriteLine(pic.Value);
// Changes the value
pic.Value = "2"; // XElement.Value is a string, so assigning an int would not compile
}
doc.Save(filename);
