How to get embed value and file name in a directory - c#

I have a docx file and want to generate a list of picture filenames/unique ids combinations.
Here is the relevant piece of the docx file:
<w:drawing>
<wp:inline distT="0" distB="0" distL="0" distR="0" wp14:anchorId="2C4CE07B" wp14:editId="12367BBF">
...
...
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:nvPicPr>
<pic:cNvPr id="2" name="ProfileGraph.png" />
<pic:cNvPicPr />
</pic:nvPicPr>
<pic:blipFill>
<a:blip r:embed="rId9">
<a:extLst>
so I need rId9 and ProfileGraph.png in one directory entry.
I can find the rId9:
var blipElements = from drawing in drawingElements
where drawing.Descendants<A.Blip>().Count() > 0
select drawing.Descendants<A.Blip>().First();
But I don't know how to get the cNvPr-elements belonging to each of the Blips in blipElements.
I was thinking along the lines of
var names = from blip in blipElements
where blip.Ancestors<Picture>().First<Picture>().Descendants<....>()
Any help would be appreciated.

How about something like
var body = doc.MainDocumentPart.Document.Body;
var pics = body.Descendants<DocumentFormat.OpenXml.Drawing.Pictures.Picture>();
var result = pics.Select(p => new
{
Id = p.BlipFill.Blip.Embed.Value,
Name = p.NonVisualPictureProperties.NonVisualDrawingProperties.Name.Value
});
Where doc is assumed to be an already opened WordprocessingDocument object.
The result variable will be an IEnumerable of an anonymous type containing Id and Name properties.
I'm not particularly knowledgeable about the word-processing OpenXML stuff but, in theory, the Embed and Name properties could be null, so you might have to test for null before accessing the '.Value' property.
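If you would rather not depend on the typed SDK properties, the same pairing can be sketched with plain LINQ to XML over the raw markup. This is only a sketch under assumptions: it swaps the Open XML SDK classes for System.Xml.Linq, PictureIndex and Build are made-up names, and it assumes you have already loaded word/document.xml into an XDocument (for example from doc.MainDocumentPart.GetStream()). Pictures that lack either value are skipped.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class PictureIndex
{
    // Standard Office Open XML namespaces for a:, pic: and r: prefixes.
    static readonly XNamespace A = "http://schemas.openxmlformats.org/drawingml/2006/main";
    static readonly XNamespace Pic = "http://schemas.openxmlformats.org/drawingml/2006/picture";
    static readonly XNamespace R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    // Maps each relationship id (a:blip/@r:embed) to the picture name (pic:cNvPr/@name).
    public static Dictionary<string, string> Build(XContainer documentXml)
    {
        var result = new Dictionary<string, string>();
        foreach (var pic in documentXml.Descendants(Pic + "pic"))
        {
            var id = (string)pic.Descendants(A + "blip").FirstOrDefault()?.Attribute(R + "embed");
            var name = (string)pic.Descendants(Pic + "cNvPr").FirstOrDefault()?.Attribute("name");

            // Skip incomplete pictures; the indexer keeps the last name if an id repeats.
            if (id != null && name != null)
                result[id] = name;
        }
        return result;
    }
}
```

For the fragment in the question, Build would yield one entry with key rId9 and value ProfileGraph.png.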

Related

C# XML LINQ searching

I have a XML file like this:
<SoftwareComponent schemaVersion="1.0" packageID="Y75WC" releaseID="Y75WC" hashMD5="a190fdfa292276288df38507ea551a3b" path="FOLDER04650736M/1/OptiPlex_3050_1.7.9.exe" dateTime="2017-12-05T05:34:30+00:00" releaseDate="décembre 05, 2017" vendorVersion="1.7.9" dellVersion="1.7.9" packageType="LWXP" identifier="532f5a9e-c087-4499-b40c-cf7921ee06d3" rebootRequired="true">
<Name>
<Display lang="en"><![CDATA[Dell OptiPlex 3050 System BIOS,1.7.9]]></Display>
</Name>
<ComponentType value="BIOS">
<Display lang="en"><![CDATA[BIOS]]></Display>
</ComponentType>
<Description>
<Display lang="en"><![CDATA[This package provides the Dell System BIOS update and is supported on Dell OptiPlex 3050 Tower, OptiPlex 3050 Small Form Factor and OptiPlex 3050 Micro for Windows Operation System.]]></Display>
</Description>
<LUCategory value="NONE">
<Display lang="en"><![CDATA[None]]></Display>
</LUCategory>
<Category value="BI">
<Display lang="en"><![CDATA[FlashBIOS Updates]]></Display>
</Category>
<SupportedDevices>
<Device componentID="159" embedded="0">
<Display lang="en"><![CDATA[OptiPlex 3050 System BIOS]]></Display>
</Device>
</SupportedDevices>
<SupportedSystems>
<Brand key="1" prefix="OP">
<Display lang="en"><![CDATA[Optiplex]]></Display>
<Model systemID="07A3">
<Display lang="en"><![CDATA[3050]]></Display>
</Model>
</Brand>
</SupportedSystems>
<ImportantInfo URL="http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=Y75WC" />
<Criticality value="2">
<Display lang="en"><![CDATA[Urgent-Dell highly recommends applying this update as soon as possible. The update contains changes to improve the reliability and availability of your Dell system.]]></Display>
</Criticality>
There are multiple SoftwareComponent elements inside.
I tried to get some attributes of SoftwareComponent (dellVersion, hashMD5) based on descendant elements (ComponentType value, SupportedSystems->Device->Display value, Criticality value), but none of my attempts worked.
Here is my current code; it can only list the values for every component in the XML file:
XDocument doc = XDocument.Load("catalog.xml");
var els = from el in doc.Root.Elements("SoftwareComponent")
select new
{
dellVersion = (string)el.Attribute("dellVersion"),
hashMD5 = (string)el.Attribute("hashMD5"),
path = (string)el.Attribute("path"),
};
foreach (var el in els)
{
Console.WriteLine("dell BIOS: {0}, MD5: {1}, path: {2}", el.dellVersion, el.hashMD5, el.path);
}
Can somebody show me how to proceed, please?
Thanks
First of all, your XML document is missing a </SoftwareComponent> end tag. Maybe it got lost when you copied the contents here.
Then, SoftwareComponent is actually the root in your document, so you would need code like:
XDocument doc = XDocument.Load("catalog.xml");
var el = new
{
dellVersion = (string)doc.Root.Attribute("dellVersion"),
hashMD5 = (string)doc.Root.Attribute("hashMD5"),
path = (string)doc.Root.Attribute("path"),
};
Console.WriteLine("dell BIOS: {0}, MD5: {1}, path: {2}", el.dellVersion, el.hashMD5, el.path);
Your code would work OK as-is if the XML had this format:
<Root>
<SoftwareComponent schemaVersion="1.0" packageID="Y75WC" releaseID="Y75WC" hashMD5="a190fdfa292276288df38507ea551a3b" path="FOLDER04650736M/1/OptiPlex_3050_1.7.9.exe" dateTime="2017-12-05T05:34:30+00:00" releaseDate="décembre 05, 2017" vendorVersion="1.7.9" dellVersion="1.7.9" packageType="LWXP" identifier="532f5a9e-c087-4499-b40c-cf7921ee06d3" rebootRequired="true">
</SoftwareComponent>
</Root>
XML documents can only have one root node, so you can't have multiple SoftwareComponent as root as you seem to imply.
If you want to get, for example, ComponentType value, you can do:
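componentTypeValue = (string)el.Descendants("ComponentType").FirstOrDefault()?.Attribute("value")
The ?. guards against a missing ComponentType element, which would otherwise make FirstOrDefault() return null and the Attribute call throw. Alternatively, change the query into a foreach and check the FirstOrDefault result for null.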
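To actually select components based on their child elements, something along these lines might work. It is a sketch under assumptions: CatalogQuery and Find are made-up names, the catalog is assumed to be an XDocument with the SoftwareComponent elements somewhere below (or at) the root, and only the ComponentType and Criticality values are filtered on. A filter on a Display text could be added to the Where clause in the same way.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class CatalogQuery
{
    // Returns (dellVersion, hashMD5, path) for every SoftwareComponent whose
    // ComponentType value and Criticality value both match the arguments.
    public static List<(string DellVersion, string HashMD5, string Path)> Find(
        XContainer catalog, string componentType, string criticality)
    {
        return catalog.Descendants("SoftwareComponent")
            // A missing child element yields null, which simply fails the comparison.
            .Where(sc => (string)sc.Element("ComponentType")?.Attribute("value") == componentType
                      && (string)sc.Element("Criticality")?.Attribute("value") == criticality)
            .Select(sc => (DellVersion: (string)sc.Attribute("dellVersion"),
                           HashMD5: (string)sc.Attribute("hashMD5"),
                           Path: (string)sc.Attribute("path")))
            .ToList();
    }
}
```

Called as CatalogQuery.Find(XDocument.Load("catalog.xml"), "BIOS", "2"), it should return only the urgent BIOS packages. Pass the XDocument rather than its Root, because Descendants on an element does not include the element itself.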

Getting attributes from xml using Linq

I have an xml document that I want to obtain attributes from
Here is the XML:
<Translations>
<Product Name="Room" ID="16">
<Terms>
<Term Generic="Brand" Product="Sub Category" />
<Term Generic="Range" Product="Brand" />
</Terms>
</Product>
<Product Name="House" ID="29">
<Terms>
<Term Generic="Category" Product="Product Brand" />
<Term Generic="Brand" Product="Category Description" />
<Term Generic="Range" Product="Group Description" />
<Term Generic="Product" Product="Product Description" />
</Terms>
</Product>
</Translations>
Here is my current Linq query
public static string clsTranslationTesting(string GenericTerm, int ProductID)
{
const string xmlFilePath = "C:\\Dev\\XMLTrial\\XMLFile1.xml";
var xmlDocument = XDocument.Load(xmlFilePath);
var genericValue =
from gen in xmlDocument.Descendants("Product")
where gen.Attribute("ID").Value == ProductID.ToString()
select gen.Value.ToString();
}
The problem is this: when I pass data into the method, it loads the XML from the file into the xmlDocument variable successfully. However, when it executes the query, it returns null. I want to obtain the ID value.
I'm a little lost with your question, but here's my attempt.
First thing is you need to change "Customer" to "Product". Your XML contains not a single instance of the word "Customer" so I think you have a typo there.
I don't know exactly what you want returned from the query (I assume just the entire matched node?). Try this:
var genericValue = xmlDocument.Descendants("Product")
.FirstOrDefault(x => x.Attribute("ID").Value == "16");
I made a fiddle here that shows it in action
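If what you are really after is the product-specific term for a given generic term (that is my guess from the method signature, not something the question states), a sketch could look like this. Translations and Lookup are made-up names, and the XML is assumed to be well formed.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; the intent is guessed from the method signature in the question.
public static class Translations
{
    // Returns the Product attribute of the matching Term, or null when
    // either the product ID or the generic term is not found.
    public static string Lookup(XContainer xml, string genericTerm, int productId)
    {
        return xml.Descendants("Product")
            .Where(p => (string)p.Attribute("ID") == productId.ToString())
            .Descendants("Term")
            .Where(t => (string)t.Attribute("Generic") == genericTerm)
            .Select(t => (string)t.Attribute("Product"))
            .FirstOrDefault();
    }
}
```

Casting the attribute to string instead of reading .Value means a Product without an ID attribute is skipped rather than causing a NullReferenceException.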

How to sort XML files by a node attribute in C#

Not asking for anyone to code this solution for me - just looking for guidance on the best approach. I'm working on an .aspx file in VS2015 using C# code behind.
I've found countless threads explaining how to sort nodes within an XML file. But, I have not found any threads on how to sort multiple XML files with the same structure, according to a common child node attribute.
My situation: I have a directory of hundreds of XML files named, simply, 0001.xml through 6400.xml. Each XML file has the same structure. I want to sort the files (not the nodes) according to the attribute of a child node.
Each XML file has an "item" parent node and has child nodes "year", "language", and "author", among others. For example:
<item id="0001">
<year>2011</year>
<language id="English" />
<author sortby="Smith">John F. Smith</author>
<content></content>
</item>
If, instead of listing the files in order 0001 through 6400, I want to list them in alphabetical order according to the item/author node's @sortby attribute, how would I do that?
One idea that I had was to create a temporary XML file that gathers the information needed from each XML file. Then, I can sort the temporary XML file and then loop through the nodes to display the files in the proper order. Something like this...
XDocument tempXML = new XDocument();
// add parent node of <items>
string[] items = Directory.GetFiles(directory);
foreach (string item in items)
{
// add child node of <item> with attributes "filename", "year", "language", and "author"
}
// then sort the XML nodes according to attributes
Does this make sense? Is there a smarter way to do this?
Sorting
We can list the XML files in sorted order using a bit of LINQ to XML, with the following code:
var xmlsWithFileName = Directory.GetFiles(directory)
.Select(fileName => new { fileName, xml = XDocument.Parse(File.ReadAllText(fileName)) })
.OrderBy(tuple => tuple.xml.Element("item").Element("author").Attribute("sortby").Value);
Each element of xmlsWithFileName will have:
an xml property, which contains the XML as an XDocument
a fileName property, which contains the path of the XML file
Assuming that your target directory contains these XML files:
0001.xml
<item id="0001">
<year>2011</year>
<language id="English" />
<author sortby="Smith">John F.Smith</author>
<content></content>
</item>
0002.xml
<item id="0002">
<year>2012</year>
<language id="Portuguese" />
<author sortby="Monteiro">Alberto Monteiro</author>
<content></content>
</item>
You can use this code to test
public static void ShowXmlOrderedBySortByAttribute(string directory)
{
var xmlsWithFileName = Directory.GetFiles(directory)
.Select(fileName => new { fileName, xml = XDocument.Parse(File.ReadAllText(fileName)) })
.OrderBy(tuple => tuple.xml.Element("item").Element("author").Attribute("sortby").Value);
foreach (var xml in xmlsWithFileName)
{
Console.WriteLine($"Filename: {xml.fileName}{Environment.NewLine}Xml content:{Environment.NewLine}");
Console.WriteLine(xml.xml.ToString());
Console.WriteLine("================");
}
}
And the output of this code is:
Filename: c:\temp\teste\0002.xml
Xml content:
<item id="0002">
<year>2012</year>
<language id="Portuguese" />
<author sortby="Monteiro">Alberto Monteiro</author>
<content></content>
</item>
================
Filename: c:\temp\teste\0001.xml
Xml content:
<item id="0001">
<year>2011</year>
<language id="English" />
<author sortby="Smith">John F.Smith</author>
<content></content>
</item>
================
As you can see, 0002.xml appears first, followed by 0001.xml.
Edit: And now that I think about it, you probably want the file contents and not the file name, if that's the case, you could instead replace the "items" array in this example with a collection of strings containing the file contents and use GetAuthor to go through that string and return the author name.
I think the best solution would be to add these file names to some sort of collection that can be sorted. This will take your file names and add them to a Lookup:
var lookup = items.ToLookup(a => GetAuthor(a)).OrderBy(a => a.Key);
This is going to rely on a method that uses the file name to get the author name:
private string GetAuthor(string filename)
{
string author = String.Empty;
// get author name logic
return author;
}
And finally, to iterate through your list:
foreach (IGrouping<string, string> author in lookup)
{
foreach (string file in author)
{
Console.WriteLine(String.Format("{0}: {1}", author.Key, file ));
}
}
If you decide you want to sort the list based on more than one criterion, you'll have to take a different approach and create a custom object, add those to a list and use a custom IComparer, but this example will allow you to avoid all that if you only care about the author name.
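For what it's worth, sorting on several criteria does not strictly need a custom IComparer; LINQ's OrderBy followed by ThenBy is another option. Here is a minimal sketch, assuming the secondary key is the year element (my choice, purely for illustration). ItemSorter and Sort are made-up names, and the method takes already loaded (file name, element) pairs so it does not touch the file system itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class ItemSorter
{
    // Orders file names by author/@sortby (case-insensitive), then by year.
    // Items missing the attribute or the year sort first because their key is null.
    public static List<string> Sort(IEnumerable<(string FileName, XElement Item)> files)
    {
        return files
            .OrderBy(f => (string)f.Item.Element("author")?.Attribute("sortby"),
                     StringComparer.OrdinalIgnoreCase)
            .ThenBy(f => (int?)f.Item.Element("year"))
            .Select(f => f.FileName)
            .ToList();
    }
}
```

The input could be built with something like Directory.GetFiles(directory, "*.xml").Select(f => (f, XElement.Load(f))).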
If I understand what you are saying correctly, this is how I would go about it:
SortedDictionary<string, string> dict = new SortedDictionary<string, string>();
var files = Directory.GetFiles(@"[path to files]", "*.xml");
foreach (var item in files)
{
XDocument doc = XDocument.Load(item);
var sortvalue = (from lv1 in doc.Descendants("somesortvalue")
select lv1.Value).First();
dict.Add(sortvalue, item);
}
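Then you can do a foreach over dict (or dict.Values) and the file names will come out ordered by sort value, thanks to the dictionary. Note that SortedDictionary.Add throws an ArgumentException if two files share the same sort value.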
There are two ways to sort the data of an XML file by the InnerText of its nodes:
1. Use LINQ. You can load all the items into a list and order them by a child element. You can write a function that takes the name of the child node as a parameter to do that.
2. Use XSLT to transform the file.
See "Sorting of XML file by XMLElement's InnerText" for more detail.
Hope it helps!
You can load items using XElement and sort them this way:
var items = System.IO.Directory.GetFiles(@"path", "*.xml")
.Select(file => System.Xml.Linq.XElement.Load(file))
.OrderBy(x => x.Element("author").Attribute("sortby").Value)
.ToList();
Also if you need file names, you can select an object containing FileName and Item:
var items = System.IO.Directory.GetFiles(@"path", "*.xml")
.Select(file => new
{
FileName = file,
Item = System.Xml.Linq.XElement.Load(file)
})
.OrderBy(x => x.Item.Element("author").Attribute("sortby").Value)
.Select(x=>x.FileName) /*or .Select(x=>x.Item)*/
.ToList();

Extracting data from the properties of an xml file

I am attempting to extract data from an xml file generated from a save function. Here is what the xml looks like when the data has been serialized
<Data>
<ParentID>00000000-0000-0000-0000-000000000000</ParentID>
<Content>&lt;ContentControl xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"&gt; &lt;Grid&gt;&lt;Image Source=".//Resources/Images/start.png" Tag="Start" ToolTip="Start" IsHitTestVisible="False" /&gt;&lt;/Grid&gt;&lt;/ContentControl&gt; </Content>
</Data>
I can read the data between the <> signs using an XElement object and extract its value using Element("Child").Value, for example for the ParentID. But I do not know how to extract the attribute data from within the Content tags, such as programmatically reading the Tag attribute of the Image, in this case Tag='Start'.
Can someone please help me resolve this?
If the problem you are running into is that the data in the Content node is a malformed fragment, then this is a way to extract that, fix the malformation and get at the data.
string asReadXml = @"<Data>
<ParentID>00000000-0000-0000-0000-000000000000</ParentID>
<Content>&lt;ContentControl xmlns=""http://schemas.microsoft.com/winfx/2006/xaml/presentation""&gt; &lt;Grid&gt;&lt;Image Source="".//Resources/Images/start.png"" Tag=""Start"" ToolTip=""Start"" IsHitTestVisible=""False"" /&gt;&lt;/Grid&gt;&lt;/ContentControl&gt; </Content>
</Data>";
var fragment = Regex.Match(asReadXml, @"(?:\<Content\>)(?<Xml>.+)(?:\</Content\>)", RegexOptions.ExplicitCapture).Groups["Xml"].Value;
var validFragment = Regex.Replace(Regex.Replace(fragment, "(&lt;)", "<"), "(&gt;)", ">");
var xDoc = XDocument.Parse("<Root>" + validFragment + "</Root>");
/* XDoc looks like this:
<Root>
<ContentControl xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation">
<Grid>
<Image Source=".//Resources/Images/start.png" Tag="Start" ToolTip="Start" IsHitTestVisible="False" />
</Grid>
</ContentControl>
</Root>
*/
var Image =
xDoc.Root
.Descendants()
.Where (p => p.Name.LocalName == "Image")
.First ();
Console.WriteLine ( Image.Attribute("Tag").Value );
// Outputs
// Start
var data = @"<Data>" +
"<ParentID>00000000-0000-0000-0000-000000000000</ParentID>" +
"<Content>&lt;ContentControl xmlns=\"http://schemas.microsoft.com/winfx/2006/xaml/presentation\"&gt;"+
"&lt;Grid&gt;&lt;Image Source=\".//Resources/Images/start.png\" Tag=\"Start\" ToolTip=\"Start\" IsHitTestVisible=\"False\" /&gt;&lt;/Grid&gt;&lt;/ContentControl&gt;" +
"</Content>" +
"</Data>";
var root = XElement.Parse(data);
var contentValue = root.Element("Content").Value;
var contentXml = XElement.Parse(contentValue);
var ns = contentXml.Name.Namespace; // retrieve the namespace
var imageTagValue = contentXml.Element(ns+"Grid").Element(ns+"Image").Attribute("Tag").Value; //
Assuming that element is an XElement object that represents the <Content> element (you already have a way to get it), you can do the following to get the Tag attribute value of the Image element:
XElement element = ....;
var content = XElement.Parse((string)element);
var ns = content.Name.Namespace;
var image = content.Descendants(ns + "Image").FirstOrDefault();
var tag = "";
if(image != null)
{
tag = (string)image.Attribute("Tag");
}
We check whether image is null before looking for its attribute. That way, you won't get an exception if there is any <Content> element that doesn't have an <Image> element. The tag variable will simply contain an empty string in that case.
This also handles the case where the <Image> element sits at a different path inside <Content> (not under the <Grid> element).
Personally, I would recommend getting the whole content as a string and then parsing it as HTML using the http://htmlagilitypack.codeplex.com/ library. That way you'll offload all the parsing to a specialized library.

update one xml based on another xml with repetitive elements with linq to xml

I have a source xml:
<Source>
<First>
<Name>Name1</Name>
</First>
<First>
<Name>Name2</Name>
</First>
</Source>
I have an empty target xml, where I want to copy data from the source xml.
Empty target xml is:
<Target>
<Second>
<FirstName></FirstName>
</Second>
<Second>
<FirstName></FirstName>
</Second>
</Target>
After copy the target xml will look:
<Target>
<Second>
<FirstName>Name1</FirstName>
</Second>
<Second>
<FirstName>Name2</FirstName>
</Second>
</Target>
I'm looking for an easy linq to xml solution. The problem is, that I don't know how to update repetitive elements in target xml based on repetitive elements from source xml.
thanks.
You can do that using the following code:
var source = "<Source><First><Name>Name1</Name></First><First><Name>Name2</Name></First></Source>";
var sourceDocument = XDocument.Load(new StringReader(source));
var target = "<Target><Second><FirstName></FirstName></Second><Second><FirstName></FirstName></Second></Target>";
var targetDocument = XDocument.Load(new StringReader(target));
var sourceNameElements = sourceDocument.Descendants("First").Select(first => first.Element("Name")).ToList();
var targetNamesElements = targetDocument.Descendants("Second").Select(second => second.Element("FirstName")).ToList();
for (var i = 0; i < sourceNameElements.Count; ++i)
{
targetNamesElements[i].SetValue(sourceNameElements[i].Value);
}
Console.WriteLine(targetDocument.ToString());
I don't know the easiest way to solve this; this is the best way I can think of. You must ensure that the number of Name elements is at least as large as the number of FirstName elements, otherwise the array index will go out of range:
var sourceXml =XElement.Parse(source);
var targetXml = XElement.Parse(target);
var i = 0;
var nameArray = (from name in sourceXml.Descendants("Name")
select name.Value).ToArray();
foreach (var fName in targetXml.Descendants("FirstName"))
{
fName.Value = nameArray[i++];
}
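If the two documents might not have the same number of elements, pairing them with Zip is one way to avoid index errors, since Zip stops at the shorter sequence. A minimal sketch follows; NameCopier and Copy are made-up names, and the element names are taken from the question.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class NameCopier
{
    // Copies each Name value from the source into the FirstName element at the
    // same position in the target. Surplus elements on either side are left untouched.
    public static void Copy(XContainer source, XContainer target)
    {
        var pairs = source.Descendants("Name")
            .Zip(target.Descendants("FirstName"), (name, firstName) => (name, firstName))
            .ToList(); // materialize the pairs before mutating the target tree

        foreach (var (name, firstName) in pairs)
            firstName.Value = name.Value;
    }
}
```

After calling NameCopier.Copy(sourceXml, targetXml), any extra FirstName elements in the target simply stay empty.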
