How to get all the top level htmlelements from C# WebBrowser?

I want to show the DOM as it appears in the web browser, with all comments and the html, head, body, etc. elements, preserving its structure. Currently, I can only start from the html node. Document.All didn't help.
The only way I can see is webBrowser1.Document.Body, but then I would miss the comments, head, etc. If I go with Document.All instead, that gives me all the nodes.

I think the only choice with the WebBrowser control to get what you want is to use Document.All. Although this gives all elements not just top-level, each element has a .Parent element property so you can loop through them (or use Linq) and get only the ones that have <body> or <head> as the parent element.

Try using HTML Agility Pack; it supports XPath, so you can get any node you want.

As suggested by hienvd_csuit, I think HTML Agility Pack is your best option. If you still want to use the WebBrowser, a possible solution is to access the unmanaged DOM directly, using dynamic (requires .NET 4+). For instance you can do something like this:
dynamic dom = wb.Document.DomDocument;
foreach (dynamic node in dom.childNodes)
{
    Console.WriteLine("{0} - {1} - {2}", node.nodeType, node.nodeName, node.nodeValue);
}
Of course, you need to know the structure of the DOM, since intellisense doesn't work on dynamic objects; you can find some information about it here.

You should be able to check (there is a property for it somewhere) whether a particular item has child nodes, and also whether it has a parent. To keep only top-level items, discard anything that has a parent. You can also keep walking upward, such as item.parent.parent (please check IntelliSense for the exact object and property names): if that returns nothing while item.parent does return something, the item has exactly one ancestor, and in this way you can work out how many levels deep each node is. Based on the child check, the parent check, or both, you can then include an item in your collection or discard it.
Of course, you might get many P, DIV or SPAN tags as your top-level items. I'm assuming you may not want these, so feel free to discard them and query their children instead.

Related

What is the best way to associate a Nested Content node to an Umbraco Member

My initial thought was to create an Umbraco Relation and associate the Umbraco Member with the Nested Content node. Sadly, I found this forum post asking a similar question, and as you can see in Matt Bailsford's first response:
Unfortunately nested content can't have an ID value as they don't truely exist
I did find the issue/feature that was discussed in the forum post; however, it just adds parent information to the DetachedPublishedContent object and doesn't solve my issue. After reading the forum post and the conversations of Hendy Racher, Matt Bailsford and Lee Kelleher in the GitHub pull request, I still don't understand why Nested Content doesn't create a node in Umbraco.
So basically I need the Nested Content nodes to be created as Umbraco nodes and then stored as a JSON string in the property field. There are a few ways that I see this could be accomplished:
1. Create a custom property editor for the Umbraco backoffice - I would start with a copy of Nested Content and add code to create the node and attach it before saving the node as a JSON string.
2. Use the Umbraco Multinode Treepicker control - This control was suggested by Hendy and Jeavon in this forum post as a way to allow a user to select multiple content nodes. Unfortunately, going this route would require the user to create the "nested content" nodes first. Then they could associate those "nested content" nodes with the original node. We really like the user experience of the Nested Content control, which lets you create nodes dynamically in the property editor.
3. Find a way to associate the Member with the "Nested Content" node - This option would require that I store an association between the top node and its respective "nested content" node and a Member in Umbraco. Two issues come to mind with this route:
   - How should I associate the "nested content" node with the Member in a standard Umbraco way? - I immediately think of creating a link table in the database but, in my understanding, that is not the standard Umbraco way. I am still fuzzy about the best way to do this inside Umbraco.
   - Is there a way to uniquely identify the "nested content" node? - I realize there is a sort order value being set according to the pull request I found above, but if the user reorders the nested content items, will that change the "nested content" node to member association?
At this point, I am leaning towards going with option 1, but I wonder if option 3 is a better direction. In reality, I don't believe this is a new problem that someone hasn't already solved, and I hate to create another custom property editor if there is one just like it out there already.
So if you know of a better way to solve this problem please let me know.

The problem is - as you mention it - that Nested Content nodes aren't really real nodes. I don't think the right way to solve your problem is to try hacking Nested Content into doing something it really wasn't created to do.
The problem about creating nodes and also having references to them on the Nested Content node is that essentially every node in Umbraco needs to "live" somewhere.
You could choose to say that a node lives under the parent it is nested into, but how would you then differentiate between nested nodes and actual child nodes - this would require another hack as it is really working around how nodes are meant to be structured and handled in the Umbraco core.
Even if you did manage to get this working, I suspect you would have a lot to deal with to make it work as well as Nested Content currently does:
- You would somehow have to wrap every single node in the Nested Content editor in an object to be able to store metadata, like the ID of the node it is connected to and the sortOrder when reordering the nested content nodes you have on there.
  (edit: I think it actually already stores some sort of wrapper object here, but you would have to change the logic to handle a reference to another node instead of just deserializing the JSON stored here as a node)
- You would also have to manually hook into events to make sure the edits you make while on the parent node actually end up being persisted to the nested nodes.
- When deleting a nested content node, or a parent that has any nested nodes, you would have to handle deleting any orphaned nodes.
There's most likely a lot of stuff I've missed but my point is - you will have a lot of trouble trying to do this.
I think you should consider another approach if you really need to do this:
It would be possible to create a picker similar to a normal node picker, that simply allows you to browse through nodes as a normal picker would do. When you pick a node, instead of just selecting it, it should then fetch the nested content nodes and show those in the UI.
There is, however, the quirk that a single content item could have multiple properties, each storing its own set of nested content nodes, so you would need a nice way of handling this in the UI.
When you select one or multiple of the nested nodes, what your picker would store would be something similar to [guid-of-the-real-node]_[propertyAlias]_[guid-of-nested-content-item].
I am not certain if Nested Content ever got the GUID unique ID/key feature implemented - Matt and I discussed it some time last year and we tried adding it in, in a custom build I needed for a project. If it isn't there I would suggest you ask Matt if he can get that in. It was essentially just giving each nested content item a "fake" unique ID (GUID) that you could use to identify it from other nested content items stored in the property. (You would have to ask Matt about the status of this)
Doing this would give you (on your member) a reference that lets you find the actual content node, then the property where the content is stored, and lastly the actual nested content node you have picked.
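To make the composite reference concrete, here is a minimal sketch of a value type that builds and parses a string of the form [guid-of-the-real-node]_[propertyAlias]_[guid-of-nested-content-item]. The class name NestedContentReference and its members are hypothetical and not part of Umbraco or Nested Content. The sketch also assumes the nested item really does carry a GUID, which the paragraph above says is not certain. Because a GUID never contains an underscore, the parser splits on the first and the last underscore, so a property alias that itself contains underscores still parses.

```csharp
using System;

// Hypothetical helper, not an Umbraco API: models the composite picker value.
public sealed class NestedContentReference
{
    public Guid ContentKey { get; private set; }      // key of the real content node
    public string PropertyAlias { get; private set; } // property holding the nested items
    public Guid NestedItemKey { get; private set; }   // assumed GUID of the nested item

    public NestedContentReference(Guid contentKey, string propertyAlias, Guid nestedItemKey)
    {
        ContentKey = contentKey;
        PropertyAlias = propertyAlias;
        NestedItemKey = nestedItemKey;
    }

    // Produces "<content guid>_<alias>_<nested guid>".
    public override string ToString()
    {
        return ContentKey.ToString("D") + "_" + PropertyAlias + "_" + NestedItemKey.ToString("D");
    }

    // Returns null for malformed input so callers can treat it as a missing reference.
    public static NestedContentReference TryParse(string value)
    {
        if (string.IsNullOrEmpty(value))
            return null;

        int first = value.IndexOf('_');
        int last = value.LastIndexOf('_');
        if (first <= 0 || last <= first + 1)
            return null; // fewer than two separators, or an empty alias

        Guid contentKey, nestedKey;
        if (!Guid.TryParse(value.Substring(0, first), out contentKey))
            return null;
        if (!Guid.TryParse(value.Substring(last + 1), out nestedKey))
            return null;

        string alias = value.Substring(first + 1, last - first - 1);
        return new NestedContentReference(contentKey, alias, nestedKey);
    }
}
```

A stored value would be read back with NestedContentReference.TryParse(storedValue), and a null result handled the same way as any other missing reference.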
You should however note that this is very prone to breaking and needs a lot of null handling:
If you change the property alias of the property you are storing nested content in on the parent, it will lose the reference.
If you delete the content item storing the nested items, the picked items no longer exist and you have a missing reference in the picker on your member (needs null handling)
If you delete a nested content item - same as above. You have a missing reference in your picker.
Apart from the solution above, I don't really see another way of doing this currently with the requirements you have.

Searching the element using predicate condition

How could I find a framework element in VisualTree by a predicate?
Something like this:
public static FrameworkElement FindChild(FrameworkElement root, Predicate<FrameworkElement> predicate)
{
...
}
I'm going to use it something like this:
Button btn = (Button)FindChild(MainForm, element => element is Button);
Thanks for help in advance!

You may use LINQ to find out the controls of particular type, maybe like this:
List<Button> btns = Controls.OfType<Button>().ToList();

So the real question is how to iterate through all the children of the given "root" element, because then you can call your predicate on each element and pick the ones you want.
I suppose you should distinguish two cases here. When the element is a Panel, pass it to the predicate first, then iterate over its Children property and pass in each of those (both recursive and non-recursive approaches work, but in both cases you have to descend into the tree and come back up through the levels). When the element is not a Panel, just pass that element to the predicate.
You should also think about elements that have a Content property (this is defined on ContentControl): check the content element the same way. And that's all.
Regards,
Artak
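As a sketch of the recursive walk described above, the following uses WPF's VisualTreeHelper.GetChildrenCount and VisualTreeHelper.GetChild instead of treating Panel.Children and Content as separate cases. That is a substitution on my part, not what the answer prescribes. It covers both cases because panels and content controls expose their children through the visual tree. The class name VisualTreeSearch is hypothetical. It assumes root is a Visual and that the visual tree has already been built (for templated controls this normally means after the element has loaded).

```csharp
using System;
using System.Windows;
using System.Windows.Media;

public static class VisualTreeSearch
{
    // Depth-first search below root. Returns the first FrameworkElement
    // that satisfies the predicate, or null when nothing matches.
    public static FrameworkElement FindChild(DependencyObject root, Predicate<FrameworkElement> predicate)
    {
        int count = VisualTreeHelper.GetChildrenCount(root);
        for (int i = 0; i < count; i++)
        {
            DependencyObject child = VisualTreeHelper.GetChild(root, i);

            // Not every visual is a FrameworkElement, so test before calling the predicate.
            var element = child as FrameworkElement;
            if (element != null && predicate(element))
                return element;

            // Descend into this child before moving on to its siblings.
            FrameworkElement match = FindChild(child, predicate);
            if (match != null)
                return match;
        }
        return null;
    }
}
```

With this signature, the call from the question becomes Button btn = (Button)VisualTreeSearch.FindChild(MainForm, element => element is Button); the cast is needed because the method returns a FrameworkElement.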

The answers to this SO question describe many ways to look for controls in the visual tree.
The predicate version is given there as a link to this.

Indexed access to XElement‘s child nodes

I am parsing an XML document using LINQ to XML and XDocument. Is there a way for an XElement/XContainer to get a child node by its index (in document order), so that I can get the nth node of the element?
I know I can probably do that by getting all the child nodes of that element and converting that IEnumerable to a List, but that already sounds like it would add a highly redundant overhead (as I am only interested in a single child node).
Is there something I missed in the documentation?

No, there is no indexed access to a child element using XElement or XContainer. If you want indexed access, you have two options.
The first is to call the Elements method on XContainer (which returns an IEnumerable<T> of XElement instances in document order) and then use the Skip extension method to skip over the elements to reach the particular child.
If you want to access the child elements often by index, then you should place them in a IList<T> (which has indexed access), which is easy enough with the ToList extension method:
IList<XElement> indexedElements = element.Elements().ToList();
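For the one-off case, a minimal sketch of the first option: Enumerable.ElementAtOrDefault does the same job as Skip(n) followed by FirstOrDefault, and it stops enumerating as soon as it reaches the requested index, so no list is built. The class and method names here are hypothetical.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class XElementIndexing
{
    // Returns the child element at the zero-based index in document order,
    // or null when the index is out of range.
    public static XElement ChildAt(XContainer parent, int index)
    {
        // Elements() is lazily evaluated; only the first index + 1 children are visited.
        return parent.Elements().ElementAtOrDefault(index);
    }

    // Same idea for all child nodes (text, comments, elements), not only elements.
    public static XNode NodeAt(XContainer parent, int index)
    {
        return parent.Nodes().ElementAtOrDefault(index);
    }
}
```

For example, given XElement root = XElement.Parse("<r><a/><b/><c/></r>"), the call XElementIndexing.ChildAt(root, 1) returns the b element. Each call still walks the children from the start, so the ToList approach remains the better fit when you index repeatedly.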

What about the Skip(n).Take(1) operators?

Maybe the Take(topN) LINQ operator can help?
I'm making an assumption based on some SQL-related experience: you cannot get the row directly, but you can take the top n elements.
This can help if your list is huge and you don't need to reach the last elements.

Can I find Logical Children by Type

I know I can use the LogicalTreeHelper class to find a child node of any element by searching for it by name. But is it possible to find a child node by type? For example, what if I would like to find a ListBox element in my Window without knowing its Name property?

I don't think that there is a built in way of doing this. Probably the best approach would be to recursively call LogicalTreeHelper.GetChildren() until a child control of the specified type is found.

Note that descending the logical tree cleanly is actually a little tricky; here's a nice article on the intricacies of both the visual and logical trees.
I don't think any helper code exists to do this for you so implementing a recursive walk over the tree is required.
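A minimal sketch of such a recursive walk, assuming a depth-first search that returns the first match. The class and method names are hypothetical. LogicalTreeHelper.GetChildren can yield plain objects (a string used as content, for example), so each child is checked before recursing.

```csharp
using System.Windows;

public static class LogicalTreeSearch
{
    // Returns the first descendant of type T in the logical tree, or null if there is none.
    public static T FindLogicalChild<T>(DependencyObject parent) where T : DependencyObject
    {
        foreach (object child in LogicalTreeHelper.GetChildren(parent))
        {
            // Skip logical children that are not DependencyObjects (strings, numbers, ...).
            var dependencyObject = child as DependencyObject;
            if (dependencyObject == null)
                continue;

            var typed = dependencyObject as T;
            if (typed != null)
                return typed;

            // Not a match: search this child's own logical children.
            T match = FindLogicalChild<T>(dependencyObject);
            if (match != null)
                return match;
        }
        return null;
    }
}
```

Inside a Window, the ListBox from the question could then be looked up with ListBox list = LogicalTreeSearch.FindLogicalChild<ListBox>(this);.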

Walking an XML tree in C#

I'm new to .NET and C#, so I want to make sure I'm using the right tool for the job.
The XML I'm receiving is a description of a directory tree on another machine, so it can go many levels deep. What I need to do now is take the XML and create a structure of objects (custom classes) and populate them with info from the XML input, like File, Folder, Tags, Property...
The tree structure of this XML input makes it, in my mind, a prime candidate for using recursion to walk the tree.
Is there a different way of doing this in .NET 3.5?
I've looked at XmlReaders, but they seem to walk the tree in a linear fashion, which is not really what I'm looking for...
The XML I'm receiving is part of a 3rd party API, so it is outside my control and may change in the future.
I've looked into deserialization, but its shortcomings (black box implementation, need to declare members as public, slow, only works for simple objects...) take it off the list as well.
Thanks for your input on this.

I would use the XLINQ classes in System.Xml.Linq (this is both the namespace and the assembly you will need to reference). Load the XML into an XDocument:
XDocument doc = XDocument.Parse(someString);
Next you can either use recursion or a pseudo-recursion loop to iterate over the child nodes. You can choose your child nodes like this:
//if Directory is tag name of Directory XML
//Note: Root is just the root XElement of the document
var directoryElements = doc.Root.Elements("Directory");
//you get the idea
var fileElements = doc.Root.Elements("File");
The variables directoryElements and fileElements will be IEnumerable types, which means you can use something like a foreach to loop through all of the elements. One way to build up your elements would be something like this:
List<MyFileType> files = new List<MyFileType>();
foreach (XElement fileElement in fileElements)
{
    files.Add(new MyFileType()
    {
        // Assumes properties are child elements and that Prop1/Prop2 are strings;
        // the cast reads the element's text and yields null if the element is missing.
        Prop1 = (string)fileElement.Element("Prop1"),
        Prop2 = (string)fileElement.Element("Prop2"),
    });
}
In the example, MyFileType is a type you created to represent files. This is a bit of a brute-force attack, but it will get the job done.
If you want to use XPath, you will need to add a using directive for System.Xml.XPath.
A Note on System.Xml vs System.Xml.Linq
There are a number of XML classes that have been in .Net since the 1.0 days. These live (mostly) in System.Xml. In .Net 3.5, a wonderful, new set of XML classes were released under System.Xml.Linq. I cannot over-emphasize how much nicer they are to work with than the old classes in System.Xml. I would highly recommend them to any .Net programmer and especially someone just getting into .Net/C#.

XmlReader isn't a particularly friendly API. If you can use .NET 3.5, then loading into LINQ to XML is likely to be your best bet. You could easily use recursion with that.
Otherwise, XmlDocument would still do the trick... just a bit less pleasantly.
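A minimal sketch of that recursion with LINQ to XML. The real schema comes from the third-party API and is not shown in the question, so the element names Directory and File, the name attribute, and the FolderNode/FileNode classes are all assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical model classes; replace with your own File/Folder types.
public class FileNode
{
    public string Name { get; set; }
}

public class FolderNode
{
    public string Name { get; set; }
    public List<FolderNode> Folders { get; private set; }
    public List<FileNode> Files { get; private set; }

    public FolderNode()
    {
        Folders = new List<FolderNode>();
        Files = new List<FileNode>();
    }
}

public static class DirectoryTreeBuilder
{
    // Converts one <Directory> element, and everything nested inside it, into a FolderNode.
    public static FolderNode Build(XElement directory)
    {
        // Assumed layout: <Directory name="..."> containing <File name="..."/> and <Directory> children.
        var folder = new FolderNode { Name = (string)directory.Attribute("name") };

        foreach (XElement file in directory.Elements("File"))
            folder.Files.Add(new FileNode { Name = (string)file.Attribute("name") });

        // The recursion stops on its own at a directory with no <Directory> children.
        foreach (XElement child in directory.Elements("Directory"))
            folder.Folders.Add(Build(child));

        return folder;
    }
}
```

Calling DirectoryTreeBuilder.Build(XDocument.Parse(xml).Root) would then return the whole tree, provided the root element is itself a Directory.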

This is a problem which is very suitable for recursion.
To elaborate a bit more on what another poster said, you'll want to start by loading the XML into a System.Xml.XmlDocument, (using LoadXml or Load).
You can access the root of the tree using the XmlDocument.DocumentElement property, and access the children of each node by using the ChildNodes property. ChildNodes returns a collection, and when that collection has a Count of 0, you know you have reached your base case.
Using LINQ is also a good option, but I'm unable to elaborate on that solution because I'm not really a LINQ expert.
As Jon mentioned, XmlReader isn't very friendly. If you end up having perf issues, you might want to look into it, but if you just want to get the job done, go with XmlDocument/ChildNodes using recursion.
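The XmlDocument/ChildNodes recursion can be sketched as follows. The class name XmlTreeWalker and the callback parameter are hypothetical; the callback only keeps the traversal separate from whatever you do with each node.

```csharp
using System;
using System.Xml;

public static class XmlTreeWalker
{
    // Visits the node and all of its descendants depth-first,
    // passing each node and its depth below the starting node to the callback.
    public static void Walk(XmlNode node, int depth, Action<XmlNode, int> visit)
    {
        visit(node, depth);

        // Base case: when ChildNodes is empty the loop body never runs.
        foreach (XmlNode child in node.ChildNodes)
            Walk(child, depth + 1, visit);
    }
}
```

After loading with XmlDocument.LoadXml, a call such as XmlTreeWalker.Walk(doc.DocumentElement, 0, (n, d) => Console.WriteLine(new string(' ', d * 2) + n.Name)); prints an indented outline of the document. Note that ChildNodes also contains text and comment nodes, so filter on node.NodeType if you only want elements.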

Load your XML into an XmlDocument. You can then walk the XmlDocument's DOM using recursion.
You might also want to look into the factory method pattern for creating your classes; it would be very useful here.
