Traversing an arbitrary C# object graph using XPath/applying XSL transforms - c#

I've been looking for a component that would allow me to pass an arbitrary C# object to an XSL transform.
The naive way of doing this is to serialise the object graph using an XmlSerializer; however, if you have a large object graph, this could cause problems as far as performance is concerned. Issues such as circular references, lazy loading, proxies etc may further muddy the waters here.
A better approach is to have some kind of Adapter class that implements IXPathNavigable and XPathNavigator. One such example that I've encountered is the ObjectXPathNavigator from Byte-Force -- however, most of its key documentation is in Russian, and my initial tests seem to indicate that it has a few quirks and idiosyncrasies.
Does anyone know of either (a) any resources (overviews, tutorials, blog posts etc) about this particular approach in English or (b) any other alternatives that offer the same or similar functionality?

There's a (very) old MSDN article titled XPath Querying Over Objects with ObjectXPathNavigator that implements a similar class (also called ObjectXPathNavigator, interestingly enough). I used this ages ago to query some data from Visual SourceSafe and build an RSS feed from the changelog, and it worked quite well. However, I didn't do XSLT with it, so I'm not sure whether that works. Note that it was written for Framework 1.0, so you may need to update it for more recent frameworks. There may be better ways to do this now, but it would give you a starting point (and the article does a nice job of explaining how it works).

Sounds as though the problem you're trying to solve is quite interesting.
At first glance, I'd suggest writing your own implementation of an XPathNavigator descendant - there are only 20-odd methods to write, and none of them have a particularly difficult signature.
A naive implementation using non-cached reflection would be slow(ish) but would work well as a proof of concept and you could make changes to improve performance if/when that became an issue.
However ...
... I think you may run into some difficulties that stem from your approach, not from any implementation detail.
An XML file is (by nature) a simple hierarchy of elements and attributes - there are no loops (aka cycles) in the node graph.
An XPath expression can include the operator "//" which broadly means to search to unlimited depth. (For an exact definition, see section 2.5 of XPath 1.0.)
If you applied such an expression to an object graph with cross references (aka object cycles), then you run the risk of the XPath evaluator going into an infinite loop as it tried to recursively enumerate an effectively infinite graph.
You may be able to work around this issue by somehow keeping track of parent nodes in your XPathNavigator and throwing an exception if a loop is detected, but I'm not sure how viable this will be.
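A minimal sketch of the parent-tracking idea, assuming each navigator position wraps one object and remembers the position it was reached from. The names GraphNode, CreateChild and IsAncestorOrSelf are hypothetical and not part of any XPathNavigator API; a real navigator would run a check like this whenever it moves to a child.

```csharp
using System;

// Hypothetical wrapper for one position in the object graph.
public sealed class GraphNode
{
    public object Value { get; }
    public GraphNode Parent { get; }

    public GraphNode(object value, GraphNode parent)
    {
        Value = value;
        Parent = parent;
    }

    // Walks up the chain of parents and reports whether the candidate
    // object is already on the path from the root to this node.
    public bool IsAncestorOrSelf(object candidate)
    {
        for (GraphNode node = this; node != null; node = node.Parent)
        {
            if (ReferenceEquals(node.Value, candidate))
                return true;
        }
        return false;
    }

    // Creates the child position, refusing to descend into an object
    // that would close a loop.
    public GraphNode CreateChild(object child)
    {
        if (IsAncestorOrSelf(child))
            throw new InvalidOperationException("Cycle detected in object graph.");
        return new GraphNode(child, this);
    }
}
```

The check costs one walk up the ancestor chain per move, so it grows with depth rather than with the size of the graph. An object reachable along two different paths is still visited twice; only true loops are rejected.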

Since the object graph may be cyclic, you cannot make a tree-based structure out of it directly. Your best bet is to represent the object graph by its simplest components: nodes and vectors.
More specifically, make each node (object) an element with a unique ID. C#'s GetHashCode() method is tempting here, but hash codes are not guaranteed to be unique, so a counter assigned the first time each object is seen is safer. References to other objects (vectors) would be handled by referencing the ID of the object.
Example classes (note that I don't know C# so my syntax may be a bit off):
public class SomeType {
    public int myInt { get; set; }
}
public class AnotherType {
    public string myString { get; set; }
    public SomeType mySomeType { get; set; }
}
public class LastType {
    public SomeType mySomeType { get; set; }
    public AnotherType myAnotherType { get; set; }
}
public class UserTypes {
    static void Main()
    {
        LastType lt = new LastType();
        SomeType st = new SomeType();
        AnotherType atype = new AnotherType();
        st.myInt = 7;
        atype.myString = "BOB";
        atype.mySomeType = st;
        lt.mySomeType = st;
        lt.myAnotherType = atype;
        string xmlOutput = YourAwesomeFunction(lt);
    }
}
Then we would expect the value of xmlOutput to be something like this (note that the ID values chosen are completely synthetic):
<ObjectMap>
    <LastType id="0">
        <mySomeType idref="1" />
        <myAnotherType idref="2" />
    </LastType>
    <SomeType id="1">
        <myInt>7</myInt>
    </SomeType>
    <AnotherType id="2">
        <myString>BOB</myString>
        <mySomeType idref="1" />
    </AnotherType>
</ObjectMap>
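One way such a function might look is sketched below, using reflection and LINQ to XML. ObjectMapWriter and ToObjectMap are hypothetical names standing in for YourAwesomeFunction, and the sketch rests on several assumptions: only public instance properties are read, strings and value types are written as text, every other non-null value is treated as a referenced object, and type names must be valid XML names (so collections and generic types would need extra handling).

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Xml.Linq;

// Example types from the answer above, repeated so the sketch is self-contained.
public class SomeType { public int myInt { get; set; } }
public class AnotherType
{
    public string myString { get; set; }
    public SomeType mySomeType { get; set; }
}
public class LastType
{
    public SomeType mySomeType { get; set; }
    public AnotherType myAnotherType { get; set; }
}

public static class ObjectMapWriter
{
    // Compares by reference, so equal-but-distinct objects get distinct IDs.
    private sealed class ReferenceComparer : IEqualityComparer<object>
    {
        bool IEqualityComparer<object>.Equals(object x, object y) => ReferenceEquals(x, y);
        int IEqualityComparer<object>.GetHashCode(object obj) => RuntimeHelpers.GetHashCode(obj);
    }

    public static XElement ToObjectMap(object root)
    {
        var ids = new Dictionary<object, int>(new ReferenceComparer());
        var pending = new Queue<object>();
        var map = new XElement("ObjectMap");

        // Assigns the next counter value the first time an object is seen
        // and schedules that object to be written out.
        int IdOf(object o)
        {
            if (!ids.TryGetValue(o, out int id))
            {
                id = ids.Count;
                ids.Add(o, id);
                pending.Enqueue(o);
            }
            return id;
        }

        IdOf(root);
        while (pending.Count > 0)
        {
            object current = pending.Dequeue();
            Type type = current.GetType();
            var element = new XElement(type.Name, new XAttribute("id", ids[current]));
            foreach (PropertyInfo p in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
            {
                if (!p.CanRead || p.GetIndexParameters().Length > 0) continue;
                object value = p.GetValue(current);
                if (value == null) continue;
                if (value is string || value.GetType().IsValueType)
                    element.Add(new XElement(p.Name, value));            // leaf value
                else
                    element.Add(new XElement(p.Name, new XAttribute("idref", IdOf(value))));
            }
            map.Add(element);
        }
        return map;
    }
}
```

Because every object is queued at most once, a cycle simply produces an idref back to an element that has already been assigned an ID, and the walk terminates. Calling ObjectMapWriter.ToObjectMap(lt).ToString() on the example graph yields one element per distinct object, with both mySomeType references pointing at the same id.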

You could try something like this:
http://code.google.com/p/antix-software/wiki/AntixReflectionQuery

Related

protobuf-net converts List<T> to List_T class when using .ToProto()

I have a requirement to take a library of C# classes that implement protobuf-net, and convert them into .proto files, which need to be converted using protoc into .py files. I understand that the .ToProto() function does this just fine, but I came up against an issue involving collections and generics when converting from .proto to .py files. When trying to serialize a list of DateTimes, for example I get the following error X.proto:64:13. "List_TimeSpan" is not defined. As this had not caused an issue upon serialization into a protobuf file, I wasn't aware of this situation at the time.
I am currently using protobuf-net 2.3.2 for this project; it's the version some of my other work has been done with, and I am aware that this might be solved simply by a version upgrade. I'm just not sure that is the answer, given the digging I've done so far. If there's something else that I'm missing, I would truly appreciate any help that can be thrown my way.
If we consider:
[ProtoContract]
public class Foo {
    [ProtoMember(12)]
    public List<DateTime> Times { get; } = new List<DateTime>();
}
then GetProto<T>() in both v2.3.2 (the version mentioned in the question) and v2.4.4 (the current default version) generates:
syntax = "proto2";
import "protobuf-net/bcl.proto"; // schema for protobuf-net's handling of core .NET types
message Foo {
    repeated .bcl.DateTime Times = 12;
}
So on the surface of it, it should already be just fine. If you're doing something more exotic (perhaps using a list in a dictionary value?), I'd be happy to help, but I'm going to need more of a clue as to what you're doing. Posting some C# that shows the thing you're seeing would be a great place to start.
Note that when protobuf-net first came around, there was no agreed transmission format for date/time-like values, so protobuf-net made something up, but it turns out to not be a convenient fit for cross-platform work; the following is a hard breaking change (it is not data compatible), but if possible, I would strongly recommend the well-known format that Google added later:
[ProtoContract]
public class Foo {
    [ProtoMember(12, DataFormat = DataFormat.WellKnown)]
    public List<DateTime> Times { get; } = new List<DateTime>();
}
which generates:
syntax = "proto2";
import "google/protobuf/timestamp.proto";
message Foo {
    repeated .google.protobuf.Timestamp Times = 12;
}

Is there anything like RegEx, but for structured objects?

I feel like what I'm looking for should exist, but I don't know what it's called. All searches for "regex for objects" just return tutorials and questions about normal RegEx. Searches for "pattern matching" return news about C# 7's new pattern matching feature, which isn't what I'm trying to accomplish.
To illustrate what I'm after, assume you have the following class:
public class Car
{
public string Color { get; set; }
public int MilesDriven { get; set; }
public bool IsAllWheelDrive { get; set; }
}
And then assume you have a List of Car objects with random and varied properties. I'd like to be able to search through the list for RegEx-like patterns and get the beginning and ending indexes of each instance the pattern occurs.
Example patterns would be:
Find all instances where a white car with all wheel drive occurs between 2 blue cars and the first blue car has more than 1000 miles on it.
Find all instances where a red car is immediately followed by at least 2 green cars and eventually is followed by a car with less than 100 miles on it.
This is a bit of a contrived question, but I would like to know if anything like this exists, preferably as an existing C# library.
Apologies if "pattern-matching" isn't an applicable tag for this question, but as I stated, I don't really know what, if anything else, to call this.
What you are looking for is the filter concept. C# does provide an implementation of this concept for lists; however, it gets complicated if you aren't just filtering on the attributes of a single object. Microsoft makes the filtering capability generic across C# with LINQ. See this post, or this post for an example:
Filtering collections in C#
Basically the syntax is:
var newlist = list.linqquery1.linqquery2...linqqueryN.Where(s => s.x == condition);
Of course, if it's simple enough you can do the following:
var newlist = list.Where(s => s.x == condition);
But your problem also calls for selection based on subsequent items in the list. This is a whole lot more complicated, because you won't be able to access those elements unless you attach that data to the element in the list. For example, if your list element were actually a doubly linked list node, you could look ahead in the list using only a forward node reference and run a condition like this (to check whether two green cars follow):
var green2follows = carlist.Where(s => s.next.type == greencar && s.next.next.type == greencar);
You could, however, conceive of a situation in which it wouldn't be necessary to use doubly linked lists, if you implemented this yourself with iteration. Unfortunately, because LINQ works primarily on enumeration-based queries, you have to find a workaround to use Microsoft's built-in utilities for filtering (this is not unique to Microsoft; queries typically don't include locality). This post covers that conclusion.
To do this iteratively, you would create a for loop and test against the i + 1 and i + 2 values of cars. Be careful, as this gets messy with previous values (i - n). Iterators might be good for avoiding errors here, though I'm not sure whether C# supports iterator arithmetic like other languages to let you go backwards and forwards. You may be forced to make a custom iterator to define this kind of filter generically.
EDIT: you can avoid creating a custom iterator and merely create a custom object, returned by the iterator, that supports forward and backward looking via iterator blocks (as the answer in this post suggests).
What you might do is something like this (an outline; the method bodies are left out):
class HandleObject<T>
{
    ...
    public HandleObject(int i, List<T> list)...
    public bool backwardsWhere(Func<T, bool> condition)...
    public bool forwardsWhere(Func<T, bool> condition)...
    public bool backwardsNWhere(int n, Func<T, bool> condition)...
    public bool forwardsNWhere(int n, Func<T, bool> condition)...
    // other forwards filter functions for convenience
    // get back the element that we wrapped
    public T get()...
}
static IEnumerable<HandleObject<T>> iteratorBlock<T>(List<T> list)
{
    for (int i = 0; i < list.Count; i++)
    {
        // yield means a new HandleObject is only created upon access, and
        // need not be stored otherwise
        yield return new HandleObject<T>(i, list);
    }
}
Instead of LINQing over your list directly, query this iterator, parameterised on your list's element type. That way you don't have to write your own iterator class in C#, while still being able to implement all the functionality you need to move around the list.
var customIterableFromBlock = iteratorBlock<Car>(carlist);
var selectedCars = from handleCar in customIterableFromBlock
                   where ... // handleCar.backwardsWhere(...) && handleCar.forwardsWhere(...)
                             // handleCar.get().x == condition etc
                   select handleCar.get(); // unwrap the Car from its handle
The reason you would include i and the list itself in the HandleObject constructor is to allow forward and backward searching through the list based on the position i passed in.
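A compact runnable sketch of the handle idea, applied to a simplified version of one pattern from the question (a red car immediately followed by two green cars). The names Handle, NextN, Wrap and CarPatterns are hypothetical, and the semantics chosen for NextN (the next n elements exist and all match) is only one possible reading of the look-ahead helpers outlined above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Car
{
    public string Color { get; set; }
    public int MilesDriven { get; set; }
    public bool IsAllWheelDrive { get; set; }
}

// Hypothetical wrapper: pairs an element with its position so that a
// LINQ predicate can inspect neighbouring elements.
public sealed class Handle<T>
{
    private readonly IReadOnlyList<T> list;
    public int Index { get; }
    public T Value => list[Index];

    public Handle(int index, IReadOnlyList<T> list)
    {
        Index = index;
        this.list = list;
    }

    // True if the n elements immediately after this one exist and all match.
    public bool NextN(int n, Func<T, bool> condition)
    {
        if (Index + n >= list.Count) return false;
        for (int k = 1; k <= n; k++)
        {
            if (!condition(list[Index + k])) return false;
        }
        return true;
    }

    // Lazily wraps every element; a handle is only created when enumerated.
    public static IEnumerable<Handle<T>> Wrap(IReadOnlyList<T> list)
    {
        for (int i = 0; i < list.Count; i++)
            yield return new Handle<T>(i, list);
    }
}

public static class CarPatterns
{
    // Returns the (start, end) index pairs of every match.
    public static List<(int Start, int End)> RedThenTwoGreen(IReadOnlyList<Car> cars) =>
        Handle<Car>.Wrap(cars)
            .Where(h => h.Value.Color == "Red" && h.NextN(2, c => c.Color == "Green"))
            .Select(h => (h.Index, h.Index + 2))
            .ToList();
}
```

Open-ended parts of a pattern, such as "eventually followed by" or "at least 2", would need further helpers that scan forward from the handle's index; this sketch only covers the fixed-length case.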

An easy way to validate an XML against a C# Class

I use the XML format to keep settings for my C# project.
These XMLs are deserialized into C# classes.
Often enough, a class design changes but I forget to update the XML. Deserialization usually works fine and missing elements just get their default values.
This behavior is not desirable for me. I would like to have automatic validation that asserts the class and the XML have exactly the same structure.
I know I can create XML schemas (e.g using XSD) for my classes, but I could not figure an easy way to do it automatically and not manually (recall my classes' design changes often). Also, this solution seems kind of unnatural. I do not really need XML schemas. I have classes and their serialized XML instances. Adding schemas seems superfluous.
Thanks a bunch.
Why don't you create a method in your settings class that can be invoked after XML deserialization?
Suppose you have a class like this:
class Settings {
    public string Property1 { get; set; }
    public int Property2 { get; set; }
    public bool IsValid() {
        if (string.IsNullOrEmpty(Property1)) return false;
        if (Property2 == 0) return false;
        return true; // every check passed
    }
}
Using IsValid you can check everything in your class. Please remember that this is only an example. I think it is good practice to let the object manage its own validation: if you change something over time, you can edit the validation method to check the new situation.
To go with Roberto's idea and take it a step further, you could get all the properties via reflection:
var props = yourClass.GetType().GetProperties();
Inside your validation function you could loop over those properties with:
foreach (var prop in props) // or foreach (var prop in yourClass.GetType().GetProperties())
{
    //...Validation of property
}
If one of the properties still has its default value, you throw a custom exception that tells you that you did not update your XML file properly.
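The reflection loop could be filled in roughly as follows. This is a sketch under assumptions: DefaultValueCheck and PropertiesLeftAtDefault are hypothetical names, "default" is taken to mean default(T) for the property's type, and the Settings class repeats the example above so the block is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class Settings
{
    public string Property1 { get; set; }
    public int Property2 { get; set; }
}

public static class DefaultValueCheck
{
    // Returns the names of public instance properties whose current value
    // equals the default for the property's type (null, 0, false, ...).
    public static List<string> PropertiesLeftAtDefault(object instance)
    {
        var names = new List<string>();
        foreach (PropertyInfo prop in instance.GetType()
                     .GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers.
            if (!prop.CanRead || prop.GetIndexParameters().Length > 0) continue;

            object value = prop.GetValue(instance);
            object defaultValue = prop.PropertyType.IsValueType
                ? Activator.CreateInstance(prop.PropertyType)
                : null;

            if (Equals(value, defaultValue))
                names.Add(prop.Name);
        }
        return names;
    }
}
```

A caller would deserialize the settings and then throw if the returned list is non-empty. The obvious limitation is that a setting legitimately equal to its default (for example a bool that should be false) is reported as missing, so this works best when such values are rare or the properties are made nullable.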
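You can implement this using Version Tolerant Serialization (VTS): https://msdn.microsoft.com/en-us/library/ms229752%28v=vs.110%29.aspx
The serialization callbacks are the part of VTS you are looking for. Note that these callbacks are invoked by the formatter-based serializers (and DataContractSerializer), not by XmlSerializer, so this only helps if you are able to switch serializers.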

Best way to access attributes

I am working on a framework that uses some Attribute markup. This will be used in an MVC project and will occur roughly every time I view a specific record in a view (eg /Details/5)
I was wondering if there is a better/more efficient way to do this or a good best practices example.
At any rate, I have a couple of attributes, e.g.:
[Foo("someValueHere")]
String Name {get;set;}
[Bar("SomeOtherValue")]
String Address {get;set;}
What is the most efficient way/best practice to look for these attributes/Act on their values?
I am currently doing something like this:
[System.AttributeUsage(AttributeTargets.Property)]
class FooAttribute : Attribute
{
    public string Target { get; set; }
    public FooAttribute(string target)
    {
        Target = target;
    }
}
And in my method where I act on these attributes (simplified example!):
public static void DoSomething(object source)
{
    //is it faster if I make this a generic function and get the type from T?
    Type sourceType = source.GetType();
    //get all of the properties marked up with a foo attribute
    var fooProperties = sourceType
        .GetProperties()
        .Where(p => p.GetCustomAttributes(typeof(FooAttribute), true)
            .Any())
        .ToList();
    //go through each fooproperty and try to get the value set
    foreach (var prop in fooProperties)
    {
        object value = prop.GetValue(source, null);
        // do something with the value
        prop.SetValue(source, myModifiedValue, null);
    }
}
Attribute.GetCustomAttribute and PropertyInfo/MemberInfo.GetCustomAttribute are the recommended ways of getting at attribute objects.
That said, I wouldn't normally enumerate all properties with attributes; you generally want to work with a particular attribute, so you'd just call GetCustomAttribute directly. If you're looking for attributes on any of your properties, enumerating those properties and checking each one with GetCustomAttribute(), the way you're doing it, is the best way to do it.
There is not really much choice when dealing with attributes. Your code is OK and reasonable as it is, and it is also unlikely to be your main performance concern. The only immediate change is to drop the ToList call, which is unnecessary.
Side note: a performance-related question should look approximately like this:
"I've measured my code and portion XXX seems to be taking too much time (YYY). The time goal for this piece of code is ZZZ. Is my way of doing XXX reasonable/where can I improve it?"
Note that in your case you are missing the YYY and ZZZ time portions, so you can't really say whether it is slow for your case or not. You may want to start your measurements with DB/other IO-bound operations, as improving those is more likely to speed up your overall code.
Once you have established that this attribute-related code is the main performance issue, you can consider some sort of caching of the results, or even code generation of some sort (either by caching lambdas that would set the necessary values, or even full-blown IL generation).
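The simplest form of the caching mentioned above is to remember, per type, which properties carry the attribute, so the reflection scan runs once per type rather than once per request. The sketch below assumes a hypothetical FooPropertyCache helper and a Person class used purely for illustration; FooAttribute mirrors the one in the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Property)]
public sealed class FooAttribute : Attribute
{
    public string Target { get; }
    public FooAttribute(string target) { Target = target; }
}

// Illustrative type with one marked and one unmarked property.
public class Person
{
    [Foo("someValueHere")]
    public string Name { get; set; }
    public string Address { get; set; }
}

public static class FooPropertyCache
{
    // One entry per inspected type; ConcurrentDictionary keeps this safe
    // when several web requests hit the cache at the same time.
    private static readonly ConcurrentDictionary<Type, PropertyInfo[]> Cache =
        new ConcurrentDictionary<Type, PropertyInfo[]>();

    public static PropertyInfo[] GetFooProperties(Type type) =>
        Cache.GetOrAdd(type, t => t.GetProperties()
            .Where(p => p.IsDefined(typeof(FooAttribute), true))
            .ToArray());
}
```

DoSomething would then iterate over FooPropertyCache.GetFooProperties(source.GetType()) instead of re-running the query. This removes the repeated attribute lookup but not the cost of PropertyInfo.GetValue/SetValue, which is where the cached-lambda or IL-generation options would come in if measurements justify them.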

Serialization - Viewing the Object Graph from a Stream

I'm wondering if there's a way in which I can create a tree/view of a serialised object graph, and whether anyone has any pointers? EDIT The aim is that, should we encounter a de-serialization problem for some reason, we can actually view/produce a report on the serialized data to help us identify the cause of the problem before having to debug the code. Additionally, I want to extend this in the future to take two streams (version 1, version 2) and highlight the differences between them, to help ensure that we don't accidentally remove interesting information during code changes. /EDIT
Traditionally we've used Soap or XML serialization, but these are becoming too restricted for our needs, and Binary serialization would generally do all that we need. The reason that this hasn't been adopted, is because it's much harder to view the serialized contents to help fix upgrade issues etc.
So I've started looking into trying to create a view on the serialized information. I can do this from an ISerializable constructor to a certain extent :
public A(SerializationInfo info, StreamingContext context)
{}
Given the serialization info, I can reflect the m_data member and see the actual serialized contents. The problems with this approach are:
It will only display a branch of the tree. I want to display the entire tree from the root, and that's not really possible from this position.
It's not a convenient place to interrogate the information. I'd like to pass a stream to a class and do the work there.
I've seen the ObjectManager class, but this works on an existing object graph, whereas I need to be able to work from the stream of data. I've looked through the BinaryFormatter, which uses an ObjectReader and a __BinaryParser, hooking into the ObjectManager (which I think will then have the entire contents, just maybe in a flat list), but to replicate this or invoke it all via reflection (2 of those 3 classes are internal) seems like quite a lot of work, so I'm wondering if there's a better approach.
You could put a List<Child class> in every parent class (even if they're the same),
and when you create a child you immediately place it in that list, or better yet create it while adding it to the list.
For instance:
ListName.Add(new Child(Constructor args));
Using this, you would serialize them as one file which contains the hierarchy of the objects and the objects themselves.
If the parent and child classes are the same, there is no reason why you cannot have a dynamic, multi-level hierarchy.
In order to achieve what you describe, you would have to deserialize the whole object graph from the stream without knowing the type from which it was serialized. But this is not possible, because the serializer doesn't store such information.
AFAIK it works in the following way. Suppose you have a couple of types:
class A { bool p1; }
class B { string p1; string p2; A p3; }
// instantiate them:
var b = new B { p1 = "ppp1", p2 = "ppp2", p3 = new A { p1 = true } };
When the serializer is writing this object, it starts walking the object graph in some particular order (I assume alphabetical order) and writes the object type and then its contents. So your binary stream will look like this:
[B:[string:ppp1][string:ppp2][A:[bool:true]]]
You see, there are only values and their types here. The order is implicit: it is simply the order in which things were written.
So, if you change your class B to, say,
class B { A p1; string p2; string p3; }
the serializer will fail, because it will try to assign an instance of string (which was serialized first) to a reference to A. You may try to reverse engineer how binary serialization works; then you may be able to create a dynamic tree of serialized objects. But this will require considerable effort.
For this purpose I would create a class similar to this:
class Node
{
    public string NodeType;
    public List<Node> Children;
    public object NodeValue;
}
Then, while you are reading from the stream, you can create those nodes, recreate the whole serialized tree and analyze it.
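Once such a tree exists, producing the report the question asks for is a short recursive walk. The sketch below assumes the Node shape above (with Children initialised to an empty list for convenience) and a hypothetical NodeDumper helper; how the nodes get populated from the binary stream is left open, since that depends on parsing the serialization format.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class Node
{
    public string NodeType;
    public List<Node> Children = new List<Node>();
    public object NodeValue;
}

public static class NodeDumper
{
    // Renders the tree as indented text, one node per line.
    public static string Dump(Node root)
    {
        var sb = new StringBuilder();
        Write(root, 0, sb);
        return sb.ToString();
    }

    private static void Write(Node node, int depth, StringBuilder sb)
    {
        sb.Append(' ', depth * 2).Append(node.NodeType);
        if (node.NodeValue != null)
            sb.Append(": ").Append(node.NodeValue);
        sb.Append('\n');

        foreach (Node child in node.Children)
            Write(child, depth + 1, sb);
    }
}
```

Dumping two versions of a stream this way and running an ordinary text diff over the two reports would be one cheap route to the version comparison described in the question.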
