I have posted a question on here previously asking for similar advice, but this project has evolved significantly, so I would like to ask how the experts would tackle this problem.
First, I will describe what the problem is, then how I have currently looked at it. Please, I want to learn, so do criticise my approach and tell me what I can or should do better!
Requirements:
I have a log file decoder. I have three different systems generating log files. Each system is slightly different. There are seven different types of log files. Each log file can be in either ASCII format (human readable) or binary format (not human readable). So there are a lot of different logs, but many are similar. For example, for most, the binary and ASCII versions hold the same info in a different form.
There is also one log type which is in a totally different structure, i.e., if a, b and c are different values - each stored 6 times, most logs are type 1. One log is type 2.
type 1: abcabcabcabcabcabc
type 2: aaaaaabbbbbbcccccc
On top of this, each system has a status register. The three systems are all different in this respect. i.e. 7 * 8 bit registers, 3 * 32 bit registers... These need processing after the log is decoded (for the logs that contain the info) and then a chart needs to be plotted for other info (where required).
So, my solution so far:
I have a LogFile struct. This contains a DataTable to hold all the data. It also contains a few strings, such as serial numbers which are read from the log files, and some enums (log type, system type, encoding format).
I have a Parser class. This has some static methods: one to identify what logs are contained within a log file (an ASCII file can contain several different ones; the GUI will find out what is in there, ask the user which one they want and then decode it), and another to act as a factory and give back an instance of the Parser class. There are three types: one generic, one for binary of type 2 (above) and one for ASCII of type 2 (above).
I have a SystemType class. This contains info such as status register meanings and log structures for each type. I.e. when decoding a type, the GUI will call 'GetTable', which will give back a DataTable with columns for the fields to read from the file. The Parser can then just cycle through the columns, which tells it what type of variable to read from the file (Int, Single, String, etc).
I have a Reader class. This is abstract and has two child classes - one for ascii, one for binary. So, I can call reader.ReadInt and it will handle appropriately.
There is also a class to generate charts and decode the status register. Status registers are just an array of array of strings, giving name and description of each bit. Perhaps this could be a struct - but does it make a difference? There is also a further class which analyses 3 values in one particular log and if they are present, will insert a column with a value calculated from them (they are strings).
The whole thing just isn't very flexible, but I didn't want to write a different class for each of (3*7*2 =) 42 log types! They are too similar, yet different, so I think they would have resulted in a lot of duplicate code. This is why I came up with the idea of the DataTable and a generic Parser.
So, sorry for the long text!
I have a few other questions - I have used a DataTable for the data because I use a DataGridView in the GUI to display all of this to the user. I assumed this would simplify this, but is there a better way of doing this? When I bind the DataTable to the DataGridView, I have to go through each one looking for a particular row to highlight, adding tooltips and setting various column widths, which actually takes as long as the whole decoding process. So if there is a more efficient way of doing this, it would be great!
Thanks for any feedback!! Please, I cannot have too much advice here, as I have been playing around and rearranging for ages trying to get it into what I think is a nice solution, but it always seems clunky and very tightly coupled, especially with the GUI.
You probably want a class instead of a struct.
I wouldn't use a DataTable unless I had to. I would instead use a List<T> or something similar; you can still bind this to your DataGridView. For formatting the grid, if this is an option, buy a UI control library that will give you more options than the DataGridView does. My favorite is Telerik, but there are a bunch of them. If that isn't an option, then you'll have some custom UI logic (either JavaScript or row binding code behind) that will look at the record you're binding and make decisions based on the properties of the class.
As far as the 42 different classes, all with similar code, create an abstract base class with the reusable code, and derive from this class in your different logtype classes, overriding the base functionality where needed.
Use interfaces to separate functionality that must be implemented by the logtype, and implement those interfaces. That way, when you are iterating through a list of these classes, you know what functionality will be implemented based on the interface.
It sounds like you would greatly benefit from using interfaces to separate contract from implementation, and code to the contract to decouple your classes.
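A minimal sketch of how the base-class and interface advice might look for the log decoder. All names here (ILogReader, LogParser, InterleavedParser, GroupedParser) are hypothetical, and the fields are assumed to be plain integers to keep it short. The shared decoding loop lives in the abstract base class, and each subclass only overrides how a (record, field) pair maps to a position in the file:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Contract the parser codes against; it does not care about the encoding.
public interface ILogReader
{
    int ReadInt();
}

// Reads 4-byte little-endian integers from a binary stream.
public sealed class BinaryLogReader : ILogReader
{
    private readonly BinaryReader _reader;
    public BinaryLogReader(Stream stream) => _reader = new BinaryReader(stream);
    public int ReadInt() => _reader.ReadInt32();
}

// Reads integers from whitespace-separated ASCII text.
public sealed class AsciiLogReader : ILogReader
{
    private readonly Queue<string> _tokens;
    public AsciiLogReader(string text) =>
        _tokens = new Queue<string>(
            text.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries));
    public int ReadInt() => int.Parse(_tokens.Dequeue(), CultureInfo.InvariantCulture);
}

// Reusable decoding logic; subclasses supply only the layout.
public abstract class LogParser
{
    public List<int[]> Parse(ILogReader reader, int fieldCount, int recordCount)
    {
        var flat = new int[fieldCount * recordCount];
        for (int i = 0; i < flat.Length; i++) flat[i] = reader.ReadInt();

        var records = new List<int[]>();
        for (int r = 0; r < recordCount; r++)
        {
            var record = new int[fieldCount];
            for (int f = 0; f < fieldCount; f++)
                record[f] = flat[IndexOf(r, f, fieldCount, recordCount)];
            records.Add(record);
        }
        return records;
    }

    protected abstract int IndexOf(int record, int field, int fieldCount, int recordCount);
}

// "Type 1" layout from the question: abcabcabc...
public sealed class InterleavedParser : LogParser
{
    protected override int IndexOf(int record, int field, int fieldCount, int recordCount) =>
        record * fieldCount + field;
}

// "Type 2" layout from the question: aaa...bbb...ccc...
public sealed class GroupedParser : LogParser
{
    protected override int IndexOf(int record, int field, int fieldCount, int recordCount) =>
        field * recordCount + record;
}
```

With this split, supporting another encoding means one more ILogReader and supporting another layout means one more small LogParser subclass, rather than one class per combination.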
Hope this helps.
The only thing that pops out at me is this
I have a LogFile struct
Are you actually getting a benefit from it being a struct that outweighs the potential pitfalls?
From the guidelines
CONSIDER defining a struct instead of a class if instances of the type are small and commonly short-lived or are commonly embedded in other objects.
DO NOT define a struct unless the type has all of the following characteristics:
It logically represents a single value, similar to primitive types (int, double, etc.).
It has an instance size under 16 bytes.
It is immutable.
It will not have to be boxed frequently.
I'm a bit confused about C#'s use of attributes. At first I thought they were simply used to give program code additional information, as with the [Obsolete] attribute. Now I find that [DllImport] can be used to import a dynamic-link library and its functions. Can attributes import .exe files and other kinds of files?
A last question, for programmers working in C# every day: how much do you use attributes, and do you use them for anything other than extending information and importing DLLs?
Simply said, attributes are just metadata attached to classes or methods, at the very base.
The compiler, however, reads through your code, and runs specific actions for specific attributes it encounters while doing so, hardcoded into it. E.g., when it finds a DllImportAttribute on a method, it will resolve it to an external symbol (again, this is a very simplified explanation).
When it finds an ObsoleteAttribute, it emits a warning of deprecation.
Your own attributes (which you can create with a class inheriting from the Attribute base class) will not have an effect on the default compiler. But you (or other libraries) can also scan for them at runtime, opening up many possibilities and leading to your second question:
I typically use them to do meta programming. For example, imagine a custom network server handling packets of a specific format, implemented in different classes. Each packet format is recognized by reading an integer value. Now I need to find the correct class to instantiate for that integer.
I could do that with a switch..case or dictionary mapping integer -> packet which I extend every time I add a packet, but that is ugly since I have to touch code possibly far away from the actual Packet class whenever I add or delete a packet. I may not even know about the switch or dictionary in case the server is implemented in another assembly than my packets (modularity / extensibility)!
Instead, I create a custom PacketAttribute, storing an integer property set via the attribute, and decorate all my Packet classes with it. The server only has to scan through my assembly types at startup (via reflection) and build a dictionary of integer -> packet pairs automatically. Of course I could scan my assembly every time I need a packet, but that's probably a bit slow performance-wise.
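A rough sketch of that idea, not a definitive implementation. PacketAttribute is the name used above; Packet, LoginPacket, ChatPacket and PacketRegistry are hypothetical names made up for illustration, and all packets are assumed to live in the same assembly as the Packet base class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Custom attribute carrying the integer that identifies a packet format.
[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public sealed class PacketAttribute : Attribute
{
    public PacketAttribute(int id) => Id = id;
    public int Id { get; }
}

public abstract class Packet { }

[Packet(1)]
public sealed class LoginPacket : Packet { }

[Packet(2)]
public sealed class ChatPacket : Packet { }

public static class PacketRegistry
{
    // Built once (on first use) by scanning the assembly via reflection.
    private static readonly Dictionary<int, Type> Types = typeof(Packet).Assembly
        .GetTypes()
        .Where(t => !t.IsAbstract && typeof(Packet).IsAssignableFrom(t))
        .Select(t => (Type: t, Attr: t.GetCustomAttribute<PacketAttribute>()))
        .Where(x => x.Attr != null)
        .ToDictionary(x => x.Attr.Id, x => x.Type);

    // Looks up the class registered for the id and instantiates it.
    public static Packet Create(int id) =>
        Types.TryGetValue(id, out var type)
            ? (Packet)Activator.CreateInstance(type)
            : throw new ArgumentException($"No packet registered for id {id}.", nameof(id));
}
```

Adding a new packet then only means writing a new decorated class; the registry itself is never touched.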
There are APIs which are much more attribute heavy, like controllers in ASP.NET Core: You map full request URLs to methods in handler classes with them, which then execute the server code. Even URL parameters are mapped to parameters in that way.
Debuggers can also make use of attributes. For example, decorating a class with the DebuggerDisplayAttribute lets you provide a custom string displayed for the instances of the class when inspecting them in Visual Studio, which has a specific format and can directly show the values of important members.
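For illustration, a small sketch of DebuggerDisplayAttribute on a made-up PacketHeader class (the class and its members are hypothetical). Expressions in braces are evaluated by the debugger against the instance being inspected:

```csharp
using System.Diagnostics;

// In the debugger, an instance would show as e.g. "Packet 7 (128 bytes)".
[DebuggerDisplay("Packet {Id} ({Length} bytes)")]
public sealed class PacketHeader
{
    public int Id { get; set; }
    public int Length { get; set; }
}
```

The attribute has no effect on how the program runs; it only changes what the debugger shows.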
You can see, attributes can be very powerful if utilized nicely. The comments give some more references! :)
To answer the second part of your questions, they are also used, for example, in setting validation and display attributes for both client and server side use in a web application. For example:
[Display(Name = "Person's age")]
[Required(ErrorMessage = "Person's age is required")]
[RangeCheck(13, 59, ErrorMessage = "The age must be between 13 and 59")]
public int? PersonsAgeAtBooking { get; set; }
Or to decorate enums for use in display
public enum YesNoOnlyEnum
{
    [Description("Yes")]
    Yes = 1,
    [Description("No")]
    No = 2
}
There are many other uses.
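As a sketch of how a [Description] decoration on an enum value might be read back at runtime for display: the BookingStatus enum and the GetDescription helper below are hypothetical names, and the helper falls back to the member name when no attribute is present.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

public enum BookingStatus
{
    [Description("Awaiting payment")]
    AwaitingPayment = 1,
    [Description("Confirmed")]
    Confirmed = 2,
    Cancelled = 3   // no attribute: the helper falls back to the name
}

public static class EnumExtensions
{
    public static string GetDescription(this Enum value)
    {
        // Each enum member is a public static field on the enum type.
        FieldInfo field = value.GetType().GetField(value.ToString());
        var attribute = field?.GetCustomAttribute<DescriptionAttribute>();
        return attribute?.Description ?? value.ToString();
    }
}
```

Usage would be something like BookingStatus.AwaitingPayment.GetDescription() when filling a drop-down list.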
I have a (not quite valid) CSV file that contains rows of multiple types. Any record could be one of about 6 different types and each type has a different number of properties. The first part of any row contains the timestamp and the type of record, followed by a standard CSV of the data.
Example
1456057920 PERSON, Ted Danson, 123 Fake Street, 555-123-3214, blah
1476195120 PLACE, Detroit, Michigan, 12345
1440581532 THING, Bucket, Has holes, Not a good bucket
And to make matters more complex, I need to be able to do different things with the records depending on certain criteria. So a PERSON type can be automatically inserted into a DB without user input, but a THING type would be displayed on screen for the user to review and approve before adding to DB and continuing the parse, etc.
Normally, I would use a library like CsvHelper to map the records to a type, but in this case, since the types could be different and the first part uses a space instead of a comma, I don't know how to do that with a standard CSV library. So currently, on each loop, I do the following:
String split based off comma.
Split the first array item by the space.
Use a switch statement to determine the type and create the object.
Put that object into a List of type object.
Get confused as to where to go now, because I now have a list of various types and will have to use yet another switch or if to determine the next parts.
I don't really know for sure if I will actually need that List but I have a feeling the user will want the ability to manually flip through records in the file.
By this point, this is starting to make for very long, confusing code, and my gut feeling tells me there has to be a cleaner way to do this. I thought maybe using Type.GetType(string) would help simplify the code some, but this seems like it might be terribly inefficient in a loop with 10k+ records and might make things even more confusing. I then thought maybe making some interfaces might help, but I'm not the greatest at using interfaces in this context and I seem to end up in about this same situation.
So what would be a more manageable way to parse this file? Are there any C# parsing libraries out there that would be able to handle something like this?
You can implement an IRecord interface that has a Timestamp property and a Process method (perhaps others as well).
Then, implement concrete types for each type of record.
Use a switch statement to determine the type and create and populate the correct concrete type.
Place each object in a List<IRecord>.
After that you can do whatever you need. Some examples:
Loop through each item and call Process() to handle it.
Use LINQ .OfType<{concrete type}> to segment the list. (Warning: with 10k records, this would be slow, since it would traverse the entire list for each concrete type.)
Use an overridden ToString method to give a single text representation of the IRecord
If using WPF, you can define a datatype template for each concrete type, bind an ItemsControl derivative to a collection of IRecords and your "detail" display (e.g. ListItem or separate ContentControl) will automagically display the item using the correct DataTemplate
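The steps above could be sketched roughly as follows. Apart from IRecord, Timestamp and Process, every name is hypothetical. The sketch assumes the leading number is a Unix timestamp in seconds and that no field contains a comma (a plain Split does not handle quoting), and Process only returns a description of what a real implementation would do:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public interface IRecord
{
    DateTimeOffset Timestamp { get; }
    string Process();
}

public abstract class RecordBase : IRecord
{
    protected RecordBase(DateTimeOffset timestamp) => Timestamp = timestamp;
    public DateTimeOffset Timestamp { get; }
    public abstract string Process();
}

public sealed class PersonRecord : RecordBase
{
    public PersonRecord(DateTimeOffset timestamp, string[] data) : base(timestamp)
    {
        Name = data[0];
        Address = data[1];
        Phone = data[2];
    }
    public string Name { get; }
    public string Address { get; }
    public string Phone { get; }
    // A real implementation would insert into the database here.
    public override string Process() => $"insert person {Name}";
}

public sealed class ThingRecord : RecordBase
{
    public ThingRecord(DateTimeOffset timestamp, string[] data) : base(timestamp)
    {
        Name = data[0];
        Notes = data.Skip(1).ToArray();
    }
    public string Name { get; }
    public string[] Notes { get; }
    // A real implementation would queue the record for user review here.
    public override string Process() => $"review thing {Name}";
}

public static class RecordParser
{
    public static IRecord ParseLine(string line)
    {
        // Split on commas first, then split the "timestamp TYPE" prefix on the space.
        string[] parts = line.Split(',').Select(p => p.Trim()).ToArray();
        string[] head = parts[0].Split(' ');
        if (head.Length != 2) throw new FormatException("Expected '<timestamp> <TYPE>' prefix.");

        var timestamp = DateTimeOffset.FromUnixTimeSeconds(
            long.Parse(head[0], CultureInfo.InvariantCulture));
        string[] data = parts.Skip(1).ToArray();

        switch (head[1])
        {
            case "PERSON": return new PersonRecord(timestamp, data);
            case "THING": return new ThingRecord(timestamp, data);
            default: throw new FormatException($"Unknown record type '{head[1]}'.");
        }
    }
}
```

The only switch left is the one in ParseLine; everything after parsing can work against IRecord, for example foreach (var record in records) record.Process();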
Continuing from my comment: well, that depends. What you described is actually pretty good for starters. You can of course expand it to a series of factories, one for each object type, so that you move from an explicit switch to searching for the first factory that can parse a line. That might prove useful if you are looking to add more object types in the future: you just add another factory for the new kind of object. It is up to you whether these objects should share a common interface. An interface is generally used to define a behavior, so it doesn't seem so. Maybe you should rather just use a Dictionary? You need to ask yourself whether you actually need strongly typed objects here. Maybe what you need is a simple class with an ObjectType property and a Dictionary of properties, with some helper methods for easy typed property access like GetBool, GetInt or a generic Get<T>?
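A minimal sketch of that last idea, assuming all raw values are kept as strings and converted on access. The class name Record and the indexer are my own invention; ObjectType, GetBool, GetInt and the generic Get are the names suggested above:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public sealed class Record
{
    // Property names are matched case-insensitively for convenience.
    private readonly Dictionary<string, string> _properties =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    public Record(string objectType) => ObjectType = objectType;

    public string ObjectType { get; }

    public string this[string name]
    {
        get => _properties[name];
        set => _properties[name] = value;
    }

    public int GetInt(string name) => Get<int>(name);
    public bool GetBool(string name) => Get<bool>(name);

    // Converts the stored string to T; throws if the text is not convertible.
    public T Get<T>(string name) =>
        (T)Convert.ChangeType(_properties[name], typeof(T), CultureInfo.InvariantCulture);
}
```

The trade-off is that typos in property names and wrong types only show up at runtime, which is the price for not writing one class per record type.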
I am creating records in a table, and one column is called TYPE. I am programmatically looping through an enum in C# and creating these rows. The enum contains types of things, for example:
car
plane
boat
...
An important thing is that these types are bound to logic. My question:
Should I put these types in the enum as described above, or would it be better to put them in a separate table to have a normalized form?
What would you prefer?
Depends on two things:
Are the values unstable? [1]
Do you need to attach additional information? [2]
If the answer to either of the above questions is "yes", then using a dedicated lookup table is probably a good idea. Otherwise, constant enum values [3] that are well-known and well-documented throughout the system are OK.
The point is: don't use lookup tables blindly, as is sometimes suggested. They certainly have their place, but there are also cases where they should not be used.
[1] Existing values can change or be deleted, or new values can be added.
[2] Such as a human-readable (and potentially localizable) name or description, or some way to drive the logic from the contents of the database as opposed to hard-coding.
[3] Usually simple integers. If you find yourself needing to use strings, that probably means you should have answered "yes" to question (2).
This may be way out in left field, crazy, but I just need to ask before I go on implementing this massive set of classes.
Basically, I'm writing a binary message parser that decodes a certain military message format into an object. The problem is that there are literally hundreds of different message types and they share almost nothing in common with each other. So the way I'm planning to implement this is to create hundreds of different objects.
However, even though the message attributes share nothing in common, the method for decoding them is fairly straightforward and follows a pattern. So I'm planning to write a code generator to generate all the objects and the decode logic for each message type.
What would be really sweet is if there was some way to dynamically create an object based on some schema. It doesn't necessarily have to be XML, but XML is pretty easy to work with.
Is this possible in C#?
I would like the interface to look something like this:
var decodedMessage = MessageDecoder.Decode(byteArray);
Where the MessageDecoder figures out what type of message it is and then returns the appropriate object. It will probably return an interface which implements a MessageType Property or something like that.
Basically, what I'm wondering is whether there is a way to have one object called Message, which implements a MessageType property, and then, depending on the MessageType, have the Message object transform into whatever type of message it is, so I don't have to spend the time creating all of these message types.
ExpandoObject, where you can dynamically add fields to an object.
A good starting point is here.
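A toy sketch of what that could look like. The byte layout here is entirely made up (byte 0 is the message type, each following byte is one field), and the field names are passed in where a real decoder would look them up in a schema. ExpandoObject implements IDictionary<string, object>, which is what allows fields to be added by name at runtime:

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class MessageDecoder
{
    // Hypothetical layout: bytes[0] = message type, bytes[1..] = one byte per field.
    public static dynamic Decode(byte[] bytes, IReadOnlyList<string> fieldNames)
    {
        if (bytes.Length < fieldNames.Count + 1)
            throw new ArgumentException("Message is shorter than the schema requires.", nameof(bytes));

        var message = new ExpandoObject();
        var fields = (IDictionary<string, object>)message;

        fields["MessageType"] = (int)bytes[0];
        for (int i = 0; i < fieldNames.Count; i++)
            fields[fieldNames[i]] = (int)bytes[i + 1];

        return message;
    }
}
```

A caller could then write decoded.MessageType or decoded.Heading, but those member names are resolved at runtime, so there is no IntelliSense or compile-time checking.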
Is xsd.exe what you are looking for? It can take an XML file or a schema and generate the C# classes. One problem that you might encounter, though, is that some of the military message formats are VERY obtuse. You could end up with some very large code files.
Look at T4 templates. They let you write code to generate code, they are integrated into the IDE, and they are quite easy really.
EDIT: There is no way to do what you are after with var, because var requires the right-hand side of the assignment to be statically typed (at compile time). I suppose that you could dynamically generate that statement, then compile and run it, but that's a very painful approach.
If you have XSDs for all of the message types, then you can use xsd.exe as @jle suggests. If not, then I am curious about the following:
// Let's assume this works
var decodedMessage = MessageDecoder.Decode(byteArray);
// Now what? I don't know what properties there are on decodedMessage, so I can't do anything with it.
This question arose when I was trying to figure out a larger problem that, for simplicity's sake, I'm omitting.
I have to represent a certain data structure in C#. It's a protocol that will be used to communicate with an external system. As such, it has a sequence of strings with predefined lengths and integers (or other, more complicated data). Let's assume:
SYSTEM : four chars
APPLICATION : eight chars
ID : four-byte integer
Now, my preferred way to represent this would be using strings, so
class Message
{
    public string System { get; set; }      // four characters only!
    public string Application { get; set; } // eight chars
    public int Id { get; set; }
}
Problem is: I have to ensure that each string doesn't have more than the predefined length. Furthermore, this header will actually have tens of fields, and those will change every now and then (we are still deciding the message layout).
What is the best way to describe such a structure? I thought, for example, of using an XML file with the data description and using reflection to create a class that adheres to it (since I need to access it programmatically).
And, like I said, there is more trouble. I have other types of data types that limits the number of characters/digits...
For starters: the whole length issue. That's easily solved by not using auto-properties, but instead declaring your own field and writing the property the "old-fashioned" way. You can then validate your requirement in the setter, and throw an exception or discard the new value if it's invalid.
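A minimal sketch of the "old-fashioned" property with validation in the setter. It throws rather than silently discarding the value. The class name MessageHeader, the property name SystemName (renamed here to avoid confusion with the System namespace) and the helper RequireMaxLength are my own choices:

```csharp
using System;

public class MessageHeader
{
    private string _systemName = "";
    private string _application = "";

    public string SystemName
    {
        get => _systemName;
        set => _systemName = RequireMaxLength(value, 4, nameof(SystemName));
    }

    public string Application
    {
        get => _application;
        set => _application = RequireMaxLength(value, 8, nameof(Application));
    }

    public int Id { get; set; }

    // Returns the value unchanged if it fits, otherwise throws.
    private static string RequireMaxLength(string value, int maxLength, string fieldName)
    {
        if (value == null) throw new ArgumentNullException(fieldName);
        if (value.Length > maxLength)
            throw new ArgumentException(
                $"{fieldName} must be at most {maxLength} characters.", fieldName);
        return value;
    }
}
```

Keeping the check in one helper means that when a field's length changes, only the number passed to it has to be edited.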
For the changing structure: If it's not possible to just go in and alter the class, you could write a solution which uses a Dictionary (well, perhaps one per data type you want to store) to associate a name with a value. Add a file of some sort (perhaps XML) which describes the fields allowed, their type, and validation requirements.
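A rough sketch of the dictionary-based variant, restricted to string fields to keep it short. FieldSpec and DynamicMessage are hypothetical names. In a real application the list of specs would be loaded from the description file (for example XML) rather than built in code:

```csharp
using System;
using System.Collections.Generic;

// Describes one allowed field; this is what the description file would contain.
public sealed class FieldSpec
{
    public FieldSpec(string name, int maxLength)
    {
        Name = name;
        MaxLength = maxLength;
    }
    public string Name { get; }
    public int MaxLength { get; }
}

public sealed class DynamicMessage
{
    private readonly Dictionary<string, FieldSpec> _specs = new Dictionary<string, FieldSpec>();
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();

    public DynamicMessage(IEnumerable<FieldSpec> specs)
    {
        foreach (var spec in specs) _specs[spec.Name] = spec;
    }

    // Rejects unknown fields and values longer than the spec allows.
    public void Set(string name, string value)
    {
        if (!_specs.TryGetValue(name, out var spec))
            throw new KeyNotFoundException($"Unknown field '{name}'.");
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (value.Length > spec.MaxLength)
            throw new ArgumentException(
                $"'{name}' must be at most {spec.MaxLength} characters.", nameof(value));
        _values[name] = value;
    }

    public string Get(string name) => _values[name];
}
```

Changing the message layout then means editing the description file only, at the cost of losing compile-time checking of field names.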
However, if it's just changing because you haven't decided on a final structure yet, I would probably prefer just changing the class - if you don't need that sort of dynamic structure when you deploy your application, it seems like a waste of time, since you'll probably end up spending more time writing the dynamic stuff than you would altering the class.