Is there a "Data Conversion Object" principle/pattern?


I recently asked a question about a String object with fixed length in C#.
(Please read that question first.)
Some of the answers pointed out that my design might be flawed.
Since the last question was about strings with a fixed length, this one is about the underlying principle. This question might be a little long, so please bear with me.
Requirements:
I have a plain text file containing values of a specified fixed length. The standard for these text files is from the '90s. I have to create such a file.
A File may contain 1-60 Rows.
There are 10 different types of Rows.
A Row has between 10-40 values.
A Row is specified like this:
Back in the '90s there was an application that created those files and placed them on a server. The server then read the file and did something with it, such as writing it to the database or informing somebody that something went wrong.
This application isn't usable anymore due to recent legal changes.
Suggested design
The new application that is in its place doesn't provide any data in the form of an export, but it has a database with the values inside. I have the responsibility to write a converter, so I have to get the data and write an exported text file. The data is only sent and never received!
Question
Since a DTO's only purpose is to transfer state, and it should have no behavior (POCO vs DTO),
is there something like a "Data Conversion Object" whose purpose is to convert the data that is transferred? Is there a design pattern that is applicable?

I recently designed a solution for a similar problem, though my solution was in the SAS language, which is not object-oriented. But to me the problem seems pretty much the same. Now, let's dissect the problem:
The problem:
There are some plain text files.
These files have a specification covering the layout, fields, types, etc.
These files need to be converted to some other format.
Solution (object-oriented):
I would define three classes, PlainTextFile, Specification, and Output, plus a Reader class.
Specification: The constructor takes a specification (probably stored in a file or similar) and parses it into a Specification object.
PlainTextFile: This can be a handle to a text file, or a wrapper around the handle if some other feature is added to it. I prefer the second option.
Output: This is the output you would like to produce.
Reader: It takes two inputs, a PlainTextFile and a Specification. It uses the Specification to read and parse the PlainTextFile and writes the result in the Output object/format.
Now, the output may or may not be the final step. I suggest that the Reader do only this much. If you want to write the output to a database or send it somewhere, create another class to do that.
Remember, I don't know the name of this pattern. Actually, I don't think that matters much. For me, this method solved a problem that had existed in the company for a decade, and it integrated two of the most used systems there.
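A minimal C# sketch of the specification-driven idea described above, under stated assumptions: the names FieldSpec, RowSpecification and FixedWidthRow are hypothetical, padding with spaces is assumed, and the real file standard may define alignment and fill characters differently. It covers both directions, since the question needs to write rows while the answer reads them.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical names: these types illustrate the idea and are not from any library.
public sealed class FieldSpec
{
    public string Name { get; }
    public int Length { get; }
    public bool RightAlign { get; }

    public FieldSpec(string name, int length, bool rightAlign = false)
    {
        Name = name;
        Length = length;
        RightAlign = rightAlign;
    }
}

// One specification per row type: an ordered list of fixed-length fields.
public sealed class RowSpecification
{
    public IReadOnlyList<FieldSpec> Fields { get; }

    public RowSpecification(params FieldSpec[] fields)
    {
        Fields = fields;
    }
}

public static class FixedWidthRow
{
    // Writes the values into one fixed-length line, padding with spaces (assumed fill).
    public static string Format(RowSpecification spec, IReadOnlyDictionary<string, string> values)
    {
        var line = new StringBuilder();
        foreach (var field in spec.Fields)
        {
            string value = values.TryGetValue(field.Name, out var found) ? found : "";
            if (value.Length > field.Length)
                throw new ArgumentException("Value for '" + field.Name + "' is longer than " + field.Length + " characters.");
            line.Append(field.RightAlign ? value.PadLeft(field.Length) : value.PadRight(field.Length));
        }
        return line.ToString();
    }

    // Reads one fixed-length line back into named, trimmed values.
    public static Dictionary<string, string> Parse(RowSpecification spec, string line)
    {
        var result = new Dictionary<string, string>();
        int offset = 0;
        foreach (var field in spec.Fields)
        {
            result[field.Name] = line.Substring(offset, field.Length).Trim();
            offset += field.Length;
        }
        return result;
    }
}
```

Keeping the layout in a RowSpecification means the ten row types differ only in data, not in code, and writing the finished lines to disk or sending them elsewhere can stay in a separate class.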

Related

Using application data structures other than xml

I'm designing a survey tool. The survey will be very static and because of that, I can avoid building some kind of table-driven survey designer to accommodate the 167 questions on the survey (all 1-5 rating questions in a radio box or checkbox layout).
I was thinking of building the survey questions in a large XML file, but my non-technical co-worker that will be making frequent edits to the survey will likely do things that will break the integrity/validity of the raw xml file (think punctuation and special characters).
The XML file might look something like:
<questions>
  <question>
    <type>checkbox</type>
    <text>Which beers do you like most</text>
    <choices>Bud,Miller,Piels</choices>
    <Required>true</Required>
  </question>
  <question>
    <type>radio</type>
    <text>Which beer is your favorite</text>
    <choices>Bud,Miller,Piels</choices>
    <Required>true</Required>
  </question>
</questions>
Please use your imagination that this structure will be a bit more complex and that there will be 165 more questions.
Complicating matters, I need these questions in some form of object-oriented layout so that I can take the results and align them to other stuff. I had considered hard-coding a very lengthy survey form with 167 questions, but I need the data in blocks so that I can parse out question 37 and align it to something else in some other feature, that is related to question 37.
Here's what I'd like to do in a .Net app:
Define an enumerable class for this.
Do something where I can manually fill an enumerable collection of this class with all of the data I need. Using the p-code that would be familiar in my .asp world . . .
questions q = new questions()
q.type = "checkbox";
q.text = "which beers do you enjoy";
q.choices = "Bud,Miller,Peils";
q.required = true;
q.add
q.type = "radio";
q.text = "what is your favorite beer";
q.choices = "Bud,Miller,Peils";
q.required = true;
q.add
My hope is that this .cs file (though foreign looking to the lay person) would be much easier for my co-worker to maintain, without me having to worry about syntax errors.
So, I guess I'm looking for some feedback on:
Is this just a dumb idea? Should I do this in XML, consume the XML file, and be done with it?
WWYD - What would you do? Is there an easier way to do this?
I don't care about performance as a relatively small number of users are using this.
I don't care about maintainability, because we will write this feature properly in the summer.
I just need to create a data structure that is not in a DB and that can be maintained by a non-technical person with a text-editor (for now).
If anyone made it this far, I appreciate it.

Everyone uses Excel, so consider using a CSV format, which can be read by you as well as by Excel, which your counterpart will be using. You must tell the user that the columns can't be changed, which is not a drawback per se. The user exports their changes to CSV, and the program reads and verifies the file.
Plus, the user does not have to be trained to use Excel, so it is a win/win situation given your requirement not to use XML.

As permanent store XML is good.
But that does not mean the user needs to edit the XML directly.
I would build the ability to edit, add, and delete the questions in the app.
Yes, that is a bit of trouble, but if they hack the XML then that is also a lot of trouble.
How do you plan to save survey results?
How do you plan to collect the survey results?
There is more to this project than you are realizing.
Do you need to combine results from more than one device?
If more than one device then you need to separate the questions from the results so you can update the questions on more than one device.
There are tools to read and write XML to disk.
Reading XML with the XmlReader
I don't agree with doug that you need to embed a database.
For a small number of questions I would use XML.
I would read all the XML into an object collection (A List).
You don't need a class that implements IEnumerable.
You put your objects in a collection that implements IEnumerable.
I would go WPF over WinForms.
A ListBox with a DataTemplate.
On the DataTemplate you can have a dynamic selector in code behind, but that is a real hassle.
Consider a single template that you manipulate in code behind.
So they are not RadioButtons but you uncheck the others in code behind.
For filtering I would go LINQ in public properties but there is also CollectionViewSource.
I used XML for an app that collected field measurements.
It was a lot like this, in that the measuring devices could change and we still needed to collect the measurements.
If you are set on user editing the questions directly then XML with XSD is the best I can think of.
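As a rough sketch of reading all the XML into an object collection (a List), here is one way it could look with LINQ to XML. The Question and QuestionLoader names are hypothetical, and the element names are taken from the sample in the question, so a real file with a richer structure would need more handling.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical model class mirroring the <question> element from the sample.
public sealed class Question
{
    public string Type { get; set; } = "";
    public string Text { get; set; } = "";
    public List<string> Choices { get; set; } = new List<string>();
    public bool Required { get; set; }
}

public static class QuestionLoader
{
    // Parses the XML text and returns one Question per <question> element.
    // XDocument.Parse throws an XmlException if the file is not well-formed,
    // which is where a co-worker's editing mistake would surface.
    public static List<Question> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("question")
            .Select(q => new Question
            {
                Type = (string)q.Element("type") ?? "",
                Text = (string)q.Element("text") ?? "",
                Choices = ((string)q.Element("choices") ?? "")
                    .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                    .Select(choice => choice.Trim())
                    .ToList(),
                Required = (bool?)q.Element("Required") ?? false
            })
            .ToList();
    }
}
```

A plain List<Question> is already enumerable, so question 37 is simply the element at index 36, and LINQ can filter the list without a custom collection class.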

If you are looking for a simple human-readable structured format, then you could be interested in YAML.
YAML is a human-readable data serialization format that takes concepts
from programming languages such as C, Perl, and Python, and ideas from
XML and the data format of electronic mail.
Your question file would look like this:
questions:
  - id: 1
    type: checkbox
    text: Which beers do you like most
    choices: Bud,Miller,Piels
    Required: true
  - id: 2
    type: radio
    text: Which beer is your favorite
    choices: Bud,Miller,Piels
    Required: true
Some YAML libraries exist for .NET (from the article):
https://github.com/aaubry/YamlDotNet
http://yaml.codeplex.com/
http://www.codeproject.com/Articles/28720/YAML-Parser-in-C
http://yaml-net-parser.sourceforge.net/

There are plenty of xml editing tools out there that will actually make it easier to edit than editing a text file directly. I use XML Marker and it's pretty easy to use. http://symbolclick.com/
It will be quicker to train them to edit using the tool than it will be to build one.

Two answers here;
a: Write it to allow a proper admin interface, using a database to allow admin users to add/edit questions, response options and include appropriate security, auditing etc. You mention that this may not be feasible in the short term or that a 'proper' feature will be added soon, in which case, scrap this!
b: People say they have frequent edits/changes to make, but isn't that a requirement that belongs with the complete feature? Could you not, in the short term, accept manual requests for change via email or some other documented channel, and make them yourself? Do you think the time taken to add a question/response or change some wording would be less than the time needed to parse XML manually to find a syntax error from someone who isn't familiar with it?
You'll need to weigh up frequency of change with impact to yourself of making a change vs likelihood of user error, vs estimated time needed to identify and resolve a syntax error (plus the possible bad-will of having a change break things).
Despite what some people think, users don't like making mistakes! Putting them in a position where they have admin-level powers over a system they don't have a full technical grasp of could reduce confidence and future buy-in to the feature you're due to develop.
TLDR; In my opinion, unless it's a major hassle, do the changes yourself in the short term, perhaps with a cap on how often you'll make them (I make one change set a week, on a Friday, for example). Keep the system working perfectly, and involve the users without putting them in the uncomfortable position of being involuntary early adopters of a feature that isn't finished.

I used my complete mastery over winforms to create a little mock GUI application that enables users to quickly create one dimensional non conditional lists of questions with different question types.
Once you have decided on an XML schema you can easily import and export XML files.
Are you interested in further development of the magical survey creator? If so, tell me and I will send you a practically finished prototype tomorrow morning. (You should provide me with an XML schema though, otherwise I will do it in CSV.)
I enjoy the exercise.
Picture related. Don't be put off by the colors, that's how I like it during development, to see the pixel exact boundaries of controls.
Unless your coworkers have some experience with programming or xml editing they will hate you if you instruct them to edit any sort of "code".
Our secretaries put their hand in front of their faces and start chanting "no, no, no..." when I tell them how to operate VBA macros.

How does OLEDB deal with mixed data types when working with .csv files?

I have been writing an implementation that reads a .csv data file into a C# DataTable and does some basic string manipulation.
It all works well until it meets a cell with a mixed data type, for example "40C". Then it doesn't even read the cell; it just skips it.
I've been researching possible ways to solve this issue online, but it looks like there is no other way to do it.
I've read somewhere that I would need to use SQL or Access to make it work, but wouldn't I then run into the same problems when dealing with mixed data types?
Does anyone know of a better way to solve this problem? I would really love to stay with the .csv extension though.
P.S. I am not posting my code because it works, just not for mixed data types, but if you insist, I can post it.
Thank you in advance!

Answering my own question:
Basically, I tried to use a Schema.ini file but it still didn't work.
Instead, I found a simpler solution. The problem was fixed by changing the ImportMixedTypes setting to "Text" in RegEdit here: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Text on a 64-bit machine or HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text on a 32-bit machine.

Manipulating a Python file from C#

I'm working on some tools for a game I'm making. The tools serve as a front end that makes editing game files easier. Several of the files are Python scripting files. For instance, I have an Items.py file that contains the following (minimized for the example):
from ItemModule import *
import copy
class ScriptedItem(Item):
    def __init__(self, name, description, itemtypes, primarytype, flags, usability, value, throwpower):
        Item.__init__(self, name, description, itemtypes, primarytype, flags, usability, value, throwpower, Item.GetNextItemID())
    def Clone(self):
        return copy.deepcopy(self)
ItemLibrary.AddItem(ScriptedItem("Abounding Crystal", "A colourful crystal composed of many smaller crystals. It gives off a warm glow.", ItemType.SynthesisMaterial, ItemType.SynthesisMaterial, 0, ItemUsage.Unusable, 0, 50))
As I mentioned, I want to provide a front end for editing this file without requiring an editor to know Python or edit the file directly. My editor needs to be able to:
Find and list all the class types (in this example, it'd be only ScriptedItem)
Find and list all created items (in this case there'd only be one, Abounding Crystal). I'd need to find the type (in this case ScriptedItem) and all the parameter values
Allow editing of parameters and the creation/removal of items.
To do this, I started writing my own parser, looking for the class keyword and for places where these recorded classes are used to construct objects. This worked for simple data, but when I started using classes with complex constructors (lists, maps, etc.) it became increasingly difficult to parse correctly.
After searching around, I found IronPython made it easy to parse Python files, so that's what I went about doing. Once I built the Abstract Syntax Tree I used PythonWalkers to identify and find all the information I need. This works perfectly for reading in data, but I don't see an easy way to push updated data into the Python file. As far as I can tell, there's no way to change the values in the AST, much less to convert the AST back into a script file. If I'm wrong, I'd love for someone to tell me how I could do this. What I'd need to do now is search through the file until I find the correct line, then try to push the data into the constructor, ensuring correct ordering.
Is there some obvious solution I'm not seeing? Should I just keep working on my parser and make it support more complex data types? I really thought I had it with the IronPython parser, but I didn't think about how tricky it'd be to push modified data back into the file.
Any suggestions would be appreciated

You want a source-to-source program transformation tool.
Such a tool parses a language to an internal data structure (invariably an AST), allows you to modify the AST, and then can regenerate source text from the modified AST without changing essentially anything about the source except where the AST changes were made.
Such a program transformation tool has to parse text to ASTs, and "anti-parse" (called "Prettyprint") ASTs to text. If IronPython has a prettyprinter, that's what you need.
If it doesn't, you can build one with some (maybe a lot of) effort; as you've observed, this isn't as easy as one might think. See my answer "Compiling an AST back to source code".
If that doesn't work, our DMS Software Reengineering Toolkit with its Python front end might do the trick. It has all the above properties.

Provided you can find a complete and up-to-date context-free grammar file for Python, you could use the Coco/R parser generator to generate a Python parser in C#.
You can add production code to the grammar file itself to populate a data structure in your C# app. That data structure can hold all the information you need (methods and their arguments, properties, constructors, destructors, etc.). Once you have this data structure, it's just a matter of designing a front end for the user and representing the data structure in a way that makes it editable (this is more of a design task than a complicated programming task).
Finally, iterate through your data structure and write out a .py file.
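The last step, writing the data structure back out as Python, could look roughly like the sketch below. ItemEntry and PythonItemWriter are hypothetical names, the constructor argument order is copied from the ItemLibrary.AddItem line in the question, and the string escaping covers only backslashes, double quotes and newlines, so it is an illustration rather than a complete Python literal writer.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical model: one entry per ItemLibrary.AddItem(...) call in Items.py.
public sealed class ItemEntry
{
    public string ClassName { get; set; } = "ScriptedItem";
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";
    public string ItemTypes { get; set; } = "ItemType.SynthesisMaterial";
    public string PrimaryType { get; set; } = "ItemType.SynthesisMaterial";
    public int Flags { get; set; }
    public string Usability { get; set; } = "ItemUsage.Unusable";
    public int Value { get; set; }
    public int ThrowPower { get; set; }
}

public static class PythonItemWriter
{
    // Minimal escaping for a double-quoted Python string literal (assumption:
    // only backslash, double quote and newline need handling in item texts).
    private static string Quote(string text)
    {
        return "\"" + text.Replace("\\", "\\\\").Replace("\"", "\\\"").Replace("\n", "\\n") + "\"";
    }

    // Emits one constructor call with the arguments in the order the script expects.
    public static string ToAddItemCall(ItemEntry item)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "ItemLibrary.AddItem({0}({1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}))",
            item.ClassName, Quote(item.Name), Quote(item.Description),
            item.ItemTypes, item.PrimaryType, item.Flags, item.Usability,
            item.Value, item.ThrowPower);
    }

    // Writes the fixed header (imports and class definitions) followed by all items.
    public static string WriteAll(string header, IEnumerable<ItemEntry> items)
    {
        var script = new StringBuilder();
        script.Append(header.TrimEnd()).Append('\n');
        foreach (var item in items)
            script.Append(ToAddItemCall(item)).Append('\n');
        return script.ToString();
    }
}
```

Regenerating the whole item section from the model avoids having to locate and patch individual lines, at the cost of losing any hand-written comments or formatting in that section.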

You can use the python inspect module to print the source of an object. In your case: To print the source of your module - the file you just parsed with IronPython. I haven't checked to see if inspect works with IronPython yet, though.
As to adding stuff, well, it's a module, right? You can just add stuff to a module... I'd load the module, alter it, use inspect to print it, and save the result to disk.
From your post, it looks like you're already deep in the trenches and having fun, so I'd be really happy to see a post here on how you solved this problem!

To me it sounds more like you are at the point where you shove it all into a sqlite database and start editing it that way. Hooking up some forms to edit tables is simpler for the UI. At that point you generate new python files by dumping your tables out with some formatting to provide the surrounding python scripts.
SVN / Git / whatever can merge the updated changes via the python files.
This is what I ended up doing for my project at any rate. I started using python to hook up the various items using their computed keys and then just added some forms UI to avoid editing mistakes in the python files.

best way to store/use multiple languages

If I wanted to create a C# application that supports multiple languages, how should I store the strings?
I'd probably use constants in the application as value holders.
Such as:
Console.Write(FILE_NOT_FOUND);
When compiled, it would change into the string determined by the language.
I'll probably stick to 3 languages (Danish, English, German), not that I think it matters though.
It seems to be a waste to have a class file for each language, all of which are processed when the application is compiled. It would also mean that you'd have to recompile and redistribute the whole program every time you want to change a string.
As far as I know, hardcoded strings are a bad thing.
Maybe a text file?
English.txt
Line1: FILE_NOT_FOUND=File Not Found. Try Again
Line2
Line3
etc.
Danish.txt
Line1: FILE_NOT_FOUND=Filen blev ikke fundet. Prøv igen
Line2
Line3
etc.
and so on.
If the user selects English, it reads the text file and sets the different constant values.
The last one I can think of is placing it in a SQL database.
Could you give me some input? :)
Also, I tried writing FILE _ NOT _ FOUND (without spaces), but the text editor wouldn't let me.

Use a resource file. That's the standard way to handle localization.
For details, see this tutorial.
--- EDIT ---
An alternative tutorial is available here. This one uses much better naming, so it may be more clear how it works.

I think your best option is to use the built in localization resources of the .NET Framework. You can read more about the mechanics of that here.
As for using a database to store your localised elements (text, images and the like), this is certainly a common option, but I think that's mostly because developers understand getting data from a database better than working with satellite assemblies and the like. There are a number of problems with using a database, so I'll name only a few: 1) added complexity of deploying the application, 2) additional load on the database server, 3) where do you store the localized messages that say the database is down? :)
Using some sort of text file (likely XML) also carries some deployment issues, but more importantly the perceived flexibility of making text changes 'on the fly' is somewhat overrated. Apart from fixes for spelling mistakes and awkward wording, you'll almost always be shipping a new build when the text of your app changes.

Check out the Localization/Internationalization samples here:
http://msdn.microsoft.com/en-us/goglobal/bb688096.aspx
I've also heard good things about this book.

This topic is far too large for a reply.
The process of making a program ready for new languages is normally called "internationalization" or "i18n", and the process of taking that and actually making it run is "localization" or "l10n".
Briefly, you want to have hardcoded strings replaced by string resources, as you say, and then typically create different resource files for different languages. Assuming you're working in .NET (a fairly good assumption for C#), there's a lot of stuff Microsoft does to make it easier.
Remember that there are other localization issues than language. For example, the Danish currency symbol is not '$' but 'kr.' (Danish krone), dates are almost certainly abbreviated differently, and many places use ',' for the decimal point and '.' for the thousands separator, opposite from the English practice.
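A small sketch of how culture-specific separators show up in .NET: the helper below formats the same amount with whatever culture name it is given. CultureDemo is a hypothetical name, and the exact output for a given culture depends on the globalization data available on the machine, so treat the results as illustrative.

```csharp
using System;
using System.Globalization;

// Hypothetical helper showing that number formatting follows the culture,
// not the language of the UI strings.
public static class CultureDemo
{
    // Formats with two decimals and group separators as defined by the culture.
    public static string FormatAmount(decimal amount, string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return amount.ToString("N2", culture);
    }

    // Returns the decimal separator the culture uses.
    public static string DecimalSeparator(string cultureName)
    {
        return CultureInfo.GetCultureInfo(cultureName).NumberFormat.NumberDecimalSeparator;
    }
}
```

Calling FormatAmount(1234.5m, "en-US") and FormatAmount(1234.5m, "da-DK") side by side makes the swapped separators visible, which is why numbers and dates should go through the culture-aware formatting APIs instead of being stored as translated strings.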

What is the best way to implement precomputed data?

I have a computation that calculates a resulting percentage based on certain input. But these calculations can take quite some time, which can be annoying. Since there are about 12500 possible inputs, I thought it would be a good idea to precompute all the data, and look this up during normal program execution.
My first idea was to just create a simple file which is read at program initialization and populates some arrays. Although this will work, I would like to know if there are some other options? For example that the array is populated during compile time.
BTW, I'm writing my code in C#.

This tutorial here implements a serializer, which you can use to easily convert an object to a binary file and back. Once you have the serializer in hand, you can just create an object that holds all your data and serialize it; when you actually run your program, just deserialize the object and use it.
This has all the benefits of saving an object to the hard drive, with an implementation that is object-agnostic (meaning you don't have to write much code for any object you want to serialize) and outputs in binary (thus saving space, if that is a concern).
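The linked tutorial is not reproduced here, so the following is only a sketch of the same idea using BinaryWriter and BinaryReader instead of a general-purpose object serializer. It assumes the precomputed data is a flat array of doubles, and LookupTableFile is a hypothetical name.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper: stores a precomputed table as a length prefix
// followed by the raw double values.
public static class LookupTableFile
{
    public static void Save(Stream stream, double[] table)
    {
        // leaveOpen: true so the caller stays in charge of the stream's lifetime.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(table.Length);
            foreach (double value in table)
                writer.Write(value);
        }
    }

    public static double[] Load(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            int count = reader.ReadInt32();
            var table = new double[count];
            for (int i = 0; i < count; i++)
                table[i] = reader.ReadDouble();
            return table;
        }
    }
}
```

With about 12500 entries the file is roughly 100 KB, so one precomputation run can Save to a FileStream and the application can Load the table once at startup.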

A file with data is probably the easiest and most flexible way to implement it.
If you wanted it in memory without having to read it from somewhere, I would write a program to output your data in C#-like CSV format suitable for copying and pasting into an array/collection initializer, and thereby generate the source code for your precomputed data.

Create a program that outputs valid C# code which initializes your lookup tables. Make this part of your build process so that it will automatically create the source file and then build the rest of your project.
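A rough sketch of such a generator, assuming the inputs can be indexed 0..count-1 and the results are finite doubles. LookupTableGenerator, the generated class name and the Values field are all hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical build-time tool: produces C# source text containing the
// precomputed values as an array initializer.
public static class LookupTableGenerator
{
    public static string Generate(string className, Func<int, double> compute, int count)
    {
        var source = new StringBuilder();
        source.AppendLine("// <auto-generated /> Do not edit by hand.");
        source.AppendLine("public static class " + className);
        source.AppendLine("{");
        source.AppendLine("    public static readonly double[] Values =");
        source.AppendLine("    {");
        for (int i = 0; i < count; i++)
        {
            // Invariant culture keeps '.' as the decimal point on every machine.
            // Assumes finite values: NaN or infinity would not be a valid literal.
            string literal = compute(i).ToString("R", CultureInfo.InvariantCulture);
            source.AppendLine("        " + literal + ",");
        }
        source.AppendLine("    };");
        source.AppendLine("}");
        return source.ToString();
    }
}
```

A small console program can write the returned text to a .cs file with File.WriteAllText before the main project compiles, so the table ends up baked into the assembly.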

As Daniel Lew said, serialize it into a binary file.
If you need speed, go for a Dictionary. A Dictionary is indexed on its key, and should allow rapid lookup even with large amounts of data.

I would always start by considering whether there is any way to avoid precomputing. If there are 12500 possible inputs, how many are required per user request? Will all 12500 be needed at the same time, or will they be spread out in time? If you can get by with calculating a few at a time, I'd do that with lazy initialization. I prefer this solution simply because I'll have fewer issues with it in the long run. What do you do when the persistent format changes, or the data changes? How will you handle it when the file is missing or corrupted? Persisting to a file does not mean less code.
I would serialize such a file to a human-readable format if I had to persist a pre-loaded version. I'd probably use XML serialization since it's simple. But quite often there are issues of invalidation and recalculation. Do the values never change, or only very infrequently?
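The lazy-initialization option could be sketched as a small cache that computes each value on first use. LazyPercentageCache is a hypothetical name, and an int input is assumed here only to keep the example short.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache: each input is computed on first request and reused afterwards.
public sealed class LazyPercentageCache
{
    private readonly Func<int, double> _compute;
    private readonly ConcurrentDictionary<int, double> _cache =
        new ConcurrentDictionary<int, double>();

    public LazyPercentageCache(Func<int, double> compute)
    {
        _compute = compute;
    }

    // Note: under concurrent access GetOrAdd may run the computation more than
    // once for the same key, but only one result is stored.
    public double Get(int input)
    {
        return _cache.GetOrAdd(input, _compute);
    }

    public int CachedCount
    {
        get { return _cache.Count; }
    }
}
```

Nothing is persisted, so there is no file format to version or repair. The trade-off is that the first request for each input still pays the full computation time.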

I agree with mquander and Trent. Use your favorite language or script to generate the whole C# file you need to define your data (no copy-pasting, that's a manual step and error-prone). Add it as a Pre-Build event in Visual Studio. You could even detect that you have an up-to-date file and avoid regeneration for most builds.
There is definitely a way to statically generate almost any data using template metaprogramming in C++, although it can be painful. It's not worth it unless you need many sets of different data in several parts of your program. I am not familiar enough with metaprogramming in C# to evaluate the general effort in your case. You should look into that.
