In my search, I have seen a lot of examples of reading and writing XML files. All of them require setting up parameters or classes for every read and write operation.
Is it possible to read and write an XML file with subroutines that take parameters such as filename, node, and function?
For example, for a file named xmlExample:
<node0>
  <node1><name>a</name><number>b</number></node1>
  <node2><name>aa</name><number>bb</number><extra>cc</extra></node2>
  <node3><another>aa</another><sample>bb</sample></node3>
</node0>
string filename = @"C:\Documents and Settings\Administrator\Desktop\xmlExample.xml";
And then addressing the wanted object hierarchically:
Read(xmlExample, node0, node1, name)
Or addressing that object by an id-like unique node:
Read(xmlExample, sample) // there will be just one "sample"
My question is really about non-standard read and write approaches. Do we have to specify the unnecessary parts of the file every time, or can we write the read and write functions once and afterwards just call them with parameters?
I don't know of anything ready-made. However, you can quite easily create something like that yourself. Take a look at the XmlReader class, and especially the XmlReader.ReadToFollowing method.
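A minimal sketch of such a helper built on ReadToFollowing; the XmlHelper name and the null-on-missing behaviour are assumptions of mine, not an existing API:

```csharp
using System.Xml;

static class XmlHelper
{
    // Returns the text content of the first element with the given name,
    // or null if the document contains no such element. This matches the
    // "id-like unique node" style of addressing from the question.
    public static string Read(string filename, string nodeName)
    {
        using (XmlReader reader = XmlReader.Create(filename))
        {
            if (reader.ReadToFollowing(nodeName))
                return reader.ReadElementContentAsString();
            return null;
        }
    }
}
```

For the hierarchical form, one could chain ReadToFollowing (or ReadToDescendant) calls, one per level of the path.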
I've created an interface which looks like this:
interface ICsvReader
{
List<string> ReadFromStream(Stream csvStream);
}
My question is about the return type List<string>. In tutorials I see a lot of examples where the methods are just void. In those cases the interface looks natural:
interface ILogger
{
void LogError(string error);
}
you don't have any specific destination for the logging or method of logging errors. Like I said, that looks natural to me, but what about returning specific types? Isn't that a bad approach? When I'm using an interface I want to create some abstraction over my methods: 'You should do this, but I don't care how.' So do you have any better idea for a file-reader interface or something similar? I would like to read CSV from different sources but always return List<string>. Good or bad approach?
A logger is a kind of writer, so void makes sense; ICsvReader, as the name suggests, is a reader, meaning it is going to read something for you and give it back in return.
Have you ever seen a read method with return type void? I can't remember one!
The only thing I would suggest is to return IEnumerable<string> instead. Always promise less than what you can deliver; that will let you switch to deferred execution if required in the future.
There is nothing wrong here. Since the logger performs a write operation it returns void; that's not your case, since you need to yield something saying "this is what I read for you".
Well, returning List<string> means that you have the whole structure in memory. For CSV files larger than 2 GB this may not be appropriate.
Another choice would be returning IEnumerable<string>; that would let a CSV reader decide whether it wants to read the whole file at once or do incremental loading and parsing. Or you could have two different classes: one that tries to load the whole file at once, and another that works step by step.
Of course, List<T> has methods and properties that IEnumerable<T> doesn't, so you would have to decide whether this added flexibility is worth it. But I've seen a number of server-side plugins that read gigantic files into memory in order to send them to the client, so I recommend at least thinking about this.
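A sketch of the IEnumerable<string> variant with deferred execution; StreamingCsvReader is a hypothetical name, and real CSV parsing (quoting, embedded separators) is omitted here, with each line treated as one record:

```csharp
using System.Collections.Generic;
using System.IO;

interface ICsvReader
{
    IEnumerable<string> ReadFromStream(Stream csvStream);
}

class StreamingCsvReader : ICsvReader
{
    // Lines are yielded one at a time, so the whole file never has to be
    // held in memory; a caller that really wants a List<string> can still
    // call .ToList() on the result.
    public IEnumerable<string> ReadFromStream(Stream csvStream)
    {
        using (var reader = new StreamReader(csvStream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                yield return line;
        }
    }
}
```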
Regarding void vs List return type in interface
I think the approach you are taking is absolutely correct. Returning a List in your case is not incorrect; it is actually what your application needs, and that is why you are declaring the interface. An interface method declaration can be anything that suits your code.
As many answers here suggest, use IEnumerable for optimization purposes.
From Question:
So do you have any better idea for interface for file reader or something?
Just a suggestion: do you really need to create an interface? The definition of your ReadFromStream method looks like it will be the same in each case, so you may end up writing the same code in various classes. A better solution would be to write the method in a base class or an abstract class (with which you will still achieve abstraction).
I'm writing XML with XmlWriter. My code has lots of sections like this:
xml.WriteStartElement("payload");
ThirdPartyLibrary.Serialise(results, xml);
xml.WriteEndElement(); // </payload>
The problem is that the ThirdPartyLibrary.Serialise method is unreliable. It can happen (depending on the variable results) that it doesn't close all the tags it opens. As a consequence, my WriteEndElement line is subverted: it is consumed closing the library's hanging tags rather than writing </payload>.
Thus I'd like to make a checked call to WriteEndElement that checks the element name, and throws an exception unless the cursor is at the expected element.
xml.WriteEndElement("payload");
You can think of this like XmlReader.ReadStartElement(name) which throws unless the cursor is at the expected place in the document.
How can I achieve this?
Edit: A second use case for this extension method would be to make my own code more readable and reliable.
XmlWriter just writes the given XML information to the stream without any validation. If it did any validation while writing the tags, there would be a performance problem when creating big XML files.
Creating an XML file using XmlWriter is at the developer's own risk. If you want that kind of validation, you can use XmlDocument.
If you really want to do this validation with XmlWriter, you have to create the writer over a String or StringBuilder, because if you use a Stream or TextWriter you can't read back the information already written while you are still in the middle of writing. On every update of the XML you would have to read the string and write your own method to validate the written information.
I suggest you use XmlDocument for creating this type of XML.
In the end, I wrote an extension method WriteSubtree that gives this usable API:
using (var resultsXml = xml.WriteSubtree("Results"))
{
ThirdPartyLibrary.Serialise(results, resultsXml);
}
The extension method XmlWriter.WriteSubtree is analogous to .NET's XmlReader.ReadSubtree. It returns a special XmlWriter that guards against funny business; its Dispose method closes any tags left open.
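The answer doesn't show the implementation, so here is a simplified sketch of the scoped-element idea under my own naming. It only guarantees that the matching end tag is written; the full version described above would instead return a wrapping XmlWriter that tracks element depth, so it could also close tags the third-party library leaves open:

```csharp
using System;
using System.Xml;

static class XmlWriterExtensions
{
    // WriteSubtree opens the element and returns an IDisposable scope
    // whose Dispose writes the matching end tag, so the element is closed
    // even if the code inside the using-block throws.
    public static IDisposable WriteSubtree(this XmlWriter writer, string name)
    {
        writer.WriteStartElement(name);
        return new SubtreeScope(writer);
    }

    private sealed class SubtreeScope : IDisposable
    {
        private readonly XmlWriter writer;
        public SubtreeScope(XmlWriter writer) { this.writer = writer; }
        public void Dispose() { writer.WriteEndElement(); }
    }
}
```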
I have a function which takes student details as input and writes their report card into an XML file. When I tried to create unit tests in VS2008 I saw the message "A method that does not return a value cannot be verified." My function does not return a value; it merely writes into a file and returns.
How do I write tests for this function?
[TestMethod()]
public void StoreInformationTest()
{
    StudentSettings target = new StudentSettings(); // TODO: Initialize to an appropriate value
    StudentSettings settings = null; // TODO: Initialize to an appropriate value
    target.StoreInformation(settings);
    Assert.Inconclusive("A method that does not return a value cannot be verified.");
}
With good separation of responsibilities, it would be easy to replace your file with something like a MemoryStream. Write into a memory stream instead of a file; then you can test against its content. But as mentioned by others, a code example might reveal other needs.
UPDATE:
Thanks for the code. It looks like your StudentSettings class is doing too much. Separate the XML-writing functions into their own class, extract an interface from it, and inject this into your class as a constructor argument. Then you can replace it with your own mock during tests.
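For instance, assuming a hypothetical version of the writer refactored to accept a Stream (the class and method names here are illustrative, not from the original code), a test can assert on the bytes written to a MemoryStream:

```csharp
using System.IO;
using System.Xml;

// Hypothetical writer that takes a Stream instead of opening a file
// itself; this is what makes it testable without touching the disk.
class ReportCardWriter
{
    public void StoreInformation(Stream output, string studentName)
    {
        var settings = new XmlWriterSettings { CloseOutput = false };
        using (var xml = XmlWriter.Create(output, settings))
        {
            xml.WriteStartElement("student");
            xml.WriteElementString("name", studentName);
            xml.WriteEndElement();
        }
    }
}
```

The test then reads the stream's buffer back and checks the content, e.g. that it contains `<name>John</name>`.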
First of all, if big uncle Visual Studio tells you that your method cannot be tested, that does not have to be true.
You should return the output to be written to the file as a string, or your method should take a TextWriter as a parameter. In the latter case you may use a mocking framework, as mentioned in the other answer, to give the method under test a fake TextWriter object.
You can use a mocking framework to do this. A good example is Moq. Simply put, you can create fake objects and tell them how to behave. You can also verify whether a method is called and how often it is called.
EDIT:
The quick-start guide shown here has some good examples which will probably point you in the right direction. In your case you could create a mock of the class containing the function which writes your file. Using the Verify function you can check how often the function is called and whether it runs without any exceptions.
The code generator is merely telling you that it can't verify your test based on its return value (void), which makes perfect sense. I think someone else mentioned that this is more of a placeholder. When it comes to actually writing the test, you need to decide what your passing criteria really are. You can go as easy as:
Assert.IsTrue(File.Exists(filePath));
if all you care about is the file existing, or you can dig deeper into it, verify its contents, and so forth. It is really up to you.
I have a function that is very small, but is called so many times that my profiler marks it as time consuming. It is the following one:
private static XmlElement SerializeElement(XmlDocument doc, String nodeName, String nodeValue)
{
XmlElement newElement = doc.CreateElement(nodeName);
newElement.InnerXml = nodeValue;
return newElement;
}
The second line (where it assigns nodeValue) is the one that takes some time.
The thing is, I don't think it can be optimized code-wise; I'm still open to suggestions on that part, though.
However, I remember reading or hearing somewhere that you could tell the compiler to flag this function so that it is loaded into memory when the program starts and runs faster.
Is this just my imagination, or does such a flag exist?
There are ways you can cause it to be jitted early, but it's not the JIT time that's hurting you here.
If you're having performance problems related to XML serialization, you might consider using XmlWriter rather than XmlDocument, which is fairly heavyweight. Also, most automatic serialization systems (including the built-in .NET XML serialization) emit code dynamically to perform the serialization, which can then be cached and re-used. Most of this has to do with avoiding the overhead of reflection, however, rather than the overhead of the actual XML writing/parsing.
I don't think this can be solved using any kind of caching or inlining, and I believe it is your imagination, mainly the part about performance. What you have in mind is pre-JIT-ing your code. This technique removes the wait for the JITer when your function is first called, but only for that first call; it has no performance effect on subsequent calls.
As the documentation states, setting InnerXml parses the assigned string as XML, and parsing an XML string can be an expensive operation, especially if the XML in string form is complex. The documentation even has this line:
InnerXml is not an efficient way to modify the DOM. There may be performance issues when replacing complex nodes. It is more efficient to construct nodes and use methods such as InsertBefore, InsertAfter, AppendChild, and RemoveChild to modify the Xml document.
So, if you are creating a complex XML structure this way, it would be wise to construct it by hand instead.
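A sketch of the hand-construction variant of the original method; note it is only equivalent when nodeValue is plain text rather than markup, since no parsing takes place:

```csharp
using System.Xml;

static class NodeBuilder
{
    // Builds the same element without parsing a string: CreateElement plus
    // an appended text node avoids the XML parse that setting InnerXml
    // performs on every call. Only valid when nodeValue is plain text.
    public static XmlElement SerializeElement(XmlDocument doc, string nodeName, string nodeValue)
    {
        XmlElement newElement = doc.CreateElement(nodeName);
        newElement.AppendChild(doc.CreateTextNode(nodeValue));
        return newElement;
    }
}
```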
As a beginner, I have formulated some ideas, but wanted to ask the community about the best way to implement the following program:
It decodes 8 different types of data file. They are all different, but most are similar (they contain a lot of similar fields). In addition, there are 3 generations of the system which can generate these files. Each is slightly different, but generates the same types of files.
I need to make a visual app which can read in any one of these, display the data in a table (using a DataGridView via a DataTable at the moment) and then plot it on a graph.
There is a bit more to it, but my question is regarding the basic structure.
I would love to learn more about making the best use of object-oriented techniques if they would suit this well.
I am using C# (unless there are better recommendations), largely due to my lack of experience and the short development time available.
I am currently using one class called 'Log' that knows what generation/log type the open file is. It controls reading and exporting to a DataTable. A form can then give it a path, wait for it to process the file, and request the DataTable to display.
Any obvious improvements?
As you have realised, there is a great deal of potential for creating a very elegant OOP application here.
Your basic needs, as far as I can see from the information you have shared, are:
1) A module that recognises the type of file
2) A module that can read the file and load the data into a common structure (is it going to be a common structure?); this consists of the handlers
3) A module that can visualise the data
For the first one, I would recommend one of two patterns:
1) Factory pattern: the file is passed to a common factory and parsed just enough to decide which handler applies.
2) Chain of responsibility: the file is passed to each handler, which knows whether or not it can support the file. If it cannot, it passes the file to the next one. At the end either one handler picks it up, or an error occurs if the last handler cannot process it.
For the second one, I recommend designing a common interface, with each handler implementing common tasks such as load, parse, and so on. If visualisation is different and specific to each handler, then you would have that set of methods there as well.
Without knowing more about the data structure I cannot comment on the visualisation part.
Hope this helps.
UPDATE
This is the factory one, in very rough pseudocode:
Factory f = new Factory();
ILogParser parser = f.GetParser(fileName); // pass the file name so that factory inspects the content and returns appropriate handler
CommonDataStructure data = parser.Parse(fileName); // parse the file and return CommonDataStructure.
Visualiser v = new Visualiser(form1); // perhaps you want to pass a reference to your form
v.Visualise(data); // draw pretty stuff now!
OK, first thing: make one class for each file structure type, as a parser. Use inheritance as needed to share common functionality.
Every file parser should have a method to identify whether it can parse a given file, so you can take a file name and simply ask the parsers which one thinks it can handle the data.
.NET 4.0 and the Managed Extensibility Framework allow dynamic integration of the parsers without a predetermined selection graph.
The rest depends mostly on how similar the data is etc.
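The "ask the parsers" step could be sketched like this; all names here are illustrative assumptions, not from the original post:

```csharp
using System;
using System.Collections.Generic;

// Each parser says whether it recognises a file; the first one that does
// gets to handle it (the chain-of-responsibility idea in its simplest form).
interface ILogParser
{
    bool CanParse(string fileName);
}

class GenerationOneParser : ILogParser
{
    // Hypothetical rule: generation-one files use a ".g1" extension.
    public bool CanParse(string fileName) => fileName.EndsWith(".g1");
}

static class ParserSelector
{
    public static ILogParser Select(IEnumerable<ILogParser> parsers, string fileName)
    {
        foreach (var parser in parsers)
            if (parser.CanParse(fileName))
                return parser;
        throw new NotSupportedException("No parser recognises " + fileName);
    }
}
```

New file formats are then supported by adding a parser class, without touching the selection code.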
Okay, so the basic concept of OOP is thinking of classes etc. as objects. Straight from the offset, object-oriented programming can be a tricky subject to pick up, but the more practice you get, the easier you will find it to implement programs using OOP.
Take a look here: http://msdn.microsoft.com/en-us/beginner/bb308750.aspx
So you could have a Decoder class and interface, something like this:
interface IDecoder
{
    void DecodeTypeA(string param1 /* etc. */);
    void DecodeTypeB(string param1 /* etc. */);
}

class FileDecoder : IDecoder
{
    public void DecodeTypeA(string param1 /* etc. */)
    {
        // Some code here
    }

    public void DecodeTypeB(string param1 /* etc. */)
    {
        // Some code here
    }
}