I have this kind of EDIFACT message:
UNB+IATB:1+NGI+OOS+180918:2003+Export_Dump++TR2+X'
UNH+1+IFLIRR:15:2:1A'
FDR+OM+135+160918'
FDD++INT'
REF'
STX+ACT'
IFD+++C+USD++N'
APD+:::::::ULN:SVO'
DAT+708:160918:0915+707:160918:1055'
STX+FD'
EQP+J+76W::EIFGN+OM'
EQI+++++++:::FGN'
EQD++++++A01'
SSQ+AVIH:5:5::::0:SSR'
SSQ+BIKE:5:5::::0:SSR'
SSQ+BSCT:2:2::::0:SSR+J'
SSQ+BSCT:5:3::::2:SSR+Y'
SSQ+INFT:15:10::::5:SSR'
SSQ+PETC:1:1::::0:SSR+J'
SSQ+PETC:3:3::::0:SSR+Y'
SSQ+POXY:1:1::::0:SSR'
SSQ+SPEQ:5:5::::0:SSR'
SSQ+STCR:0:0::::0:SSR+J'
SSQ+STCR:1:1::::0:SSR+Y'
SSQ+SVAN:1:1::::0:SSR+J'
SSQ+SVAN:3:3::::0:SSR+Y'
SSQ+TVLG:5:5::::0:SSR'
SSQ+TVSM:10:10::::0:SSR'
SSQ+UMNR:5:5::::0:SSR'
SSQ+WCOB:0:0::::0:SSR'
LEG+A01+NXC'
EQI+J:24:S+J:21:A+J:24:O+J:21:E'
The message continues for more than 1 million lines.
I have used the C# XmlSerializer and successfully parsed this message into an XML file, but the structure is not correct.
Here's my code:
switch (keyword)
{
case "UNB":
parts = specificLine.Split(new char[] { '+', ':' }, StringSplitOptions.RemoveEmptyEntries);
serialization = new XmlSerializer(typeof(UNB));
UNB HeaderText = new UNB(parts[1], parts[2], parts[3], parts[4], parts[5], parts[6]);
writer = XmlWriter.Create(TxtWriter, settings);
serialization.Serialize(writer, HeaderText, EmptyNS);
break;
case "UNH":
parts = specificLine.Split(new char[] { '+', ':' }, StringSplitOptions.RemoveEmptyEntries);
serialization = new XmlSerializer(typeof(UNH));
UNH BodyText = new UNH(parts[1],parts[2],parts[3],parts[4],parts[5]);
writer = XmlWriter.Create(TxtWriter, settings);
serialization.Serialize(writer, BodyText, EmptyNS);
break;
case "FDR":
flightDateInformation Gr0 = new flightDateInformation();
parts = specificLine.Split(new char[] { '+'}, StringSplitOptions.RemoveEmptyEntries);
serialization = new XmlSerializer(typeof(flightDateInformation));
flightDateDesignator fdrbody = new flightDateDesignator(parts[1], parts[2], parts[3]);
Gr0.flightDateDesignator = fdrbody;
writer = XmlWriter.Create(TxtWriter, settings);
serialization.Serialize(writer, Gr0, EmptyNS);
break;
}
And this is an example of my structure classes:
[XmlRoot(ElementName = "UNB", IsNullable = false), Serializable]
public class UNB
{
[XmlAttribute]
public string identifier;
[XmlAttribute]
public string version;
[XmlAttribute]
public string sender;
[XmlAttribute]
public string recipient;
[XmlAttribute]
public string dateofpreparation;
[XmlAttribute]
public string timeofpreparation;
public UNB(string identifier, string version,string sender, string recipient, string dateofpreparation, string timeofpreparation)
{
this.identifier = identifier;
this.version = version;
this.sender = sender;
this.recipient = recipient;
this.dateofpreparation = dateofpreparation;
this.timeofpreparation = timeofpreparation;
}
public UNB()
{
}
}
And my output XML file looks like this:
<UNB identifier="IATB" version="1" sender="NGI" recipient="OOS" dateofpreparation="180918" timeofpreparation="2003" /><UNH identifier="1" type="IFLIRR" version="15" release="2" agency="1A" /><flightDateInformation>
<flightDateDesignator airlineCode="OM" flightNumber="135" departureDate="160918" />
</flightDateInformation><flightLevelInfo flightCharacteristics="INT" /><referenceInfomation /><flightFlags statusIndicator="ACT" /><inventoryParametersFD controlType="C" currencyCode="USD" isUnderActiveRevControl="N" /><additionalproductdetails>
<departureLocation>ULN</departureLocation>
<arrivalLocation>SVO</arrivalLocation>
</additionalproductdetails><scheduledTiming>
<qualifier>708</qualifier>
<date>160918</date>
<time>0915</time>
</scheduledTiming><scheduledTiming>
<qualifier>707</qualifier>
<date>160918</date>
<time>1055</time>
</scheduledTiming><dcsInformation statusIndicator="FD" /><aircraftInformation serviceType="J" aircraftType="76W">
<eqtRegistrationNumber>EIFGN</eqtRegistrationNumber>
<aircraftOwner>OM</aircraftOwner>
</aircraftInformation><acvInformation acvCode="FGN" /><saleableConfiguration configurationCode="A01" />
<newSSR quotaCounterName="AVIH">
<maxQuantity>5</maxQuantity>
<availability>5</availability>
<counter>0</counter>
<quotaType>SSR</quotaType>
</newSSR><newSSR quotaCounterName="BIKE">
<maxQuantity>5</maxQuantity>
<availability>5</availability>
<counter>0</counter>
<quotaType>SSR</quotaType>
</newSSR>
<newSSR quotaCounterName="BSCT" cabinCode="J">
<maxQuantity>2</maxQuantity>
<availability>2</availability>
<counter>0</counter>
<quotaType>SSR</quotaType>
</newSSR>
Now my problem: my code works and parses the message into an XML file, but not the way I want. Each node ends up flat, on a single level.
This is my desired structure.
Each node should be nested inside a parent node, and some nodes should expand into child nodes. My output XML does not have any parent nodes.
Can I solve this by improving my code, or should I try a different approach?
If you need more details, please ask and I will provide them.
UPDATE: I have resolved this problem.
This question is very broad. Basically you have to understand the format, then write software to extract the data and convert it to your desired format. Luckily you are not the first one with this problem and there are open-source solutions available:
Is there any good open source EDIFACT parser in Java?
I would want to see a specification of the input format, not just an example, before tackling this task, especially as the quantity of data to be converted is too large to check the correctness of the result by visual inspection.
I think you are on the right lines, however: first do a crude parse of the input that produces some kind of XML representation. Then use XML tools (specifically, XSLT) to transform this crude XML into the target XML that you actually want.
I can't tell from your "actual output" and the diagram of your "desired output" what the detailed transformation rules are, but it's likely to be some kind of grouping transformation to create a hierarchic structure from a flat structure. That's a common task in XSLT and is best tackled by getting hold of an XSLT 2.0 (or 3.0) processor and using the <xsl:for-each-group> instruction. For example, if your task is to put wrapper elements around adjacent elements having the same name, you could do:
<xsl:for-each-group select="*" group-adjacent="name()">
<xsl:choose>
<xsl:when test="name()='SSR'">
<SSR-LIST><xsl:copy-of select="current-group()"/></SSR-LIST>
</xsl:when>
....
<xsl:otherwise>
<xsl:copy-of select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
If you want more specific advice on this transformation, I suggest posting a new question with a concrete (and short!) example of the input and output, expressed as XML documents, with a clear relationship between the two.
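The "crude parse first, then group" idea above can also be sketched directly in C# with LINQ to XML, for readers without an XSLT 2.0 processor at hand. This is only a minimal sketch under stated assumptions: the class and method names (EdifactSketch, ParseSegment, GroupAdjacent), the generic "e" element name and the "-LIST" wrapper suffix are all hypothetical, and EDIFACT release characters and composite (":") splitting are ignored. Unlike splitting with RemoveEmptyEntries, it keeps empty data elements so that positions stay meaningful.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class EdifactSketch
{
    // Crude parse: one XML element per segment, one <e> child per data element.
    // Empty data elements are kept so that element positions are preserved.
    public static XElement ParseSegment(string segment)
    {
        string[] elements = segment.TrimEnd('\'').Split('+');
        return new XElement(elements[0],
            elements.Skip(1).Select(e => new XElement("e", e)));
    }

    // Grouping: wrap each run of adjacent, same-named elements in a parent
    // element called "<name>-LIST" (similar in spirit to group-adjacent in XSLT).
    public static XElement GroupAdjacent(IEnumerable<XElement> flat, string rootName)
    {
        var root = new XElement(rootName);
        XElement current = null;
        foreach (XElement el in flat)
        {
            string wrapperName = el.Name.LocalName + "-LIST";
            if (current == null || current.Name.LocalName != wrapperName)
            {
                current = new XElement(wrapperName);
                root.Add(current);
            }
            current.Add(el);
        }
        return root;
    }
}
```

A real converter would need grouping rules taken from the message specification rather than "same name, adjacent", but the two-step shape stays the same.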
We have a long living app that uses some feed that used to be xml, but was converted to json...
Of course we were "too lazy" to change the parser from reading XmlDocument to reading JObject or similar, so we used "DeserializeXmlNode" to convert from JSON text to XmlDocument.
All was fine for a long long time... until we updated from Newtonsoft.Json versions 4.5 and 6.0 to version 12.0.x and suddenly we started to have some problems...
let's say json looks like this:
{"version":"2.0","result":[{"mainobid":"123","typeId":"2","subobjects":{"1":{"data":"data"},"2":{"data":"data"}}}]}
what we used to get was xml having
<1><data>data</data></1><2><data>data</data></2>
tags
now... instead of the <1> tag we get something like <_x0031_>
instead of 10 there's _x0031_0
instead of 45 there's _x0034_5
and instead of 100 there's _x0031_00
Can I turn that off somehow? Or am I now forced to change the parsing to decode that sick x003.... thing?
INB4 1: I realize that having 1: and <1> is not the thing that anyone sane wishes to have, but I can't change that, it's an external feed
INB4 2: I know we should change parsing from XML to JSON, but as above - some laziness and re-using old code that was working 100% well.
EDIT:
private static void TestOldNewton()
{
var jsonstr = "{\"version\":\"2.0\",\"result\":[{\"mainobid\":\"123\",\"typeId\":\"2\",\"subobjects\":{\"1\":{\"data\":\"data\"},\"2\":{\"data\":\"data\"}}}]}";
var doc = Newtonsoft.Json.JsonConvert.DeserializeXmlNode(jsonstr, "data");
Console.WriteLine(doc.OuterXml);
Console.ReadKey();
}
using packages.config like:
<?xml version="1.0" encoding="utf-8"?>
<packages>
<package id="Newtonsoft.Json" version="6.0.1" targetFramework="net48" />
</packages>
and receiving output:
<data><version>2.0</version><result><mainobid>123</mainobid><typeId>2</typeId><subobjects><1><data>data</data></1><2><data>data</data></2></subobjects></result></data>
freshly compiled and run on new, testing project.
The cause of the change is the following checkin: Fixed converting JSON to XML with invalid XML name characters to Json.NET 8.0.1. This checkin added (among other changes) calls to XmlConvert.EncodeName() inside XmlNodeConverter.CreateElement():
private IXmlElement CreateElement(string elementName, IXmlDocument document, string? elementPrefix, XmlNamespaceManager manager)
{
string encodeName = EncodeSpecialCharacters ? XmlConvert.EncodeLocalName(elementName) : XmlConvert.EncodeName(elementName);
string ns = StringUtils.IsNullOrEmpty(elementPrefix) ? manager.DefaultNamespace : manager.LookupNamespace(elementPrefix);
IXmlElement element = (!StringUtils.IsNullOrEmpty(ns)) ? document.CreateElement(encodeName, ns) : document.CreateElement(encodeName);
return element;
}
This was done to [add] support for converting JSON to XML with invalid XML name characters. This applies here because element names beginning with numerals such as <1> are not well-formed XML element names, as explained in XML tagname starting with number is not working. And in fact the XML you were previously generating was not, strictly speaking, well-formed XML.
As you can see from the code excerpt above, there doesn't seem to be a way to disable this change and create element names without encoding them.
As a workaround, since you want to create elements with numeric names like <1> anyway, you could subclass XmlTextWriter and decode the names as they are written by calling XmlConvert.DecodeName():
This method does the reverse of the EncodeName(String) and EncodeLocalName(String) methods.
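To make the encoding concrete, here is a small sketch of the round trip through XmlConvert.EncodeName() and XmlConvert.DecodeName(). The class and method names (NameEncodingDemo, RoundTrip) are made up for illustration; only the two XmlConvert calls come from the framework.

```csharp
using System;
using System.Xml;

public static class NameEncodingDemo
{
    // Encodes a JSON property name the way the newer Json.NET versions do,
    // then decodes it again, and reports all three forms.
    public static string RoundTrip(string name)
    {
        string encoded = XmlConvert.EncodeName(name);    // "10" becomes "_x0031_0"
        string decoded = XmlConvert.DecodeName(encoded); // back to "10"
        return name + " -> " + encoded + " -> " + decoded;
    }
}
```

Only the leading digit is escaped, because digits are valid inside an XML name but not as its first character; names such as "data" pass through unchanged.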
First define the following class:
public class NameEditingXmlTextWriter : XmlTextWriter
{
readonly Func<string, string, string> nameEditor;
public NameEditingXmlTextWriter(TextWriter writer, Func<string, string, string> nameEditor)
: base(writer)
{
this.nameEditor = nameEditor;
}
public override void WriteStartElement(string prefix, string localName, string ns)
{
var newLocalName = nameEditor(localName, ns);
base.WriteStartElement(prefix, newLocalName, ns);
}
}
Then use it as follows:
var doc = Newtonsoft.Json.JsonConvert.DeserializeXmlNode(jsonstr, "root");
var sb = new StringBuilder();
using (var textWriter = new StringWriter(sb))
using (var writer = new NameEditingXmlTextWriter(textWriter, (n, ns) => XmlConvert.DecodeName(n)))
{
doc.WriteTo(writer);
}
var outerXml = sb.ToString();
Notes:
You must subclass the deprecated XmlTextWriter instead of its replacement XmlWriter because XmlWriter will throw an exception on an attempt to write a malformed XML element name such as <1>.
As an alternative, since Json.NET is currently licensed under the MIT License, you could fork your own version of XmlNodeConverter and remove the calls to XmlConvert.EncodeName() from CreateElement(). However, this solution seems less desirable as it creates a maintenance requirement to keep your forked version up-to-date with Newtonsoft's version.
Demo fiddle here.
I am very new to programming and not sure where I am going wrong. I have read the other threads with similar error, but I think my problem is even basic.
I get a generated string which contains XML, but it doesn't start with the XML. When I try to parse that string I get the above error.
Is there a way of getting rid of the leading text and keeping only the text from where the XML starts?
My string:
{"Id":"6a76f781-f592-4320-a116-6ab289505423","Name":"Test - A","AttachmentRequired":false,"FormXml":"
<?xml version=\"1.0\" encoding=\"utf-16\"?>
The easiest way would be to use a JSON parser like Newtonsoft:
public class Data
{
public string Id;
public string Name;
public bool AttachmentRequired;
public string FormXml;
}
var o = JsonConvert.DeserializeObject<Data>(json);
var xml = o.FormXml;
Here is the Nuget package to Newtonsoft which I demonstrated above:
https://www.nuget.org/packages/Newtonsoft.Json/
If you absolutely can't use an external library to transform it into a CLR object, here is how you would do it through string manipulation:
var str = @"{ ""Id"":""6a76f781-f592-4320-a116-6ab289505423"",""Name"":""Test - A"",""AttachmentRequired"":false,""FormXml"":""<?xml version=\""1.0\"" encoding=\""utf-16\""?>""}";
var parts = str.Split(':');
var last = parts[parts.Length -1];
var xml = last.Replace("}","").Replace("\"<","<").Replace(">\"",">");
Your string appears to be in JSON format, and the XML part of it is the value of the "FormXml" field.
The easy way is to deserialize the string into an object using Newtonsoft Json, and then parse the value of FormXml from that object.
JsonConvert.DeserializeObject<YourClass>(yourstring);
I'm doing an XML reading process in my project. Where I have to read the contents of an XML file. I have achieved it.
Just out of curiosity, I also tried the same thing by keeping the XML content inside a string and then reading only the values inside the element tags. I have achieved this too. Below is my code.
string xml = @"<Login-Form>
<User-Authentication>
<username>Vikneshwar</username>
<password>xxx</password>
</User-Authentication>
<User-Info>
<firstname>Vikneshwar</firstname>
<lastname>S</lastname>
<email>xxx@xxx.com</email>
</User-Info>
</Login-Form>";
XDocument document = XDocument.Parse(xml);
var block = from file in document.Descendants("User-Authentication")
select new
{
Username = file.Element("username").Value,
Password = file.Element("password").Value,
};
foreach (var file in block)
{
Console.WriteLine(file.Username);
Console.WriteLine(file.Password);
}
Similarly, I obtained my other set of elements (firstname, lastname, and email). Now my curiosity draws me again. Now I'm thinking of doing the same using the string functions?
The same string used in the above code is to be taken. I'm trying not to use any XMl related classes, that is, XDocument, XmlReader, etc. The same output should be achieved using only string functions. I'm not able to do that. Is it possible?
Don't do it. XML is more complex than can appear the case, with complex rules surrounding nesting, character-escaping, named-entities, namespaces, ordering (attributes vs elements), comments, unparsed character data, and whitespace. For example, just add
<!--
<username>evil</username>
-->
Or
<parent xmlns="this:is-not/the/data/you/expected">
<username>evil</username>
</parent>
Or maybe the same in a CDATA section - and see how well basic string-based approaches work. Hint: you'll get a different answer to what you get via a DOM.
Using a dedicated tool designed for reading XML is the correct approach. At the minimum, use XmlReader - but frankly, a DOM (such as your existing code) is much more convenient. Alternatively, use a serializer such as XmlSerializer to populate an object model, and query that.
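The comment example above can be turned into a small comparison. This is a sketch only: the class and method names (CommentPitfall, ViaIndexOf, ViaDom) are hypothetical, and the XML is a shortened variant of the question's document with the "evil" comment added.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class CommentPitfall
{
    // The real <username> is preceded by a comment that merely looks like one.
    public const string Xml =
        "<User-Authentication>" +
        "<!-- <username>evil</username> -->" +
        "<username>Vikneshwar</username>" +
        "</User-Authentication>";

    // Naive string approach: finds the first textual match, which is inside the comment.
    public static string ViaIndexOf()
    {
        const string open = "<username>";
        const string close = "</username>";
        int start = Xml.IndexOf(open, StringComparison.Ordinal) + open.Length;
        int end = Xml.IndexOf(close, start, StringComparison.Ordinal);
        return Xml.Substring(start, end - start);
    }

    // DOM approach: comments are not elements, so only the real element is found.
    public static string ViaDom()
    {
        return XDocument.Parse(Xml).Descendants("username").First().Value;
    }
}
```

Here ViaIndexOf() returns "evil" while ViaDom() returns "Vikneshwar", which is exactly the kind of divergence the answer warns about.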
Trying to properly parse xml and xml-like data does not end well.... RegEx match open tags except XHTML self-contained tags
You could use methods like IndexOf, Equals, Substring, etc. provided by the String class to meet your needs.
Using Regex is an option worth considering too.
But it's advisable to use the XmlDocument class for this purpose.
It can be done without regular expressions, like this:
string[] elementNames = new string[]{ "<username>", "<password>"};
foreach (string elementName in elementNames)
{
int startingIndex = xml.IndexOf(elementName);
string value = xml.Substring(startingIndex + elementName.Length,
xml.IndexOf(elementName.Insert(1, "/"))
- (startingIndex + elementName.Length));
Console.WriteLine(value);
}
With a regular expression:
string[] elementNames2 = new string[]{ "<username>", "<password>"};
foreach (string elementName in elementNames2)
{
string value = Regex.Match(xml, String.Concat(elementName, "(.*)",
elementName.Insert(1, "/"))).Groups[1].Value;
Console.WriteLine(value);
}
Of course, the only recommended thing is to use the XML parsing classes.
Build an extension method that will get the text between tags like this:
public static class StringExtension
{
public static string Between(this string content, string start, string end)
{
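// Position just past the start marker (assumes the marker is present).
int startIndex = content.IndexOf(start) + start.Length;
// Search for the end marker only after the start marker, not from the beginning.
int endIndex = content.IndexOf(end, startIndex);
string result = content.Substring(startIndex, endIndex - startIndex);
return result;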
}
}
I need to send email notifications to users and I need to allow the admin to provide a template for the message body (and possibly headers, too).
I'd like something like string.Format that allows me to give named replacement strings, so the template can look like this:
Dear {User},
Your job finished at {FinishTime} and your file is available for download at {FileURL}.
Regards,
--
{Signature}
What's the simplest way for me to do that?
Here is the version for those of you who can use a new version of C#:
// add $ at start to mark string as template
var template = $"Your job finished at {FinishTime} and your file is available for download at {FileURL}.";
In a line - this is now a fully supported language feature (string interpolation).
You can use the "string.Format" method:
var user = GetUser();
var finishTime = GetFinishTime();
var fileUrl = GetFileUrl();
var signature = GetSignature();
string msg =
@"Dear {0},
Your job finished at {1} and your file is available for download at {2}.
Regards,
--
{3}";
msg = string.Format(msg, user, finishTime, fileUrl, signature);
It allows you to change the content in the future and is friendly for localization.
Use a templating engine. StringTemplate is one of those, and there are many.
Example:
using Antlr.StringTemplate;
using Antlr.StringTemplate.Language;
StringTemplate hello = new StringTemplate("Hello, $name$", typeof(DefaultTemplateLexer));
hello.SetAttribute("name", "World");
Console.Out.WriteLine(hello.ToString());
I wrote a pretty simple library, SmartFormat which meets all your requirements. It is focused on composing "natural language" text, and is great for generating data from lists, or applying conditional logic.
The syntax is extremely similar to String.Format, and is very simple and easy to learn and use. Here's an example of the syntax from the documentation:
Smart.Format("{Name}'s friends: {Friends:{Name}|, |, and}", user)
// Result: "Scott's friends: Michael, Jim, Pam, and Dwight"
The library has great error-handling options (ignore errors, output errors, throw errors) and is open source and easily extensible, so you can also enhance it with additional features too.
Building on Benjamin Gruenbaum's answer, in C# version 6 you can add a @ with the $ and pretty much use your code as it is, e.g.:
var text = $@"Dear {User},
Your job finished at {FinishTime} and your file is available for download at {FileURL}.
Regards,
--
{Signature}
";
The $ is for string interpolation: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated
The @ is the verbatim identifier: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/verbatim
...and you can use these in conjunction.
:o)
A very simple regex-based solution. Supports \n-style single character escape sequences and {Name}-style named variables.
Source
class Template
{
/// <summary>Map of replacements for characters prefixed with a backward slash</summary>
private static readonly Dictionary<char, string> EscapeChars
= new Dictionary<char, string>
{
['r'] = "\r",
['n'] = "\n",
['\\'] = "\\",
['{'] = "{",
};
/// <summary>Pre-compiled regular expression used during the rendering process</summary>
private static readonly Regex RenderExpr = new Regex(@"\\.|{([a-z0-9_.\-]+)}",
RegexOptions.IgnoreCase | RegexOptions.Compiled);
/// <summary>Template string associated with the instance</summary>
public string TemplateString { get; }
/// <summary>Create a new instance with the specified template string</summary>
/// <param name="TemplateString">Template string associated with the instance</param>
public Template(string TemplateString)
{
if (TemplateString == null) {
throw new ArgumentNullException(nameof(TemplateString));
}
this.TemplateString = TemplateString;
}
/// <summary>Render the template using the supplied variable values</summary>
/// <param name="Variables">Variables that can be substituted in the template string</param>
/// <returns>The rendered template string</returns>
public string Render(Dictionary<string, object> Variables)
{
return Render(this.TemplateString, Variables);
}
/// <summary>Render the supplied template string using the supplied variable values</summary>
/// <param name="TemplateString">The template string to render</param>
/// <param name="Variables">Variables that can be substituted in the template string</param>
/// <returns>The rendered template string</returns>
public static string Render(string TemplateString, Dictionary<string, object> Variables)
{
if (TemplateString == null) {
throw new ArgumentNullException(nameof(TemplateString));
}
return RenderExpr.Replace(TemplateString, Match => {
switch (Match.Value[0]) {
case '\\':
if (EscapeChars.ContainsKey(Match.Value[1])) {
return EscapeChars[Match.Value[1]];
}
break;
case '{':
if (Variables.ContainsKey(Match.Groups[1].Value)) {
return Variables[Match.Groups[1].Value].ToString();
}
break;
}
return string.Empty;
});
}
}
Usage
var tplStr1 = @"Hello {Name},\nNice to meet you!";
var tplStr2 = @"This {Type} \{contains} \\ some things \\n that shouldn't be rendered";
var variableValues = new Dictionary<string, object>
{
["Name"] = "Bob",
["Type"] = "string",
};
Console.Write(Template.Render(tplStr1, variableValues));
// Hello Bob,
// Nice to meet you!
var template = new Template(tplStr2);
Console.Write(template.Render(variableValues));
// This string {contains} \ some things \n that shouldn't be rendered
Notes
I've only defined \n, \r, \\ and \{ escape sequences and hard-coded them. You could easily add more or make them definable by the consumer.
I've made variable names case-insensitive, as things like this are often presented to end-users/non-programmers and I don't personally think that case-sensitivity make sense in that use-case - it's just one more thing they can get wrong and phone you up to complain about (plus in general if you think you need case sensitive symbol names what you really need are better symbol names). To make them case-sensitive, simply remove the RegexOptions.IgnoreCase flag.
I strip invalid variable names and escape sequences from the result string. To leave them intact, return Match.Value instead of the empty string at the end of the Regex.Replace callback. You could also throw an exception.
I've used {var} syntax, but this may interfere with the native interpolated string syntax. If you want to define templates in string literals in your code, it might be advisable to change the variable delimiters to e.g. %var% (regex \\.|%([a-z0-9_.\-]+)%) or some other syntax of your choosing which is more appropriate to the use case.
You could use string.Replace(...), eventually in a for-each through all the keywords. If there are only a few keywords you can have them on a line like this:
string myString = template.Replace("FirstName", "John").Replace("LastName", "Smith").Replace("FinishTime", DateTime.Now.ToShortDateString());
Or you could use Regex.Replace(...), if you need something a bit more powerful and with more options.
Read this article on codeproject to view which string replacement option is fastest for you.
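The "for-each through all the keywords" variant mentioned above could look roughly like the sketch below. The class and method names (TemplateFiller, Fill) are hypothetical, and it assumes the {Name} token syntax from the question rather than bare keywords, so that ordinary words in the message body are not replaced by accident.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TemplateFiller
{
    // Replaces every {Key} token in the template with the matching value.
    // Tokens without an entry in the dictionary are left untouched.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        var sb = new StringBuilder(template);
        foreach (KeyValuePair<string, string> pair in values)
        {
            sb.Replace("{" + pair.Key + "}", pair.Value);
        }
        return sb.ToString();
    }
}
```

One caveat of this simple loop: if a replacement value itself contains something that looks like a token, a later iteration may replace it again, so a regex-based single pass is safer for untrusted values.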
In case someone is searching for an alternative -- an actual .NET one:
https://github.com/crozone/FormatWith | https://www.nuget.org/packages/FormatWith
A nice simple extendable solution. Thank you crozone!
So using the string extension provided in FormatWith here are two examples:
static string emailTemplate = @"
Dear {User},
Your job finished at {FinishTime} and your file is available for download at {FileURL}.
--
{Signature}
";
//////////////////////////////////
/// 1. Use a dictionary that has the tokens as keys with values for the replacement
//////////////////////////////////
public void TestUsingDictionary()
{
var emailDictionary = new Dictionary<string, object>()
{
{ "User", "Simon" },
{ "FinishTime", DateTime.Now },
{ "FileURL", new Uri("http://example.com/dictionary") }, // key must match the {FileURL} token in the template
{ "Signature", $"Sincerely,{Environment.NewLine}Admin" }
};
var emailBody = emailTemplate.FormatWith(emailDictionary);
System.Console.WriteLine(emailBody);
}
//////////////////////////////////
/// 2. Use a poco with properties that match the replacement tokens
//////////////////////////////////
public class MessageValues
{
public string User { get; set; } = "Simon";
public DateTime FinishTime { get; set; } = DateTime.Now;
public Uri FileURL { get; set; } = new Uri("http://example.com");
public string Signature { get; set; } = $"Sincerely,{Environment.NewLine}Admin";
}
public void TestUsingPoco()
{
var emailBody = emailTemplate.FormatWith(new MessageValues());
System.Console.WriteLine(emailBody);
}
It allows formatting the replacement inline as well. For example, try changing {FinishTime} to {FinishTime:HH:mm:ss} in emailTemplate.
Actually, you can use XSLT.
You create a simple XML template:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:template match="EMAIL">
<p>
Dear <xsl:value-of select="USERNAME" />,
Your job finished at <xsl:value-of select="FINISH_TIME" /> and your file is available for download at <xsl:value-of select="FILE_URL" />.
Regards,
--
<xsl:value-of select="SIGNATURE" />
</p>
</xsl:template>
</xsl:stylesheet>
Then create an XmlDocument to perform the transformation against:
XmlDocument xmlDoc = new XmlDocument();
XmlNode xmlNode = xmlDoc .CreateNode(XmlNodeType.Element, "EMAIL", null);
XmlElement xmlElement= xmlDoc.CreateElement("USERNAME");
xmlElement.InnerXml = username;
xmlNode .AppendChild(xmlElement); ///repeat the same thing for all the required fields
xmlDoc.AppendChild(xmlNode);
After that, apply the transformation:
XPathNavigator xPathNavigator = xmlDoc.DocumentElement.CreateNavigator();
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
XmlTextWriter xmlWriter = new XmlTextWriter(sw);
your_xslt_transformation.Transform(xPathNavigator, null, xmlWriter);
return sb.ToString();
Implementing your own custom formatter might be a good idea.
Here's how you do it. First, create a type that defines the stuff you want to inject into your message. Note: I'm only going to illustrate this with the User part of your template...
class JobDetails
{
public string User
{
get;
set;
}
}
Next, implement a simple custom formatter...
class ExampleFormatter : IFormatProvider, ICustomFormatter
{
public object GetFormat(Type formatType)
{
return this;
}
public string Format(string format, object arg, IFormatProvider formatProvider)
{
// make this more robust
JobDetails job = (JobDetails)arg;
switch (format)
{
case "User":
{
return job.User;
}
default:
{
// this should be replaced with logic to cover the other formats you need
return String.Empty;
}
}
}
}
Finally, use it like this...
string template = "Dear {0:User}. Your job finished...";
JobDetails job = new JobDetails()
{
User = "Martin Peck"
};
string message = string.Format(new ExampleFormatter(), template, job);
... which will generate the text "Dear Martin Peck. Your job finished...".
If you need something very powerful (but really not the simplest way) you can host ASP.NET and use it as your templating engine.
You'll have all the power of ASP.NET to format the body of your message.
If you are coding in VB.NET you can use XML literals. If you are coding in C# you can use SharpDevelop to have files in VB.NET in the same project as C# code.
I have an XML reader on this XML string:
<?xml version="1.0" encoding="UTF-8" ?>
<story id="1224488641nL21535800" date="20 Oct 2008" time="07:44">
<title>PRESS DIGEST - PORTUGAL - Oct 20</title>
<text>
<p> LISBON, Oct 20 (Reuters) - Following are some of the main
stories in Portuguese newspapers on Monday. Reuters has not
verified these stories and does not vouch for their accuracy. </p>
<p>More HTML stuff here</p>
</text>
</story>
I created an XSD and a corresponding class for deserialization.
[System.Xml.Serialization.XmlRootAttribute(Namespace="", IsNullable=false)]
public class story {
[System.Xml.Serialization.XmlAttributeAttribute()]
public string id;
[System.Xml.Serialization.XmlAttributeAttribute()]
public string date;
[System.Xml.Serialization.XmlAttributeAttribute()]
public string time;
public string title;
public string text;
}
I then create an instance of the class using the Deserialize method of XmlSerializer.
XmlSerializer ser = new XmlSerializer(typeof(story));
return (story)ser.Deserialize(xr);
Now, the text member of story is always null. How do I change my story class so that the XML is parsed as expected?
EDIT:
Using an XmlText does not work and I have no control over the XML I'm parsing.
I found a very unsatisfactory solution.
Change the class like this (ugh!)
// ...
[XmlElement("HACK - this should never match anything")]
public string text;
// ...
And change the calling code like this (yuck!)
XmlSerializer ser = new XmlSerializer(typeof(story));
string text = string.Empty;
ser.UnknownElement += delegate(object sender, XmlElementEventArgs e) {
if (e.Element.Name != "text")
throw new XmlException(
string.Format(CultureInfo.InvariantCulture,
"Unknown element '{0}' cannot be deserialized.",
e.Element.Name));
text += e.Element.InnerXml;
};
story result = (story)ser.Deserialize(xr);
result.text = text;
return result;
This is a really bad way of doing it because it breaks encapsulation. Is there a better way of doing it?
The suggestion that I was going to make if the text tag only ever contained p tags was the following, it may be useful in the short term.
Instead of story having the text field as a string, you could have it as an array of strings. You could then use the right XmlArray attributes (can't remember the exact names, something like XmlArrayItemAttribute), with the right parameters to make it look like:
<text>
<p>blah</p>
<p>blib</p>
</text>
Which is a step closer, but not completely what you need.
Another option is to make a class like:
public class Text //Obviously a bad name for a class...
{
public string[] p;
public string[] pre;
}
And again use the XmlArray attributes to get it to look right, not sure if they are as configurable as that because I've only used them for simple types before.
Edit:
Using:
[System.Xml.Serialization.XmlRootAttribute(Namespace = "", IsNullable = false)]
public class story
{
[System.Xml.Serialization.XmlAttributeAttribute()]
public string id;
[System.Xml.Serialization.XmlAttributeAttribute()]
public string date;
[System.Xml.Serialization.XmlAttributeAttribute()]
public string time;
public string title;
[XmlArrayItem("p")]
public string[] text;
}
Works well with the supplied XML, but having the class seems a little more complicated. It ends up as something similar to:
<text>
<p>
<p>qwertyuiop</p>
<p>asdfghjkl</p>
</p>
<pre>
<pre>stuff</pre>
<pre>nonsense</pre>
</pre>
</text>
which is obviously not what is desired.
You could implement IXmlSerializable for your class and handle the inner elements there, this means that you keep the code for deserializing your data inside the target class (thus avoiding your problem with encapsulation). It's a simple enough data type that the code should be trivial to write.
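A minimal sketch of that IXmlSerializable idea follows. The class name Story and its PascalCase members are hypothetical stand-ins for the question's story class, and the sketch assumes the document shape shown in the question (three attributes, a title element and a text element with mixed HTML content).

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

[XmlRoot("story")]
public class Story : IXmlSerializable
{
    public string Id, Date, Time, Title, Text;

    public XmlSchema GetSchema() { return null; }

    public void ReadXml(XmlReader reader)
    {
        reader.MoveToContent();
        Id = reader.GetAttribute("id");
        Date = reader.GetAttribute("date");
        Time = reader.GetAttribute("time");
        if (reader.IsEmptyElement) { reader.Read(); return; }

        reader.ReadStartElement(); // step into <story>
        while (reader.MoveToContent() == XmlNodeType.Element)
        {
            if (reader.LocalName == "title")
                Title = reader.ReadElementContentAsString();
            else if (reader.LocalName == "text")
                Text = reader.ReadInnerXml(); // keeps the <p> markup as a string
            else
                reader.Skip(); // ignore anything unexpected
        }
        reader.ReadEndElement(); // consume </story>
    }

    public void WriteXml(XmlWriter writer)
    {
        writer.WriteAttributeString("id", Id);
        writer.WriteAttributeString("date", Date);
        writer.WriteAttributeString("time", Time);
        writer.WriteElementString("title", Title);
        writer.WriteStartElement("text");
        writer.WriteRaw(Text ?? string.Empty); // written back as markup, not escaped
        writer.WriteEndElement();
    }
}
```

The calling code stays the plain new XmlSerializer(typeof(Story)).Deserialize(reader), so the special handling of the text element lives inside the class rather than in an event handler.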
Looks to me that the XML is incorrect.
Since you use HTML tags within the text tag the HTML tags are interpreted as XML.
You should use CDATA to correctly interpret the data or escape < and >.
Since you do not have control over the XML you could use StreamReader instead.
XmlReader interprets the HTML tags as XML which is not what you want.
XmlSerializer will however strip the HTML tags within the text tag.
Perhaps using the XmlAnyElement attribute instead of handling the UnknownElement event may be more elegant.
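A rough sketch of the XmlAnyElement variant is shown below. The class name StoryWithAny and the members unknown and TextInnerXml are hypothetical; the sketch assumes that the text element is the only child without a matching member, so it ends up in the catch-all array.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot("story")]
public class StoryWithAny
{
    [XmlAttribute] public string id;
    [XmlAttribute] public string date;
    [XmlAttribute] public string time;
    public string title;

    // Receives every child element that has no matching member, including <text>.
    [XmlAnyElement]
    public XmlElement[] unknown;

    // Convenience accessor: the raw markup inside <text>, or null if it is absent.
    [XmlIgnore]
    public string TextInnerXml
    {
        get
        {
            XmlElement el = unknown == null
                ? null
                : unknown.FirstOrDefault(e => e.LocalName == "text");
            return el == null ? null : el.InnerXml;
        }
    }
}
```

Compared with the UnknownElement event, the captured markup stays on the deserialized object itself, so the caller does not need to wire up a handler.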
Have you tried xsd.exe? It allows you to create xsd's from xml doc's and then generate classes from the xsd that should be ripe for xml deserialization.
I encountered this same issue after using XSD.exe to generate an XSD from XML and then classes from the XSD. I added an [XmlText] tag before the class of the object in the generated class file (called P in my case because of the <p> tag it was inferring as an XML node) and it worked instantly, pulling in the complete HTML content that was inside the parent node and putting it in that P object, which I then renamed to something more useful.