SHA1 hash translated to invalid xml - c#

I have a dictionary, which I turn into XML and then hash with SHA1.
string xmlMessageCode = inputDictionary.ToXML(); //Extension method.
UnicodeEncoding UE = new UnicodeEncoding();
SHA1Managed hasher = new SHA1Managed();
byte[] hashString = Encoding.UTF8.GetBytes(xmlMessageCode.ToCharArray());
byte[] hashCode = hasher.ComputeHash(hashString);
string computedHashString = UTF8Encoding.UTF8.GetString(hashCode);
return computedHashString;
After that I put the value in an object property and then serialize a collection of these objects to XML:
XmlSerializer ser = new XmlSerializer(typeof(T));
XmlWriterSettings settings = new XmlWriterSettings()
{
Indent = false,
OmitXmlDeclaration = false,
Encoding = Encoding.UTF8
};
using(StringWriter sr = new StringWriter())
{
using(XmlWriter xmlr = XmlWriter.Create(sr, settings))
{
ser.Serialize(xmlr, newList); // write through the XmlWriter so the settings above apply
}
return sr.ToString();
}
This produces XML, but when I try to validate the resulting XML, I get an error inside the property which was created from the hashed string.
What would be the best way to resolve this?
Should I strip the invalid characters, or is there a more elegant solution?

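XML is a text-based representation, so you cannot embed binary data directly in it. A SHA1 hash is 20 arbitrary bytes; decoding them as if they were UTF-8 text produces control characters and invalid sequences that are not legal in XML.
You therefore have to convert the binary data to text first. Base64 encoding is the usual choice for that.
Hence, instead of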
string computedHashString = UTF8Encoding.UTF8.GetString(hashCode);
you should use
string computedHashString = System.Convert.ToBase64String(hashCode);
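As a minimal sketch of the suggestion above, the following hashes a string and returns the digest as text that is safe to place in an XML element. The class and method names (HashText, Sha1Base64, Sha1Hex) are made up for illustration; the hex variant is an alternative text encoding that the answer does not mention.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashText
{
    // Hashes the UTF-8 bytes of the input and returns the digest as Base64 text.
    public static string Sha1Base64(string input)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(input));
            return Convert.ToBase64String(hash);
        }
    }

    // Same digest, written as lowercase hexadecimal instead of Base64.
    public static string Sha1Hex(string input)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(input));
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Sha1Base64("<root />"));
        Console.WriteLine(Sha1Hex("<root />"));
    }
}
```

Both outputs contain only ASCII letters, digits and a few punctuation characters, so the serializer can write them without escaping problems. Whoever reads the XML back needs to know which of the two encodings was used.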

Related

How to calculate an IRmark of an xml document using c#

I am fairly new to coding in C#. I would like to calculate the IRmark of an XML file saved on my C drive.
The IRmark calculation is based on HMRC's specification at https://www.gov.uk/government/publications/hmrc-irmark-generic-irmark-specification
I found some code online that I could use for this, but it takes a byte array and I do not know how to make it read the XML file from the C drive instead. I would appreciate your help. Thank you.
Here is the code
public static string GetIRMark(byte[] Xml)
{
string vbLf = "\n";
string vbCrLf = "\r\n";
// Convert Byte array to string
string text = Encoding.UTF8.GetString(Xml);
XmlDocument doc = new XmlDocument();
doc.PreserveWhitespace = true;
doc.LoadXml(text);
XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("env", doc.DocumentElement.NamespaceURI);
XmlNode Body = doc.SelectSingleNode("//env:Body", ns);
ns.AddNamespace("tax", Body.FirstChild.NextSibling.NamespaceURI);
// Create an XML document of just the body section
XmlDocument xmlBody = new XmlDocument();
xmlBody.PreserveWhitespace = true;
xmlBody.LoadXml(Body.OuterXml);
// Remove any existing IRMark
XmlNode nodeIr = xmlBody.SelectSingleNode("//tax:IRmark", ns);
if (nodeIr != null)
{
nodeIr.ParentNode.RemoveChild(nodeIr);
}
// Normalise the document using C14N (Canonicalisation)
XmlDsigC14NTransform c14n = new XmlDsigC14NTransform();
c14n.LoadInput(xmlBody);
using (Stream S = (Stream)c14n.GetOutput())
{
byte[] Buffer = new byte[S.Length];
// Convert to string and normalise line endings
S.Read(Buffer, 0, (int)S.Length);
text = Encoding.UTF8.GetString(Buffer);
text = text.Replace("&#xD;", "");
text = text.Replace(vbCrLf, vbLf);
text = text.Replace(vbCrLf, vbLf);
// Convert the final document back into a byte array
byte[] b = Encoding.UTF8.GetBytes(text);
// Create the SHA-1 hash from the final document
SHA1 SHA = SHA1.Create();
byte[] hash = SHA.ComputeHash(b);
return Convert.ToBase64String(hash);
}
}
I have tried this code
string text = Encoding.UTF8.GetString(@"c:\myfile.xml");
but I get an error message
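Encoding.UTF8.GetString expects a byte array, not a file path, which is the likely reason for the error. One possible approach is to read the file into bytes with File.ReadAllBytes and pass the result to GetIRMark. The sketch below is self-contained, so it writes a small temporary file first; the file name and the ReadXmlBytes class are hypothetical, and the GetIRMark call is shown only as a comment because it lives in the code above.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadXmlBytes
{
    // Reads the whole file as raw bytes, which is what GetIRMark(byte[]) expects.
    public static byte[] Load(string path)
    {
        return File.ReadAllBytes(path);
    }

    public static void Main()
    {
        // Hypothetical sample file; in the question this would be @"c:\myfile.xml".
        string path = Path.Combine(Path.GetTempPath(), "irmark-sample.xml");
        File.WriteAllText(path, "<root>sample</root>", new UTF8Encoding(false));

        byte[] xml = Load(path);
        // string irMark = GetIRMark(xml);  // assumed to be the method from the question

        Console.WriteLine(Encoding.UTF8.GetString(xml));
        File.Delete(path);
    }
}
```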

C# Hash complete xml

We are trying to hash an XML file. I already have it working so that it hashes the contents of the XML,
for which I am using the following code:
XmlDocument doc = new XmlDocument();
doc.PreserveWhitespace = true;
doc.Load(txtFile.Text);
XmlNodeList list = doc.GetElementsByTagName("Document");
XmlElement node = (XmlElement)list[0];
//node.SetAttribute("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance");
string s = node.OuterXml;
using (MemoryStream msIn = new MemoryStream(Encoding.UTF8.GetBytes(s)))
{
XmlDsigC14NTransform t = new XmlDsigC14NTransform(true);
t.LoadInput(msIn);
using (var hash = new SHA256Managed())
{
byte[] digest = t.GetDigestedOutput(hash);
txtHash.Text = BitConverter.ToString(digest).Replace("-", String.Empty);
}
}
However, this only hashes the contents of the XML.
What I need is to hash the complete XML instead of only the contents.
If we only hash the contents, our hash doesn't match the control value we receive.
You can read the file contents without creating an XmlDocument and hash the raw bytes:
var file = File.ReadAllBytes(txtFile.Text);
using (var hash = new SHA256Managed())
{
byte[] digest = hash.ComputeHash(file);
txtHash.Text = BitConverter.ToString(digest).Replace("-", String.Empty);
}
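A variant of the same idea, sketched under the assumption that the file may be large: ComputeHash also accepts a Stream, so the file does not have to be loaded into memory first. The FileHasher name is made up for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileHasher
{
    // Hashes every byte of the file exactly as stored on disk and returns uppercase hex.
    public static string Sha256Hex(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(stream);
            return BitConverter.ToString(digest).Replace("-", string.Empty);
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length == 1)
        {
            Console.WriteLine(Sha256Hex(args[0]));
        }
    }
}
```

Because the hash covers the raw bytes, the XML declaration, any byte order mark, whitespace and line endings all affect the result. The control hash will only match if it was computed over exactly the same bytes.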

XML serialization national chars error

EDIT:
I was doing too much. This works for me with national characters:
var xs = new XmlSerializer(typeof(ToDoItem));
var stringWriter = new StringWriter();
xs.Serialize(stringWriter, item);
var test = XDocument.Parse(stringWriter.ToString());
...where item is the object containing strings with national characters.
/EDIT
I did a project with serialization of some objects.
I copied some code from examples on this site and everything worked great until I changed the ASP.NET framework from 3.5 to 4.0 (and changed the IIS7 .NET setting from v2.0 to v4.0).
I am 99% sure this is the cause of the following error:
Before this change something like this:
var test = XDocument.Parse(SerializeObject("æøåAØÅ", typeof(string)));
test.Save(HttpContext.Current.Server.MapPath("test.xml"));
Would save the xml with the exact chars used.
Now it saves this:
���A��
I would like: Information on settings I might have to make in IIS7
OR
A comment on how to change the serializing methods to handle the national chars better.
This is the serialization code used.
private static String UTF8ByteArrayToString(Byte[] characters)
{
var encoding = new UTF8Encoding();
String constructedString = encoding.GetString(characters);
return (constructedString);
}
public static String SerializeObject(Object pObject, Type type)
{
try
{
String XmlizedString = null;
var memoryStream = new MemoryStream();
var xs = new XmlSerializer(type);
var xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.ASCII);
xs.Serialize(xmlTextWriter, pObject);
memoryStream = (MemoryStream)xmlTextWriter.BaseStream;
XmlizedString = UTF8ByteArrayToString(memoryStream.ToArray());
return XmlizedString.Trim();
}
catch (Exception e)
{
//Console.WriteLine(e);
return null;
}
}
You encode the text as ASCII, then decode it as UTF-8, and expect that to work? It won't. ASCII cannot represent characters such as æ, ø or å, so this code could never work properly, regardless of any updates or settings.
There is no need to write the XML to a MemoryStream and then decode that. Just use StringWriter:
var xs = new XmlSerializer(type);
var stringWriter = new StringWriter();
xs.Serialize(stringWriter, pObject);
return stringWriter.ToString();
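To make the point concrete, here is a small sketch that compares a plain ASCII-then-UTF-8 round trip with the StringWriter approach. The NationalChars class and its method names are hypothetical. Note that the first method only illustrates the encoding mismatch on a bare string; it is not a reproduction of what XmlTextWriter itself emits.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

public static class NationalChars
{
    // Encodes as ASCII and decodes as UTF-8: characters outside ASCII are replaced with '?'.
    public static string AsciiRoundTrip(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return Encoding.UTF8.GetString(bytes);
    }

    // Serializes straight to a string, so no byte encoding is involved at all.
    public static string Serialize(object value, Type type)
    {
        var xs = new XmlSerializer(type);
        using (var stringWriter = new StringWriter())
        {
            xs.Serialize(stringWriter, value);
            return stringWriter.ToString();
        }
    }

    public static void Main()
    {
        string sample = "\u00E6\u00F8\u00E5"; // the characters æøå, written as escapes
        Console.WriteLine(AsciiRoundTrip(sample));
        Console.WriteLine(Serialize(sample, typeof(string)));
    }
}
```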

Parsing UTF8 encoded data from a Web Service

I'm parsing the data from http://toutankharton.com/ws/localisations.php?l=75
As you can see, the text comes out wrongly encoded (<name>Paris 2Ã¨me</name>).
My code is the following :
using (var reader = new StreamReader(stream, Encoding.UTF8))
{
var contents = reader.ReadToEnd();
XElement cities = XElement.Parse(contents);
var t = from city in cities.Descendants("city")
select new City
{
Name = city.Element("name").Value,
Insee = city.Element("ci").Value,
Code = city.Element("code").Value,
};
}
Isn't new StreamReader(stream, Encoding.UTF8) sufficient?
That looks like what happens when UTF-8 bytes are output using an incompatible encoding such as ISO8859-1. Do you know what the real character is? Going back, using ISO8859-1 to get a byte array and UTF-8 to read it, turns "Ã¨" into "è".
var input = "Ã¨";
var bytes = Encoding.GetEncoding("ISO8859-1").GetBytes(input);
var realString = Encoding.UTF8.GetString(bytes);
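The round trip described above can be sketched in both directions. The Mojibake class and its method names are hypothetical, and the characters are written as \u escapes so that the encoding of the source file itself cannot interfere.

```csharp
using System;
using System.Text;

public static class Mojibake
{
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Simulates the fault: UTF-8 bytes are wrongly decoded as ISO-8859-1.
    public static string Break(string text)
    {
        return Latin1.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Reverses the fault: recover the original bytes, then decode them as UTF-8.
    public static string Repair(string text)
    {
        return Encoding.UTF8.GetString(Latin1.GetBytes(text));
    }

    public static void Main()
    {
        string broken = Break("Paris 2\u00E8me"); // \u00E8 is the letter e with a grave accent
        Console.WriteLine(broken.Length);          // one character longer than the original
        Console.WriteLine(Repair(broken) == "Paris 2\u00E8me");
    }
}
```

Repairing the string this way only treats the symptom. If the service really sends the two-character sequence inside otherwise valid UTF-8, the text was double-encoded on the server side, and that is where it is best fixed.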

xml invalid character in the given encoding

I am trying to validate my XML against its XSD and am getting the error "invalid character in the given encoding". The code I use to validate is below:
private static void ValidatingProcess(string XSDPath, string xml)
{
MemoryStream stream =
new MemoryStream(ASCIIEncoding.Default.GetBytes(xml));
using (StreamReader SR = new StreamReader(XSDPath))
{
XmlSchema Schema = XmlSchema.Read(SR, ReaderSettings_ValidationEventHandler);
XmlReaderSettings ReaderSettings = new XmlReaderSettings();
ReaderSettings.ValidationType = ValidationType.Schema;
ReaderSettings.Schemas.Add(Schema);
ReaderSettings.ValidationEventHandler += ReaderSettings_ValidationEventHandler;
XmlReader objXmlReader = XmlReader.Create(stream, ReaderSettings);
bool notDone = true;
while (notDone)
{
notDone = objXmlReader.Read();
}
}
}
It fails on characters such as é, so I guessed the cause was either that UTF-8 is specified as the encoding, or the way I create the MemoryStream with ASCIIEncoding in the code above. I have tried changing the encoding in both the XSD and the XML to UTF-16, and the MemoryStream to UTF-32, but it seems to have no effect. Any ideas?
Don't convert your input string to ASCII if your input string contains non-ASCII characters.
You can use a StringReader to supply your input string directly to an XmlReader:
using (var reader = XmlReader.Create(new StringReader(xml), settings)) { ...
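A fuller sketch of that suggestion follows, assuming the schema is also available as a string; in the question it is read from a file path instead. The StringValidation class, the Validate method and the sample schema are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class StringValidation
{
    // Validates xml against xsd and returns the validation messages (empty when valid).
    public static List<string> Validate(string xsd, string xml)
    {
        var problems = new List<string>();
        ValidationEventHandler onProblem = (sender, e) => problems.Add(e.Message);

        XmlSchema schema;
        using (var schemaReader = new StringReader(xsd))
        {
            schema = XmlSchema.Read(schemaReader, onProblem);
        }

        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(schema);
        settings.ValidationEventHandler += onProblem;

        // The StringReader hands characters to the parser, so no byte encoding is involved.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return problems;
    }

    public static void Main()
    {
        string xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                   + "<xs:element name=\"name\" type=\"xs:string\"/></xs:schema>";
        string xml = "<name>caf\u00E9</name>"; // \u00E9 is the letter e with an acute accent
        Console.WriteLine(Validate(xsd, xml).Count);
    }
}
```

Since the XML never passes through a byte array, there is no step at which the é could be encoded one way and decoded another.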
