How to change file encoding from windows-1251 to UTF-8 - c#

I have an XML file encoded in windows-1251 and I need to convert the text that I read from it to UTF-8.
I have just started writing the code, but I don't know how to do this:
string text = File.ReadAllText(path);
XDocument documentcode = XDocument.Load(text);

You will have to specify the correct encoding when reading:
string text = File.ReadAllText(path, Encoding.GetEncoding("windows-1251"));
XDocument documentcode = XDocument.Parse(text); // not load.
You probably don't have to do anything special when writing.
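A minimal sketch of the full round trip, for illustration only. The class and method names are hypothetical. It assumes that on .NET Core / .NET 5+ the legacy code pages have to be registered before "windows-1251" can be resolved, and that XDocument.Save takes its output encoding from the XML declaration when one is present, so the declaration is switched to utf-8 before saving.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

static class EncodingConverter
{
    // Reads a windows-1251 XML file and writes it back out as UTF-8.
    public static void ConvertToUtf8(string sourcePath, string targetPath)
    {
        // Assumption: needed on .NET Core / .NET 5+, where code page
        // encodings are not available by default.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string text = File.ReadAllText(sourcePath, Encoding.GetEncoding("windows-1251"));
        XDocument doc = XDocument.Parse(text);

        // If the file declares encoding="windows-1251", update the declaration
        // so that the saved file is both written and labelled as UTF-8.
        if (doc.Declaration != null)
        {
            doc.Declaration.Encoding = "utf-8";
        }

        doc.Save(targetPath);
    }
}
```

Usage would be something like EncodingConverter.ConvertToUtf8(path, newPath), where both paths are placeholders.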

Related

How can I use C# to search an XML file for specific words?

I'm very new to C# and XML files in general, but currently I have an XML file that still has some html markup in it (&amp, ;quot;, etc.) and I want to read through the XML file and remove all of those so it becomes easily readable. I can open and print the file to the console with no issue, but I'm stumped trying to search for those specific strings and remove them.
One way to do this would be to put all the words you want to remove into an array, and then use the Replace method to replace them with empty strings:
var xmlFilePath = @"c:\temp\original.xml";
var newFilePath = @"c:\temp\modified.xml";
var wordsToRemove = new[] {"&amp", ";quot;"};
// Read existing xml file
var fileContents = File.ReadAllText(xmlFilePath);
// Remove words
foreach (var word in wordsToRemove)
{
fileContents = fileContents.Replace(word, "");
}
// Create new file with words removed
File.WriteAllText(newFilePath, fileContents);
I suppose you are looking for this: https://learn.microsoft.com/en-us/dotnet/api/system.web.httputility.htmldecode?view=netcore-3.1
Converts a string that has been HTML-encoded for HTTP transmission into a decoded string.
// Encode the string.
string myEncodedString = HttpUtility.HtmlEncode(myString);
Console.WriteLine($"HTML Encoded string is: {myEncodedString}");
StringWriter myWriter = new StringWriter();
// Decode the encoded string.
HttpUtility.HtmlDecode(myEncodedString, myWriter);
string myDecodedString = myWriter.ToString();
Console.Write($"Decoded string of the above encoded string is: {myDecodedString}");
Your string is HTML-encoded, probably for transmission over a network, so there is a built-in method to decode it.
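If System.Web is not available in your project, System.Net.WebUtility.HtmlDecode does the same kind of decoding. The sketch below is only an illustration (the class name, method name and the cap of five passes are hypothetical); it decodes repeatedly because text such as &amp;quot; has been encoded twice and needs two passes.

```csharp
using System;
using System.Net;

static class EntityDecoder
{
    // Decodes HTML entities until the text stops changing, so that
    // double-encoded input like "&amp;quot;" ends up as a plain quote.
    public static string DecodeFully(string input)
    {
        string current = input;
        for (int pass = 0; pass < 5; pass++) // hypothetical safety cap
        {
            string decoded = WebUtility.HtmlDecode(current);
            if (decoded == current)
            {
                break; // nothing left to decode
            }
            current = decoded;
        }
        return current;
    }

    static void Main()
    {
        Console.WriteLine(DecodeFully("Tom &amp;quot;T&amp;quot; Smith"));
    }
}
```

It is probably safer to apply this to the text values you read out of the XML than to the whole file, since decoded characters such as & or < would make the file itself invalid XML.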

Parse the XML namespace from string variable in c#

I am getting the following request data:-
<NS2:GETREQUEST
XMLNS:NS2='HTTP://WWW..ORG/SCHEMA/NAXML/V01'
XMLNS:NS4='HTTP://WWW..ORG/SCHEMA/CORE/V01'
XMLNS:NS3='HTTP://WWW.NAXML.ORG/VOCABULARY/2020-10-16'>
<NS2:REQUESTHEADER>
<NS2:VERSION>1.1</NS2:VERSION>
<NS3:NAME>VIP</NS3:NAME>
<NS3:MODELVERSION>3.00</NS3:MODELVERSION>
<NS2:SEQUENCEID>1-101</NS2:SEQUENCEID>
<NS2:LOCATIONID>7895</NS2:LOCATIONID>
</NS2:REQUESTHEADER>
</NS2:GETREQUEST>
I store this data in a string variable. I want to find the SequenceID in the request, but I am not able to.
I get the following error while parsing the XML data:
XDocument doc = XDocument.Parse(requesttcpdata);
'NS2' is an undeclared prefix. Line 1, position 2.
Can anyone tell me how to do it?
Your input string looks like XML, but it isn't. Thus, you cannot parse it with an XML parser.
That having been said, it looks like your input file can be converted into real XML by lowercasing the prefix of the xmlns: attributes. To ensure that you are not accidentally modifying xmlns when it appears in the values themselves, I suggest you use a fairly strict string replacement check:
string input = @"<NS2:GETREQUEST
XMLNS:NS2='HTTP://WWW..ORG/SCHEMA/NAXML/V01'
XMLNS:NS4='HTTP://WWW..ORG/SCHEMA/CORE/V01'
XMLNS:NS3='HTTP://WWW.NAXML.ORG/VOCABULARY/2020-10-16'>
<NS2:REQUESTHEADER>
<NS2:VERSION>1.1</NS2:VERSION>
<NS3:NAME>VIPER</NS3:NAME>
<NS3:MODELVERSION>3.00</NS3:MODELVERSION>
<NS2:SEQUENCEID>1-101</NS2:SEQUENCEID>
<NS2:LOCATIONID>7895</NS2:LOCATIONID>
</NS2:REQUESTHEADER>
</NS2:GETREQUEST>";
const string brokenHeader = @"<NS2:GETREQUEST
XMLNS:NS2='HTTP://WWW..ORG/SCHEMA/NAXML/V01'
XMLNS:NS4='HTTP://WWW..ORG/SCHEMA/CORE/V01'
XMLNS:NS3='HTTP://WWW.NAXML.ORG/VOCABULARY/2020-10-16'>";
const string fixedHeader = @"<NS2:GETREQUEST
xmlns:NS2='HTTP://WWW..ORG/SCHEMA/NAXML/V01'
xmlns:NS4='HTTP://WWW..ORG/SCHEMA/CORE/V01'
xmlns:NS3='HTTP://WWW.NAXML.ORG/VOCABULARY/2020-10-16'>";
if (input.StartsWith(brokenHeader))
{
input = fixedHeader + input.Substring(brokenHeader.Length);
}
var x = XDocument.Parse(input); // works now
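Once the string parses, the SequenceID can be read with LINQ to XML. This is only a sketch: the class and method names are hypothetical, and it assumes the element keeps the upper-case name SEQUENCEID and the NS2 prefix shown in the question. The namespace is looked up from the prefix on the root element instead of being hard-coded.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class RequestReader
{
    // Returns the text of the first NS2:SEQUENCEID element, or null if there is none.
    public static string ExtractSequenceId(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Resolve the namespace bound to the NS2 prefix on the root element.
        XNamespace ns2 = doc.Root.GetNamespaceOfPrefix("NS2");
        if (ns2 == null)
        {
            return null; // prefix not declared
        }

        return doc.Descendants(ns2 + "SEQUENCEID").FirstOrDefault()?.Value;
    }

    static void Main()
    {
        string xml = @"<NS2:GETREQUEST xmlns:NS2='HTTP://WWW..ORG/SCHEMA/NAXML/V01'>
<NS2:REQUESTHEADER>
<NS2:SEQUENCEID>1-101</NS2:SEQUENCEID>
</NS2:REQUESTHEADER>
</NS2:GETREQUEST>";
        Console.WriteLine(ExtractSequenceId(xml));
    }
}
```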

Read a text file and make changes - c#

I want to read a text file that contains:
<CustomerName>#CoustomerName</CoustomerName>
<CustomerAddress>#CustomerAddress</CustomerAddress>
<CustomerMobileNo>#CustomerMobileNo</CustomerMobileNo>
<Payment>#Payment</Payment>
I want to replace #CoustomerName with the customer name passed in at run time.
So far I use this:
string readfile = File.ReadAllText(path);
Regex.Replace(readfile , "#CoustomerName ", objProposar.FirstName);
This works, but I also need to replace the customer address, mobile number, etc.
How can I do this?
Why regex? A simple String.Replace will do the job:
string oldText = File.ReadAllText(path);
string newText = oldText.Replace("#CoustomerName", objProposar.FirstName);
// other ...
File.WriteAllText(path, newText);
If your file is XML, do it the XML way, for example with XDocument; otherwise string.Replace is a better option:
string readfile = File.ReadAllText(path);
readfile = readfile.Replace("#CoustomerName", objProposar.FirstName);
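To cover the address, mobile number and the other placeholders in one go, the same Replace call can be driven from a dictionary. This is a rough sketch with hypothetical names (TemplateFiller, Fill) and made-up sample values; the placeholder spellings are copied from the question.

```csharp
using System;
using System.Collections.Generic;

static class TemplateFiller
{
    // Replaces every placeholder key found in the template with its value.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        foreach (KeyValuePair<string, string> pair in values)
        {
            template = template.Replace(pair.Key, pair.Value);
        }
        return template;
    }

    static void Main()
    {
        // Sample values are invented for the example.
        var values = new Dictionary<string, string>
        {
            { "#CoustomerName", "John" },
            { "#CustomerAddress", "1 Main Street" },
            { "#CustomerMobileNo", "555-0100" },
            { "#Payment", "100" }
        };

        string template = "<Payment>#Payment</Payment>";
        Console.WriteLine(Fill(template, values));
    }
}
```

One thing to watch for: if one placeholder is a prefix of another (say #Customer and #CustomerName), the shorter one can clobber the longer one, so the longer placeholder would need to be replaced first.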

How Can I Encode My New Xml Document?

I have a string containing XML text and I want to save it as an XML file. I encoded the string (to "utf-8"), but when I create the XML file from it, the Cyrillic characters in the values are not displayed correctly. What do I need to do to encode my XML document?
Part of my XML:
<rev:Code>Мои данные</rev:Code>
my code:
string send = Encoding.GetEncoding("utf8").GetString(Encoding.GetEncoding("utf-8").GetBytes(send));
XmlDocument docsec = new XmlDocument();
docsec.LoadXml(send);
docsec.Save("C:\\XmlNEW.xml");
Original text :Мои данные
I see it after creating XML :Мои данные
I worked with russian textfiles before to convert them to rtf and used "Encoding.GetEncoding(1251)" for that purpose.
The problem was in the Save method, because it uses the XML document's own encoding. I took my solution from another answer:
XmlDocument docsec = new XmlDocument();
docsec.LoadXml(send);
using (TextWriter writer = new StreamWriter("C:\\XmlNEW.xml", false, Encoding.UTF8))
docsec.Save(writer);
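An alternative sketch, if you would rather hand Save an XmlWriter: the writer's settings then decide the encoding that is written out. The class and method names are hypothetical, and the choice of UTF-8 without a byte order mark is an assumption for the example, not a requirement.

```csharp
using System.Text;
using System.Xml;

static class XmlSaver
{
    // Saves the document as UTF-8, letting the XmlWriter control the encoding.
    public static void SaveAsUtf8(XmlDocument doc, string path)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = new UTF8Encoding(false), // UTF-8 without a byte order mark
            Indent = true
        };

        using (XmlWriter writer = XmlWriter.Create(path, settings))
        {
            doc.Save(writer);
        }
    }
}
```

With the variables from the question this would be called as XmlSaver.SaveAsUtf8(docsec, "C:\\XmlNEW.xml").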

convert KOI8-R xml node into unicode in c#

I have the following xml:
<root>
<text><![CDATA[ОПЕЛХМЮБЮ ОПЕГ БЗПРЪЫ ЯЕ АЮПЮАЮМ, Б ЙНИРН ЯЕ]]></text>
</root>
I know this text is generated using encoding KOI8-R (this text is displayed in my text editor only when I select this encoding when I open the xml file as text) and I would like to convert the value of this node into a string usable in c#. I can read the InnerText value of this node, but it's not what I'm expecting. Can someone show me the correct way to convert a string written with this encoding into a Unicode one?
Update
Following Jon Skeet's suggestions, the solution looks like this:
Encoding encoding = Encoding.GetEncoding("KOI8-R");
XmlDocument doc2 = new XmlDocument();
using (TextReader tr = new StreamReader(outputPath, encoding))
{
doc2.Load(tr);
}
How do you have that XML? It should have an XML declaration stating which encoding it's using; otherwise it's not correct simply in XML terms. You shouldn't be worrying about encodings after you've parsed the XML. So potentially something like:
Encoding encoding = Encoding.GetEncoding("KOI8-R");
XDocument doc;
using (var reader = new StreamReader("file.xml", encoding)) // File.OpenText has no encoding overload
{
doc = XDocument.Load(reader);
}
... but as I say, the file itself should declare the encoding.
