Preserve the tab spacing and white spaces in attributes of XML [duplicate] - c#

This question already has answers here:
opening/saving xml while preserving newline between node's attributes
(2 answers)
Closed 9 years ago.
I have very basic knowledge of C# and XML. I am trying to load an XML document using XmlDocument, edit the values of some attributes, and then save the document with the changes. The problem is that after editing and saving, I cannot get the same formatting that I had in my original document.
For example the original XML document looks like below,
<M A="XML">
<N A="XMLLINE1" B="1" C="2" D="3" D="4" />
<N A="XMLLINE2" B="5" C="6" D="7" D="8" />
</M>
After editing the value of B="1" to B="10", I save the document. Now the spacing between the attributes A, B, C and D is no longer the same. Is there any way to preserve those spaces as they are, edit only the values, and save the document?
The requirement for this document is to keep those spaces exactly as they are in the original document.
Thanks

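You can't do this with XmlDocument: whitespace between attributes inside a tag is not part of the document model, so it is lost when the document is loaded and saved again. If you need to preserve that spacing, write your own class that generates the XML with a StringBuilder or a stream, or use XmlWriter (http://msdn.microsoft.com/en-us/library/system.xml.xmlwriter_members(v=vs.71).aspx) to format the document manually.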
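A minimal sketch of the hand-rolled route, assuming the file is small enough to hold in a string and that attribute values are double-quoted and never contain '>'. Instead of going through the DOM, it edits the text directly, so every character outside the changed value stays as it was. The class name AttributeTextEditor, the method SetAttribute and the "marker" parameter are hypothetical names made up for this illustration, not part of any library.

```csharp
using System;
using System.Security;
using System.Text.RegularExpressions;

public static class AttributeTextEditor
{
    // Replaces the value of `attr` in every <element ...> start tag whose text
    // contains `marker` (for example: A="XMLLINE1"). Everything else in the
    // input, including tabs and spaces between attributes, is left untouched.
    public static string SetAttribute(string xml, string element, string marker,
                                      string attr, string newValue)
    {
        // One start tag of the given element. Assumes no '>' inside attribute values.
        var tagPattern = new Regex("<" + Regex.Escape(element) + @"\b[^>]*>");
        // whitespace + attr="..." ; group 1 and group 2 surround the old value.
        var attrPattern = new Regex(@"(\s" + Regex.Escape(attr) + @"\s*=\s*"")[^""]*("")");
        // Escape <, >, &, " and ' so the new value cannot break the markup.
        string escaped = SecurityElement.Escape(newValue);

        return tagPattern.Replace(xml, tag =>
        {
            if (!tag.Value.Contains(marker)) return tag.Value;
            // Replace only the first occurrence inside this tag.
            return attrPattern.Replace(
                tag.Value,
                m => m.Groups[1].Value + escaped + m.Groups[2].Value,
                1);
        });
    }
}
```

For the document in the question, a call such as AttributeTextEditor.SetAttribute(text, "N", "A=\"XMLLINE1\"", "B", "10") would change only the value of B on the first N element. Because this is plain text matching rather than XML parsing, it is only reasonable for files whose layout you control.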

Related

How to remove Only HTML tags in the program [duplicate]

This question already has an answer here:
Retrieving Inner Text of Html Tag C#
(1 answer)
Closed 3 years ago.
I want to remove HTML tags from some source text with C#.
Unfortunately, the text also contains content like <This is content>, which is not an HTML tag.
First, I tried the Regex class like this:
Regex.Replace(htmltext,"[\\x00-\\x1f<>:\"/\\\\|?*]" +
"|^(CON|PRN|AUX|NUL|COM[0-9]|LPT[0-9]|CLOCK\\$)(\\.|$)" +
"|[\\. ]$", String.Empty);
But in this case, "<This is content>" was removed as well.
So could anyone please tell me how to remove only the HTML tags in the program?
Thanks and regards.
Don't try and parse HTML with Regex. It tends not to go well.
Use a parser, HTML Agility Pack is very popular.
With HTML Agility Pack you can simply read a node's InnerText property to extract the contents without the HTML tags.

Set XML as Value of XElement [duplicate]

This question already has answers here:
How to avoid System.Xml.Linq.XElement escaping HTML content?
(4 answers)
Closed 7 years ago.
My method receives an XML string as input, and I need to put this XML string into an XML envelope using XElement:
input: <hello>Hello!</hello>
expected result: <envelope><hello>Hello!</hello></envelope>
The problem is that this code:
string xmlHello = "<hello>Hello!</hello>";
XElement xelem = new XElement("envelope", xmlHello);
escapes all <> and so the result is:
<envelope>&lt;hello&gt;Hello!&lt;/hello&gt;</envelope>
Is there any way to disable this behaviour of the XElement constructor to be able to accept XML as the value? The input string can be really huge, so I would like to avoid parsing it.
As mentioned in the comments, this can't be done directly as the API has no way of knowing your text is actually well formed XML unless you pass it something it knows is an XML element.
So what you need to do is parse your XML first:
string xmlHello = "<hello>Hello!</hello>";
var hello = XElement.Parse(xmlHello);
var envelope = new XElement("envelope", hello);
Resulting in:
<envelope>
  <hello>Hello!</hello>
</envelope>

How can html be parsed as XML when containing '...&body='? [duplicate]

This question already has answers here:
What is the best way to parse html in C#? [closed]
(15 answers)
Closed 8 years ago.
I have an HTML file that is a well-formed XML document (tags are paired), but it contains an anchor like the one below:
link
Xml parser invoked by XDocument.Load throws XmlException that says:
Additional information: '=' is an unexpected token. The expected token is ';'.
How can I tell the parser that '&body' is not an entity? Do I have to escape the '&' character?
Not all HTML is going to be valid XML, so you shouldn't try to parse it as such (although, in this case, it looks like you have some unescaped strings in the document that should probably be taken care of).
Instead, you should use something like the HTMLAgilityPack to parse your HTML and work with the document that way.
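If the file really is well-formed apart from raw ampersands, one stop-gap is to escape them before handing the text to the XML parser. The sketch below is an assumption-laden illustration, not a general HTML fix: AmpersandFixer and EscapeBareAmpersands are hypothetical names, and the regex only leaves alone things that already look like entity or character references. Named HTML entities that XML does not define (for example &nbsp;) would still make the XML parser fail.

```csharp
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class AmpersandFixer
{
    // An '&' that is NOT followed by "name;", "#123;" or "#x1F;".
    private static readonly Regex BareAmpersand =
        new Regex(@"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Turns every bare '&' into "&amp;" and leaves existing references alone.
    public static string EscapeBareAmpersands(string markup)
    {
        return BareAmpersand.Replace(markup, "&amp;");
    }

    // Convenience wrapper: fix the text, then parse it as XML.
    public static XDocument ParseLenient(string markup)
    {
        return XDocument.Parse(EscapeBareAmpersands(markup));
    }
}
```

After this step an href containing "...&body=..." reaches the parser as "...&amp;body=...", and the attribute value read back through LINQ to XML contains the plain '&' again.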

How to transfer name/value pairs to XML? [duplicate]

This question already has an answer here:
XSLT transforming name value pairs to its corresponding XML
(1 answer)
Closed 8 years ago.
I would like to transform my name/value-pairs to XML via XSLT Transformation.
This should work with XslCompiledTransform. All of this seems pretty clear. But what is the best way to use the name/value pairs? Using XML as the input to transform to HTML and the like is clear to me. I am just confused about the unstructured name/value pairs.
As Tim C says in the comments for the question, XSLT is used to transform XML documents into other documents, usually other XML documents. It can't be used to transform .NET collections, unless you first serialize them into an XML document, in which case you don't need XSLT to turn a collection into XML.
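As a rough illustration of the "serialize them into an XML document" step, the sketch below builds the XML directly with LINQ to XML. The element and attribute names ("pairs", "pair", "name", "value") and the class PairSerializer are assumptions chosen for the example; the question does not say what the target XML should look like.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PairSerializer
{
    // Builds <pairs><pair name="..." value="..." />...</pairs> from the input.
    // LINQ to XML escapes special characters in the attribute values on output.
    public static XElement ToXml(IEnumerable<KeyValuePair<string, string>> pairs)
    {
        return new XElement("pairs",
            pairs.Select(p => new XElement("pair",
                new XAttribute("name", p.Key),
                // XAttribute rejects null, so fall back to an empty string.
                new XAttribute("value", p.Value ?? string.Empty))));
    }
}
```

The resulting XElement can be saved as is, or, if a different shape is still needed, loaded into XslCompiledTransform as ordinary XML input.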

Get text from HTML [duplicate]

This question already has answers here:
How do you convert Html to plain text?
(20 answers)
Closed 1 year ago.
I need a way to get all text from my aspx files.
They may also contain JavaScript, but I only need this for the HTML code.
Basically I need to extract everything in Text or Value attributes, text within the markup, whatever...
Is there any parser API available?
Cheers!
Alex
As an alternative, if the markup is well-formed XML, you might consider using LINQ to XML to pull out the interesting parts.
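A small sketch of what that could look like, with a big caveat: XElement.Parse only accepts well-formed XML, so real .aspx files (with <%@ ... %> directives or undeclared asp: prefixes) would need preprocessing first, or an HTML parser instead. The class MarkupTextExtractor and the choice to skip script elements are assumptions made for the illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MarkupTextExtractor
{
    // Returns the non-empty text nodes (outside <script>) in document order,
    // followed by the values of all Text and Value attributes.
    public static List<string> ExtractText(string wellFormedMarkup)
    {
        XElement root = XElement.Parse(wellFormedMarkup);

        var textNodes = root.DescendantNodes()
            .OfType<XText>()
            .Where(t => t.Parent == null || t.Parent.Name.LocalName != "script")
            .Select(t => t.Value.Trim());

        var attributeValues = root.DescendantsAndSelf()
            .Attributes()
            .Where(a => a.Name.LocalName == "Text" || a.Name.LocalName == "Value")
            .Select(a => a.Value);

        return textNodes.Concat(attributeValues)
            .Where(s => s.Length > 0)
            .ToList();
    }
}
```

Attribute names are matched case-sensitively here, which suits server-control style attributes such as Text="..." but would miss a lowercase value="..." on a plain input element.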
