Removing Invalid Characters from XML within XDocument - c#

I am trying to write music tags into an XML file, but it fails on invalid characters. I have tried doing a replace, but I can't seem to get the syntax right.
//string pattern = "[\\~#%&*{}/:<>?|\"-]";
//string replacement = "_";
//Regex regEx = new Regex(pattern);
//string sanitized = Regex.Replace(regEx.Replace(input, replacement), @"\s+", " ");
XDocument baddoc = new XDocument
(new XElement("Corrupt",
badfiles.Select(badfile =>
new XElement("File", badfile))));
baddoc.Save("D:\\badfiles.xml");
// foreach(string musicfile in musicfiles)
//{ String Title = (TagLib.File.Create(musicfile).Tag.Title); }
XDocument doc = new XDocument
(new XElement("Songs",
musicfiles.Select(musicfile=>
new XElement("Song",
(new XElement("Title", (TagLib.File.Create(musicfile).Tag.Title))),
(new XElement("Path", (musicfile))),
(new XElement("Artist", (TagLib.File.Create(musicfile).Tag.Performers)))
))));
doc.Save("D:\\files.xml");

I ended up breaking it all out like this:
XDocument doc = new XDocument();
XElement songsElement = new XElement("Songs");
foreach(var musicfile in musicfiles)
{
XElement songElement = new XElement("Song");
string songTitle;
try { songTitle = (TagLib.File.Create(musicfile).Tag.Title); }
catch { songTitle = "Missing"; }
uint songTNint;
try { songTNint = (TagLib.File.Create(musicfile).Tag.Track); }
catch { songTNint = 00; }
string songTN = songTNint.ToString();
string songPath = musicfile;
string songArtist;
try {songArtist = (TagLib.File.Create(musicfile).Tag.Performers[0]);}
catch {songArtist = "Missing";}
List<string> songGenres = new List<string>();
foreach (string Genre in (TagLib.File.Create(musicfile).Tag.Genres))
{ songGenres.Add(Genre);}
string songGenre;
if (songGenres.Count > 1) { songGenre = (songGenres[0] + "/" + songGenres[1]); }
else { try { songGenre = songGenres[0]; } catch { songGenre = "Missing"; } }
songArtist = Regex.Replace(songArtist, @"[^\u0020-\u007E]", string.Empty);
XElement titleElement = new XElement("Title",songTitle);
XElement tnElement = new XElement("TN", songTN);
XElement pathElement = new XElement("Path", musicfile);
XElement artistElement = new XElement("Artist",songArtist);
XElement genreElement = new XElement("Genre", songGenre);
songElement.Add(titleElement);
songElement.Add(tnElement);
songElement.Add(pathElement);
songElement.Add(artistElement);
songElement.Add(genreElement);
songsElement.Add(songElement);
}
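The regex above keeps only printable ASCII, so it also strips legitimate accented letters from artist names. Below is a minimal sketch of a narrower filter, assuming the save fails because a tag contains characters that XML 1.0 does not allow (control characters, for example). The class and method names are hypothetical; XmlConvert.IsXmlChar and XmlConvert.IsXmlSurrogatePair are part of System.Xml.

```csharp
using System.Text;
using System.Xml;

public static class XmlText
{
    // Hypothetical helper: drops characters that XML 1.0 does not allow
    // (for example most control characters) and keeps everything else,
    // including accented letters and valid surrogate pairs.
    public static string StripInvalidXmlChars(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < input.Length
                     && XmlConvert.IsXmlSurrogatePair(input[i + 1], c))
            {
                // Keep a valid high/low surrogate pair together.
                sb.Append(c).Append(input[i + 1]);
                i++;
            }
        }
        return sb.ToString();
    }
}
```

It could be applied to every tag value before building the element, for instance new XElement("Artist", XmlText.StripInvalidXmlChars(songArtist)).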

Related

C# - How to get full text content from rss feed?

I want the full content from a rss feed, not just the description.
This is what I have:
string RssFeedUrl = "http://g1.globo.com/dynamo/rss2.xml";
List<feed> feeds = new List<feed>();
try
{
XDocument xDoc = new XDocument();
xDoc = XDocument.Load(RssFeedUrl);
var items = (from x in xDoc.Descendants("item")
select new
{
title = x.Element("title").Value,
link = x.Element("link").Value,
pubDate = x.Element("pubDate").Value,
description = x.Element("description").Value
});
if (items != null)
{
foreach (var i in items)
{
feed f = new feed
{
Titulo = i.title,
Link = i.link,
DataPublicada = i.pubDate,
Descricao = i.description
};
feeds.Add(f);
}
}
gvRss.DataSource = feeds;
gvRss.DataBind();
}
catch (Exception ex)
{
throw;
}
This only retrieves a short excerpt, but I want the full content text.

How to return a variable which is declared within a loop c#

private XElement AuthorSeparate(List<string> authorName)
{
string surName = string.Empty;
string initalName = string.Empty;
string givenName = string.Empty;
int j = 1;
for (int i = 0; i < authorName.Count; i++)
{
XElement Author = new XElement("author");
Author.Add(new XAttribute("Seq", j));
else
{
char[] initalArray = splitedName[0].ToCharArray();
initalName = initalArray[0] + '.'.ToString();
surName = splitedName.LastOrDefault();
splitedName = splitedName.Reverse().Skip(1).Reverse().ToArray();
givenName = string.Join(" ", splitedName);
}
if (!string.IsNullOrEmpty(initalName))
{
XElement InitalElement = new XElement("initials", initalName);
Author.Add(InitalElement);
}
if (!string.IsNullOrEmpty(surName))
{
XElement SurnameElement = new XElement("surname", surName);
Author.Add(SurnameElement);
}
if (!string.IsNullOrEmpty(givenName))
{
XElement GivenNameElement = new XElement("given-name", givenName);
Author.Add(GivenNameElement);
}
}
return Author;
}
This is my method. From this method I need to return an XElement. I declared the XElement inside the for loop; after the loop completes I need to return that XElement, named Author. How do I return it?
If you declare the variable inside the loop, it is in scope only inside the loop. So in your case, you should declare it outside of the loop so that it is still in scope at the return statement that follows the loop.
Edit:
private ArrayList AuthorSeparate(List<string> authorName)
{
string surName = string.Empty;
string initalName = string.Empty;
string givenName = string.Empty;
int j = 1;
ArrayList authors = new ArrayList();
for (int i = 0; i < authorName.Count; i++)
{
XElement Author = new XElement("author");
Author.Add(new XAttribute("Seq", j));
else
{
char[] initalArray = splitedName[0].ToCharArray();
initalName = initalArray[0] + '.'.ToString();
surName = splitedName.LastOrDefault();
splitedName = splitedName.Reverse().Skip(1).Reverse().ToArray();
givenName = string.Join(" ", splitedName);
}
if (!string.IsNullOrEmpty(initalName))
{
XElement InitalElement = new XElement("initials", initalName);
Author.Add(InitalElement);
}
if (!string.IsNullOrEmpty(surName))
{
XElement SurnameElement = new XElement("surname", surName);
Author.Add(SurnameElement);
}
if (!string.IsNullOrEmpty(givenName))
{
XElement GivenNameElement = new XElement("given-name", givenName);
Author.Add(GivenNameElement);
}
authors.Add(Author);
}
return authors;
}
If you want to create an XElement in each iteration, you can put them all in an ArrayList and return that.
I believe that instead of returning a single XElement from the function you should return a List of XElement.
So you can write something like this:
private List<XElement> AuthorSeparate(List<string> authorName)
{
string surName = string.Empty;
string initalName = string.Empty;
string givenName = string.Empty;
int j = 1;
var AuthorList = new List<XElement>();
for (int i = 0; i < authorName.Count; i++)
{
XElement Author = new XElement("author");
Author.Add(new XAttribute("Seq", j));
else
{
char[] initalArray = splitedName[0].ToCharArray();
initalName = initalArray[0] + '.'.ToString();
surName = splitedName.LastOrDefault();
splitedName = splitedName.Reverse().Skip(1).Reverse().ToArray();
givenName = string.Join(" ", splitedName);
}
if (!string.IsNullOrEmpty(initalName))
{
XElement InitalElement = new XElement("initials", initalName);
Author.Add(InitalElement);
}
if (!string.IsNullOrEmpty(surName))
{
XElement SurnameElement = new XElement("surname", surName);
Author.Add(SurnameElement);
}
if (!string.IsNullOrEmpty(givenName))
{
XElement GivenNameElement = new XElement("given-name", givenName);
Author.Add(GivenNameElement);
}
AuthorList.Add(Author);
}
return AuthorList;
}
It will be better to return the list in the way above because here you will get a new XElement for every iteration of the loop.
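A compact, self-contained sketch of the same idea follows. The name-splitting code is not shown in the question, so this assumes each name is space-separated with the surname last; numbering Seq from the loop index is also an assumption, and the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AuthorXml
{
    // Assumption: each name is space-separated with the surname last,
    // e.g. "John Ronald Tolkien". The original splitting code is not shown.
    public static List<XElement> AuthorSeparate(List<string> authorNames)
    {
        var authors = new List<XElement>();
        for (int i = 0; i < authorNames.Count; i++)
        {
            string[] parts = authorNames[i].Split(
                new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length == 0) continue; // skip blank entries

            // Declared inside the loop: one fresh element per author.
            var author = new XElement("author", new XAttribute("Seq", i + 1));
            author.Add(new XElement("initials", parts[0][0] + "."));
            author.Add(new XElement("surname", parts[parts.Length - 1]));
            if (parts.Length > 1)
            {
                // Everything except the surname.
                author.Add(new XElement("given-name",
                    string.Join(" ", parts.Take(parts.Length - 1))));
            }
            authors.Add(author);
        }
        // The list is declared outside the loop, so it is in scope here.
        return authors;
    }
}
```

Because the per-author values live inside the loop body, nothing from one author can leak into the next iteration.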

Split values from a string

I have XML data in a string. I want to split it and display the result in a Label.
Here is my code:
string param = "<HCToolParameters><BatchId>12</BatchId><HCUser>Admin</HCUser></HCToolParameters>";
var a = param.Split(new string[] { "<HCToolParameters>" }, StringSplitOptions.RemoveEmptyEntries);
var b = param.Split(new string[] { "<BatchId>12</BatchId>" }, StringSplitOptions.RemoveEmptyEntries);
var c = param.Split(new string[] { "<HCUser>Admin</HCUser>" }, StringSplitOptions.RemoveEmptyEntries);
var d = param.Split(new string[] { "</HCToolParameters>" }, StringSplitOptions.RemoveEmptyEntries);
Example:
String value =
<HCToolParameters><BatchId>12</BatchId><HCUser>Admin</HCUser></HCToolParameters>
Expected Result:
<HCToolParameters>
<BatchId>12</BatchId>
<HCUser>Admin</HCUser>
</HCToolParameters>
From what I see, you start with valid XML, so stop splitting it and use an XML parser!
string param = @"<HCToolParameters><BatchId>12</BatchId><HCUser>Admin</HCUser></HCToolParameters>";
XDocument doc = XDocument.Parse(param);
Console.WriteLine(doc.ToString());
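If the goal is to show the individual values in the label rather than the formatted markup, the parsed document can be queried directly. This is a minimal sketch; the class name, method name, and output format are hypothetical.

```csharp
using System.Xml.Linq;

public static class HcParams
{
    // Hypothetical helper: reads the two child values out of the XML string.
    public static string Describe(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        // The (string) cast returns null instead of throwing
        // when the element is missing.
        string batchId = (string)root.Element("BatchId");
        string user = (string)root.Element("HCUser");
        return "BatchId: " + batchId + ", HCUser: " + user;
    }
}
```

The returned string can then be assigned to the label's Text property.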
You could also do it simply like this:
value = value.Replace("><", ">" + Environment.NewLine + "<");
This works for your example and is easy. If you need the result as an array (I don't really know why you would do it this way):
var array = value.Replace("><", ">#<").Split('#');
You can use XmlTextWriter.Formatting = Formatting.Indented, because from what I see you want to format your XML string. This function might do the trick for you:
public static String FormatMyXML(String SomeXML)
{
String Result = "";
MemoryStream mStream = new MemoryStream();
XmlTextWriter wrtr = new XmlTextWriter(mStream, Encoding.Unicode);
XmlDocument document = new XmlDocument();
try
{
document.LoadXml(SomeXML);
wrtr.Formatting = Formatting.Indented;
document.WriteContentTo(wrtr);
wrtr.Flush();
mStream.Flush();
mStream.Position = 0;
StreamReader sReader = new StreamReader(mStream);
String FormattedXML = sReader.ReadToEnd();
Result = FormattedXML;
}
catch (XmlException)
{
}
mStream.Close();
wrtr.Close();
return Result;
}

remove html tags from string using htmlagilitypack

I wonder how I could remove the HTML tags listed below using HtmlAgilityPack?
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(Description);
//markups to be removed
var markups = new List<string> { "br","ol","ul","li" };
thanks
you can use this method
public static string RemoveHTMLTags(string content)
{
var cleaned = string.Empty;
try
{
string textOnly = string.Empty;
Regex tagRemove = new Regex(@"<[^>]*(>|$)");
Regex compressSpaces = new Regex(@"[\s\r\n]+");
textOnly = tagRemove.Replace(content, string.Empty);
textOnly = compressSpaces.Replace(textOnly, " ");
cleaned = textOnly;
}
catch
{
//A tag is probably not closed. fallback to regex string clean.
}
return cleaned;
}
//markups to be removed
var markups = new List<string> { "br", "ol", "ul", "li" };
var xpath = String.Join(" | ", markups.Select(x => "//" + x));
var nodes = htmlDoc.DocumentNode.SelectNodes(xpath);
if (nodes != null)
{
foreach (var node in nodes)
{
node.Remove();
}
}

I want to recursively list the nodes in xml

I want to display XML nodes recursively, but it doesn't work: the output is only the first element of the XML file. Why?
public string GetOutline(int indentLevel, XmlNode xnod)
{
StringBuilder result = new StringBuilder();
XmlNode xnodWorking;
result = result.AppendLine(new string('-', indentLevel * 2) + xnod.Name);
if (xnod.NodeType == XmlNodeType.Element)
{
if (xnod.HasChildNodes)
{
xnodWorking = xnod.FirstChild;
while (xnodWorking != null)
{
GetOutline(indentLevel + 1, xnodWorking);
xnodWorking = xnodWorking.NextSibling;
}
}
}
return result.ToString();
}
Here is the code calling the function. The XML file begins with <Videos> then <Video>... etc...
private void button2_Click(object sender, EventArgs e)
{
SaveFileDialog fDialog = new SaveFileDialog();
fDialog.Title = "Save XML File";
fDialog.FileName = "drzewo.xml";
fDialog.CheckFileExists = false;
fDialog.InitialDirectory = @"C:\Users\Piotrek\Desktop";
if (fDialog.ShowDialog() == DialogResult.OK)
{
using (var newXmlFile = File.Create(fDialog.FileName));
{
string xmlTree = fDialog.FileName.ToString();
XmlDocument xdoc = new XmlDocument();
xdoc.Load(XML);
XmlNode xnodDE = xdoc.DocumentElement;
textBox2.Text = GetOutline(0, xnodDE);
//StringBuilder result = new StringBuilder();
/*
foreach (var childelement in xdoc.DescendantNodes().OfType<XElement>()
.Select(x => x.Name).Distinct())
{
result.Append(childelement + Environment.NewLine );
}
textBox2.Text = result.ToString();
*/
using (StreamWriter sw = File.AppendText(xmlTree))
{
sw.Write(textBox2.Text);
}
}
}
XML content :
<Videos>
<Video>
<Title>The Distinguished Gentleman</Title>
<Director>Jonathan Lynn</Director>
<Actors>
<Actor>Eddie Murphy</Actor>
<Actor>Lane Smith</Actor>
<Actor>Sheryl Lee Ralph</Actor>
<Actor>Joe Don Baker</Actor>
</Actors>
<Length>112 Minutes</Length>
<Format>DVD</Format>
<Rating>R</Rating>
</Video>
<Video>
<Title>Her Alibi</Title>
<Director>Bruce Beresford</Director>
<Length>94 Mins</Length>
<Format>DVD</Format>
<Rating>PG-13</Rating>
</Video>
</Videos>
You need to read the whole document node by node with a foreach or a while loop:
XmlReader reader = XmlReader.Create(your xml file);
reader.MoveToContent();
while (reader.Read())
{
// your code
}
reader.Close();
This is not the best way; also have a look at LINQ to XML.
Try this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace testStackOverflow
{
class Program
{
static void Main(string[] args)
{
//Load xml
XDocument xdoc = XDocument.Load("test.xml");
//Run query
var lv1s = from lv1 in xdoc.Descendants("Video")
select new
{
title = lv1.Element("Title").Value
};
//Loop through results
foreach (var lv1 in lv1s)
{
Console.WriteLine(lv1.title);
}
Console.ReadLine();
}
}
}
You're not doing anything to add the results of the recursive calls to the string you're building. You need to do this:
result.Append(GetOutline(indentLevel + 1, xnodWorking));
And this modification should avoid the text nodes and nodes with the same name:
public string GetOutline(int indentLevel, XmlNode xnod)
{
StringBuilder result = new StringBuilder();
XmlNode xnodWorking;
result = result.AppendLine(new string('-', indentLevel * 2) + xnod.Name);
if (xnod.HasChildNodes)
{
List<string> foundElements = new List<string>();
xnodWorking = xnod.FirstChild;
while (xnodWorking != null)
{
if (xnodWorking.NodeType == XmlNodeType.Element && !foundElements.Contains(xnodWorking.Name))
{
result.Append(GetOutline(indentLevel + 1, xnodWorking));
foundElements.Add(xnodWorking.Name);
}
}
xnodWorking = xnodWorking.NextSibling;
}
}
return result.ToString();
}
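The same outline can be sketched with LINQ to XML, which the other answer suggests looking at. This is an illustrative variant, not the original poster's code: the class name is hypothetical, and it mirrors the behaviour above by skipping text nodes and visiting only the first child of each name.

```csharp
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class XmlOutline
{
    // Hypothetical LINQ to XML variant of GetOutline.
    public static string GetOutline(XElement element, int indentLevel = 0)
    {
        var result = new StringBuilder();
        result.AppendLine(new string('-', indentLevel * 2) + element.Name.LocalName);

        // Elements() returns child elements only, so text nodes are skipped.
        // GroupBy keeps the first child for each distinct element name.
        foreach (XElement child in element.Elements()
                                          .GroupBy(e => e.Name)
                                          .Select(g => g.First()))
        {
            // Append the recursive result; discarding it was the original bug.
            result.Append(GetOutline(child, indentLevel + 1));
        }
        return result.ToString();
    }
}
```

It would be called as XmlOutline.GetOutline(XDocument.Load(path).Root), where path is the XML file to outline.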
