I'm converting an Objective-C application to C# and am having trouble with this one:
What is the C# way to do this:
NSArray *charts = [xmlString componentsSeparatedByString:@"</record>"];
string[] charts = xmlString.Split(new string[] { "</record>" }, StringSplitOptions.None);
I misread the original question (or rather, comment) but I would strongly recommend that if you have some XML, you don't just split it by tag names - you parse it as XML, and then work with the parsed document. That will be much more reliable than using plain string operations.
For example, if you want to get the text within each <record> element you might use:
XDocument doc = XDocument.Parse(text);
List<string> records = doc.Descendants("record")
                          .Select(x => x.Value)
                          .ToList();
Treating XML as a plain string is almost always a bad idea.
Related
So I fetch a string from a website via code from another question I posted here. This works really well when I put it into a rich textbox, but now I need to split the string into separate sentences in a list/array (I suppose a list will be easier, since you don't need to determine how long the input is going to be).
Yesterday I found the following code at another question (didn't note the question, sorry):
List<string> list = new List<string>(Regex.Split(lyrics, Environment.NewLine));
But the input is now splitting into only two parts: the first three sentences and the rest.
I retrieve the text from musixmatch.com with the following code (added fixed url for simplicity):
var source = "https://www.musixmatch.com/lyrics/Krewella/Alive";
var htmlWeb = new HtmlWeb();
var documentNode = htmlWeb.Load(source).DocumentNode;
var findclasses = documentNode
.Descendants("p")
.Where(d => d.Attributes["class"]?.Value.Contains("mxm-lyrics__content") == true);
var text = string.Join(Environment.NewLine, findclasses.Select(x => x.InnerText));
More information about this code can be found here. In a nutshell, it retrieves the specific HTML that has the lyrics in it. I need to split the lyrics line by line for a synchronization process that I'm building (like the one that was built into Spotify a while ago). I need something (preferably a list/array) that I can index, because that would make the database that stores all this data a bit smaller. What am I supposed to use for this process?
Edit:
Answer to the mark of a possible duplicate:
C# Splitting retrieved string to list/array
You can split by both:
var lines = text.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
What I would do is ensure that the code has one common concept of "NewLine". It could be \r, \n or \r\n. Simply replace all '\n' with "", which leaves '\r' as the only separator (this assumes every line break in the input contains a '\r'). (Edited this one)
Now, all you have to do is
var lyricLines = lyricsWithCommonNewLine.Split('\r');
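If you can't be sure which line break style the site returns, splitting on all three variants at once avoids the replace step. This is only a minimal sketch: LyricsSplitter and SplitLines are names made up for the example, and trimming each line is my assumption about what the synchronization code wants.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LyricsSplitter
{
    // Splits on "\r\n", "\r" and "\n" alike and returns the non-empty lines.
    public static List<string> SplitLines(string text)
    {
        // "\r\n" is listed first so a Windows line break is treated as one separator.
        var separators = new[] { "\r\n", "\r", "\n" };

        return text
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim())        // assumption: surrounding whitespace is not wanted
            .Where(line => line.Length > 0)     // drop lines that were whitespace only
            .ToList();
    }
}
```

With the text variable from the question you would call LyricsSplitter.SplitLines(text) and index the resulting list directly.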
How to convert "ö" to "&ouml;" with C#?
I tried to convert it with the WebUtility.HtmlEncode and
HttpUtility.HtmlEncode methods, but they return "&#246;".
Thanks!
Per this site (https://code.google.com/p/doctype-mirror/wiki/OumlCharacterEntity) the &ouml; entity maps to the Unicode code point U+00F6, which is exactly the same character as &#246; (what .NET emits; 0xF6 is 246 in decimal). Basically what .NET gives you and what you are looking for are the same, then.
If you favor &ouml; semantically for some reason, you would have to create an array of each of the replacements you want to make. From there you can use string.Replace on your HTML. If memory or performance is an issue you will probably need to look into using a StringBuilder. The LINQ version of string.Replace looks something like:
var myHtml = "long string with ö";
var encodedString = HttpContext.Current.Server.HtmlEncode(myHtml);
var replaceValues = new [] { new KeyValuePair<string, string>("&#246;", "&ouml;") };
var result = replaceValues.Aggregate(encodedString, (current, value) =>
    current.Replace(value.Key, value.Value));
This is just pseudocode using LINQ and you may be able to optimize slightly but it gives you the basic idea. Best of luck!
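For what it's worth, here is roughly how that could look as a compilable helper outside of ASP.NET. Treat it as a sketch: EntityNames, ToNamedEntities and the three-entry table are made up for the example, and you would have to extend the table with whatever named entities you care about.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class EntityNames
{
    // Hypothetical lookup table: numeric character reference -> named entity.
    private static readonly Dictionary<string, string> Named = new Dictionary<string, string>
    {
        { "&#246;", "&ouml;" },
        { "&#228;", "&auml;" },
        { "&#252;", "&uuml;" },
    };

    // Replaces every numeric reference listed in the table with its named form.
    public static string ToNamedEntities(string encodedHtml)
    {
        return Named.Aggregate(encodedHtml,
            (current, pair) => current.Replace(pair.Key, pair.Value));
    }

    // Encodes with the framework first, then swaps in the named entities.
    public static string Encode(string raw)
    {
        return ToNamedEntities(WebUtility.HtmlEncode(raw));
    }
}
```

Browsers render both forms identically, so this only matters if something downstream insists on the named entity.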
We are trying to use urls for complex querying and filtering.
I managed to get some of the simpler parts working using expression trees and a mix of regex and string manipulation, but then we looked at a more complex example string:
var filterstring="(|(^(categoryid:eq:1,2,3,4)(categoryname:eq:condiments))(description:lk:”*and*”))";
I'd like to be able to parse this out into parts, but also allow it to be recursive. I'd like the output to look like:
item[0] (^(categoryid:eq:1,2,3,4)(categoryname:eq:condiments)
item[1] description:lk:”*and*”
From there I could strip down the item[0] part to get
categoryid:eq:1,2,3,4
categoryname:eq:condiments
At the minute I'm using regex and string operations to find the | or ^ so I know whether it's an AND or an OR. The regex matches brackets and works well for a single item; it's when we nest the values that I'm struggling.
The regex looks like
@"\((.*?)\)"
I need some way of using regex to match the nested brackets; any help would be appreciated.
You could transform the string into valid XML (just some simple replace, no validation):
var output = filterstring
.Replace("(","<node>")
.Replace(")","</node>")
.Replace("|","<andNode/>")
.Replace("^","<orNode/>");
Then, you could parse the XML nodes by using, for example, System.Xml.Linq.
XDocument doc = XDocument.Parse(output);
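Based on your comment, here's how you rearrange the XML in order to get the wrapping you need:
// ToList() takes a snapshot so the tree can be modified safely inside the loop.
foreach (var item in doc.Root.Descendants().ToList())
{
    if (item.Name == "orNode" || item.Name == "andNode")
    {
        // Move every following sibling inside the operator node.
        item.ElementsAfterSelf()
            .ToList()
            .ForEach(x =>
            {
                x.Remove();
                item.Add(x);
            });
    }
}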
Here's the resulting XML content:
<node>
  <andNode>
    <node>
      <orNode>
        <node>categoryid:eq:1,2,3,4</node>
        <node>categoryname:eq:condiments</node>
      </orNode>
    </node>
    <node>description:lk:”*and*”</node>
  </andNode>
</node>
I understand that you want the values specified in the filterstring.
My solution would be something like this:
NameValueCollection values = new NameValueCollection();
foreach (Match pair in Regex.Matches(filterstring, @"\((?<name>\w+):(?<operation>\w+):(?<value>[^)]*)\)"))
{
    if (pair.Groups["operation"].Value == "eq")
        values.Add(pair.Groups["name"].Value, pair.Groups["value"].Value);
}
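The regex matches each (name:operation:value) group and ignores everything else. Because of the eq check, only the eq conditions are stored, so description (which uses lk) is only added if you relax that check.
After this code has run you can get the values like this: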
values["categoryid"]
values["categoryname"]
values["description"]
I hope this will help you in your quest.
I think you should just write a proper parser for that. It would actually end up simpler and more extensible, and it would save you time and headaches in the future. You can use any existing parser generator such as Irony or ANTLR.
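For a grammar this small you may not even need a generator. Below is a minimal hand-written recursive descent sketch, assuming every node is either (operator followed by nested nodes) or (condition text without brackets). FilterNode and FilterParser are names invented for the example, and the parser does not decide which of | and ^ means AND or OR; it only records the character.

```csharp
using System;
using System.Collections.Generic;

public sealed class FilterNode
{
    public char? Operator;      // '|' or '^' for a group, null for a leaf
    public string Condition;    // e.g. "categoryid:eq:1,2,3,4" for a leaf
    public List<FilterNode> Children = new List<FilterNode>();
}

public static class FilterParser
{
    public static FilterNode Parse(string input)
    {
        int pos = 0;
        var node = ParseNode(input, ref pos);
        if (pos != input.Length)
            throw new FormatException("Unexpected text at position " + pos);
        return node;
    }

    // node := '(' ( operator node* | condition ) ')'
    private static FilterNode ParseNode(string s, ref int pos)
    {
        Expect(s, ref pos, '(');
        var node = new FilterNode();

        if (pos < s.Length && (s[pos] == '|' || s[pos] == '^'))
        {
            node.Operator = s[pos++];
            // Each nested bracket is parsed by a recursive call.
            while (pos < s.Length && s[pos] == '(')
                node.Children.Add(ParseNode(s, ref pos));
        }
        else
        {
            // Leaf: read everything up to the next bracket.
            int start = pos;
            while (pos < s.Length && s[pos] != '(' && s[pos] != ')') pos++;
            node.Condition = s.Substring(start, pos - start);
        }

        Expect(s, ref pos, ')');
        return node;
    }

    private static void Expect(string s, ref int pos, char c)
    {
        if (pos >= s.Length || s[pos] != c)
            throw new FormatException("Expected '" + c + "' at position " + pos);
        pos++;
    }
}
```

Calling FilterParser.Parse(filterstring) gives you a tree you can walk recursively to build the expression tree, and malformed input fails with a FormatException instead of a silent mismatch.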
I want to extract the last character of a string. Let me make it clear with an example. The following is the string I want to extract from:
<spara h-align="right" bgcolor="none" type="verse" id="1" pnum="1">
<line>
<emphasis type="italic">Approaches to Teaching and Learning</emphasis>
</line>
</spara>
In the above string I want to insert a space between the word "Learning" and "</emphasis>" if there is no space present.
Thanks,
Have a look at some of the Linq to XML examples on here instead of using Regex.
With Linq to XML you can do it as follows:
XDocument doc = XDocument.Load("xmlfilename");
foreach (var emphasis in doc.Descendants("emphasis"))
{
    // EndsWith also copes with an empty element, where Last() would throw.
    if (!emphasis.Value.EndsWith(" "))
        emphasis.Value += " ";
}
doc.Save("outputfilename");
Instead of files you may use streams, readers etc. in the Load call.
Something like the following perhaps?
Regex.Replace(yourString, @"(>[^<]+[^ ])<", @"$1 <");
The solution assumes a sentence is between > and < and is one or more characters long.
Is the sentence really inside XML, or have you extracted it using any of the many XML or DOM methods? For instance, using this:
foreach (XmlNode node in YourDOM.SelectNodes("//emphasis[@type='italic']"))
{
    string yourString = node.FirstChild.Value;
}
If so, if the string is on its own, you can do this instead, which is way simpler and safer:
Regex.Replace(yourString, "([^ ])$", "$1 ");
EDIT: I originally missed the "if there's no space present" part; the post above has been edited to account for it.
I am using Linq To XML to create XML that is sent to a third party. I am having difficulty understanding how to create the XML using Linq when part of information I want to send in the XML will be dynamic.
The dynamic part of the XML is held as a string[,] array. This multi dimensional array holds 2 values.
I can 'build' the dynamic XML up using a StringBuilder and store the values that were in the array in a string variable, but when I try to include this variable in Linq the variable is HTML-encoded rather than included as proper XML.
How would I go about adding in my dynamically built string to the XML being built up by Linq?
For Example:
//string below contains values passed into my class
string[,] AccessoriesSelected;
//I loop through the above array and build up my 'Tag' and store in string called AccessoriesXML
//simple linq to xml example with my AccessoriesXML value passed into it
XDocument RequestDoc = new XDocument(
new XElement("MainTag",
new XAttribute("Innervalue", "2")
),
AccessoriesXML);
'Tag' is an optional extra, it might appear in my XML multiple times or it might not - it's dependant on a user checking some checkboxes.
Right now when I run my code I see this:
<MainTag> blah blah </MainTag>
&lt;Tag&gt;&lt;InnerTag&gt; option1="valuefromarray0" option2="valuefromarray1" /&gt;&lt;Tag/&gt;
I want to return something like this:
<MainTag> blah blah </MainTag>
<Tag><InnerTag option1="valuefromarray0" option2="valuefromarray1" /></Tag>
<Tag><InnerTag option1="valuefromarray0" option2="valuefromarray1" /></Tag>
Any thoughts or suggestions? I can get this working using XmlDocument but I would like to get this working with Linq if it is possible.
Thanks for your help,
Rich
Building XElements with the ("name", "value") constructor will use the value text as literal text, and escape it if necessary to achieve that.
If you want to create the XElement programmatically from a snippet of XML text that should actually be interpreted as XML, use XElement.Parse() on the string, or XElement.Load() with a reader over it. Either one parses the text as actual XML instead of assigning it as an escaped literal value. Note that the snippet must have a single root element for this to work.
Try this:
XDocument RequestDoc = new XDocument(
new XElement("MainTag",
new XAttribute("Innervalue", "2")
),
XElement.Load(new StringReader(AccessoriesXML)));
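Another option is to skip the string entirely and build the elements straight from the array. The following is only a sketch under a few assumptions: each row of the string[,] holds the two option values, RequestBuilder is a made-up helper name, and "Request" is a hypothetical wrapper element that I added because an XDocument can only have one root element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class RequestBuilder
{
    // Assumes accessories[i, 0] and accessories[i, 1] hold option1 and option2 for row i.
    public static XDocument Build(string[,] accessories)
    {
        // One <Tag><InnerTag .../></Tag> per row; zero rows simply yields no Tag elements.
        var tags = Enumerable.Range(0, accessories.GetLength(0))
            .Select(i => new XElement("Tag",
                new XElement("InnerTag",
                    new XAttribute("option1", accessories[i, 0]),
                    new XAttribute("option2", accessories[i, 1]))));

        // "Request" is a hypothetical wrapper; rename it to whatever the third party expects.
        return new XDocument(
            new XElement("Request",
                new XElement("MainTag", new XAttribute("Innervalue", "2")),
                tags));
    }
}
```

Because the values are added as XAttribute objects, LINQ to XML does any escaping for you, and an empty array just leaves the optional Tag elements out.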