Using Regexp to get information in a KeyValuePair - c#

Help me to parse this message:
text=&direction=re&orfo=rus&files_id=&message=48l16qL2&old_charset=utf-8&template_id=&HTMLMessage=1&draft_msg=&re_msg=&fwd_msg=&RealName=0&To=john+%3Cjohn11%40gmail.com%3E&CC=&BCC=&Subject=TestSubject&Body=%3Cp%3EHello+%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82+%D1%82%D0%B5%D0%BA%D1%81%D1%82%3Cbr%3E%3Cbr%3E%3C%2Fp%3E&secur
I would like to get the information as KeyValuePairs:
Key - Value
text -
direction - re
and so on.
Also, how do I decode this: Hello+%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82+%D1%82%D0%B5%D0%BA%D1%81%...
It contains Cyrillic characters.
Thanks.

If you want to use a Regex, you can do it like this:
// Only the first 3 keys are shown; the others follow the same pattern.
// [^&]* stops at the next '&', so a value cannot swallow the keys after it.
Regex r = new Regex(@"text=(?<text>[^&]*)&direction=(?<direction>[^&]*)&orfo=(?<orfo>[^&]*)");
Match m = r.Match(inputText);
if (m.Success)
{
    var text = m.Groups["text"].Value; // result is ""
    var direction = m.Groups["direction"].Value; // re
    var orfo = m.Groups["orfo"].Value; // rus
}
However, the method suggested by BoltClock is much better:
// The values come back URL-decoded, which also handles the Cyrillic text.
System.Collections.Specialized.NameValueCollection collection =
    System.Web.HttpUtility.ParseQueryString(inputText);
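As a rough illustration of the ParseQueryString approach, the sketch below parses a shortened version of the message and copies the pairs into a dictionary. The shortened input and the variable names are assumptions made for the example, not part of the original answer. By default the values are decoded as UTF-8, which is what turns the percent-encoded Cyrillic back into text.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Web;

// Shortened sample of the message from the question (assumed for illustration).
string inputText = "text=&direction=re&orfo=rus&To=john+%3Cjohn11%40gmail.com%3E&Body=Hello+%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82";

// ParseQueryString splits on '&' and '=' and URL-decodes keys and values.
NameValueCollection collection = HttpUtility.ParseQueryString(inputText);

// Copy into a dictionary; this assumes every key appears only once.
var pairs = new Dictionary<string, string>();
foreach (var key in collection.AllKeys)
{
    if (key == null) continue; // skip entries that have no key
    pairs[key] = collection[key] ?? string.Empty;
}

foreach (KeyValuePair<string, string> pair in pairs)
{
    Console.WriteLine($"{pair.Key} - {pair.Value}");
}
```

A Dictionary is used here only because the question asks for key/value pairs; the NameValueCollection can also be read directly with collection["direction"].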

It looks like you are dealing with a URI, so it is better to use the proper class than to work out the detailed processing yourself.
http://msdn.microsoft.com/en-us/library/system.uri.aspx

Related

Regular Expression C#, HTML parse

Please help.
I have some text from an HTML page, and I need to parse it.
Text:
converter.rates =
{"3":{"USD":{"buy":27.950001,"sell":28.190001},"EUR":{"buy":32.049999,"sell":32.689999}},"8":{"RUB":{"buy":0.27,"sell":0.43},"USD":{"buy":27.799999,"sell":28.200001},"EUR":{"buy":31.700001,"sell":32.549999}},"41":{"USD":{"buy":28.0,"sell":28.200001},"EUR":{"buy":31.950001,"sell":32.650002}},"46":{"RUB":{"buy":0.413,"sell":0.443},"USD":{"buy":28.0,"sell":28.25},"EUR":{"buy":31.73,"sell":32.73}},"47":{"RUB":{"buy":0.41,"sell":0.448},"USD":{"buy":27.98,"sell":28.15},"EUR":{"buy":31.889999,"sell":32.540001}},"48":{"RUB":{"buy":0.4,"sell":0.43},"USD":{"buy":28.0,"sell":28.200001},"EUR":{"buy":32.099998,"sell":32.490002}},"52":{"RUB":{"buy":0.41,"sell":0.43},"USD":{"buy":27.950001,"sell":28.25},"EUR":{"buy":32.0,"sell":32.5}},"77":{"RUB":{"buy":0.38,"sell":0.43},"USD":{"buy":28.049999,"sell":28.200001},"EUR":{"buy":32.049999,"sell":32.5}},"79":{"RUB":{"buy":0.412,"sell":0.444},"USD":{"buy":27.950001,"sell":28.799999},"EUR":{"buy":31.959999,"sell":33.099998}},"80":{"RUB":{"buy":0.38,"sell":0.43},"USD":{"buy":28.030001,"sell":28.190001},"EUR":{"buy":32.0,"sell":32.450001}},"70":{"RUB":{"buy":0.39,"sell":0.42},"USD":{"buy":28.0,"sell":28.25},"EUR":{"buy":32.0,"sell":32.200001}},"1":{"RUB":{"buy":0.42658,"sell":0.42658},"USD":{"buy":28.036648,"sell":28.036648},"EUR":{"buy":32.256161,"sell":32.256161}},"4":{"RUB":{"buy":0.42,"sell":0.43},"USD":{"buy":27.950001,"sell":28.25},"EUR":{"buy":32.150002,"sell":32.599998}},"10":{"RUB":{"buy":0.414,"sell":0.435},"USD":{"buy":28.0,"sell":28.200001},"EUR":{"buy":32.0,"sell":32.599998}},"13":{"RUB":{"buy":0.275,"sell":0.46},"USD":{"buy":27.9,"sell":28.200001},"EUR":{"buy":31.67,"sell":32.599998}},"15":{"RUB":{"buy":0.3749,"sell":0.4395},"USD":{"buy":27.985001,"sell":28.2075},"EUR":{"buy":32.036366,"sell":32.529091}},"31":{"RUB":{"buy":0.275,"sell":0.42},"USD":{"buy":27.9,"sell":28.139999},"EUR":{"buy":31.799999,"sell":32.400002}},"32":{"RUB":{"buy":0.42,"sell":0.5},"USD":{"buy":28.07,"sell":28.299999},"EUR":{"buy":32.150002,"sell":32.5999
98}},"39":{"USD":{"buy":28.07,"sell":28.25},"EUR":{"buy":32.150002,"sell":32.549999}},"40":{"RUB":{"buy":0.41,"sell":0.43},"USD":{"buy":27.950001,"sell":28.139999},"EUR":{"buy":32.049999,"sell":32.400002}},"64":{"RUB":{"buy":0.4,"sell":0.425},"USD":{"buy":27.9,"sell":28.200001},"EUR":{"buy":32.099998,"sell":32.599998}},"73":{"RUB":{"buy":0.4,"sell":0.43},"USD":{"buy":28.0,"sell":28.299999},"EUR":{"buy":32.0,"sell":32.549999}},"74":{"RUB":{"buy":0.41,"sell":0.435},"USD":{"buy":28.049999,"sell":28.25},"EUR":{"buy":31.799999,"sell":32.5}},"85":{"RUB":{"buy":0.3,"sell":0.43},"USD":{"buy":28.0,"sell":28.200001},"EUR":{"buy":32.099998,"sell":32.52}},"86":{"RUB":{"buy":0.37,"sell":0.42},"USD":{"buy":28.0,"sell":28.200001},"EUR":{"buy":32.0,"sell":32.799999}},"88":{"RUB":{"buy":0.35,"sell":0.5},"USD":{"buy":28.0,"sell":28.15},"EUR":{"buy":32.099998,"sell":32.450001}},"90":{"RUB":{"buy":4.0,"sell":4.4},"USD":{"buy":28.0,"sell":28.15},"EUR":{"buy":31.950001,"sell":32.450001}}}
I need the following information from it:
code of bank - "3"
and USD rate - 27.950001, 28.190001
My expression:
@"(\d+)":..USD....\w+..(\d+.\d+)........(\d+.\d+)"
But it didn't work, because USD does not always come first after the bank code.
This is a JSON document. JSON is a recursive format, and regular expressions are notoriously hard to use when parsing recursive data.
Please use a dedicated parser, such as Newtonsoft Json.NET:
var rawData = @"converter.rates = { ... }"; // original string
var rawJson = rawData.Substring("converter.rates = ".Length); // remove the prefix
var json = JObject.Parse(rawJson); // convert to a JSON data structure
Then you can use it like a dictionary:
foreach (var codeEntry in json)
{
    // Each bank code maps to an object keyed by currency.
    foreach (var currencyEntry in (JObject)codeEntry.Value)
    {
        var code = codeEntry.Key;
        var currency = currencyEntry.Key;
        var buy = currencyEntry.Value["buy"].Value<double>();
        var sell = currencyEntry.Value["sell"].Value<double>();
        Console.WriteLine($"code of bank - {code} and {currency} rate - {buy}, {sell}");
    }
}
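For readers who would rather avoid an extra package, here is a minimal sketch of the same idea that swaps in System.Text.Json (built into .NET Core 3.0 and later) in place of the Newtonsoft library used in the answer. The shortened sample data and the usdRates name are assumptions for illustration. It keeps only the USD entry for each bank code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Shortened sample of the page text (assumed for illustration).
string rawData = @"converter.rates = {""3"":{""USD"":{""buy"":27.950001,""sell"":28.190001},""EUR"":{""buy"":32.049999,""sell"":32.689999}},""8"":{""RUB"":{""buy"":0.27,""sell"":0.43},""USD"":{""buy"":27.799999,""sell"":28.200001}}}";

// Everything from the first '{' onward is the JSON document.
string rawJson = rawData.Substring(rawData.IndexOf('{'));

var usdRates = new Dictionary<string, (double Buy, double Sell)>();
using (JsonDocument doc = JsonDocument.Parse(rawJson))
{
    foreach (JsonProperty bank in doc.RootElement.EnumerateObject())
    {
        // Look USD up by name, so its position inside the bank entry does not matter.
        if (bank.Value.TryGetProperty("USD", out JsonElement usd))
        {
            usdRates[bank.Name] = (usd.GetProperty("buy").GetDouble(),
                                   usd.GetProperty("sell").GetDouble());
        }
    }
}

foreach (var entry in usdRates)
{
    Console.WriteLine($"code of bank - {entry.Key}, USD rate - {entry.Value.Buy}, {entry.Value.Sell}");
}
```

Because the lookup is by property name, banks that list RUB or EUR before USD are handled the same way, and banks without a USD entry are simply skipped.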
If you still want to use regex, this can do it:
@"""(?<code>\d+)"":\{.*?(?<=""USD""):\{""buy"":(?<buy>\d+\.\d+),""sell"":(?<sell>\d+\.\d+)\}"
It is built from your example. It creates three named groups: 'code', 'buy' and 'sell'. Other than that it matches literal characters, using only a lookbehind, '(?<=""USD"")', to find 'USD' and get the wanted rates.
Edit:
If you have an HTML document and want to grab the 'converter.rates' variable as text, you can use this regex:
@"converter\.rates\s?=.*\}\}\}"
It looks for the 3 '}' characters that end the string.
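The USD pattern from this answer (with the dot in the sell group escaped) could be applied with Regex.Matches roughly as follows. The shortened sample string and the results list are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Shortened sample of the rates object (assumed for illustration).
string sample = @"{""3"":{""USD"":{""buy"":27.950001,""sell"":28.190001},""EUR"":{""buy"":32.049999,""sell"":32.689999}},""8"":{""RUB"":{""buy"":0.27,""sell"":0.43},""USD"":{""buy"":27.799999,""sell"":28.200001}}}";

// Named groups: bank code, USD buy rate, USD sell rate.
var usdPattern = new Regex(
    @"""(?<code>\d+)"":\{.*?(?<=""USD""):\{""buy"":(?<buy>\d+\.\d+),""sell"":(?<sell>\d+\.\d+)\}");

var results = new List<string>();
foreach (Match match in usdPattern.Matches(sample))
{
    // Each match yields one bank code with its USD buy and sell values.
    results.Add($"{match.Groups["code"].Value}: {match.Groups["buy"].Value} / {match.Groups["sell"].Value}");
}

foreach (string line in results)
{
    Console.WriteLine(line);
}
```

One caveat with this pattern: if a bank entry has no USD block, the lazy `.*?` keeps scanning into the next bank and pairs the wrong code with that bank's rates, which is one more reason to prefer a JSON parser.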

Parse Line and Break it into Variables

I have a text file that contains only the full version number of an application. I need to extract it and parse it into separate variables.
For example, let's say version.cs contains 19.1.354.6
The code I'm using does not seem to be working:
char[] delimiter = { '.' };
string currentVersion = System.IO.File.ReadAllText(@"C:\Applicaion\version.cs");
string[] partsVersion;
partsVersion = currentVersion.Split(delimiter);
string majorVersion = partsVersion[0];
string minorVersion = partsVersion[1];
string buildVersion = partsVersion[2];
string revisVersion = partsVersion[3];
Although your problem is most likely with the file (it probably contains text other than the version), why not use the Version class, which is made for exactly this kind of task?
var version = new Version("19.1.354.6");
var major = version.Major; // etc..
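A slightly fuller sketch of the Version approach, assuming the file content ends with a newline. The fileText string stands in for the result of File.ReadAllText and is an assumption for the example.

```csharp
using System;

// Stand-in for File.ReadAllText(...); the trailing newline is an assumption.
string fileText = "19.1.354.6\r\n";

// Trim first to drop surrounding whitespace. TryParse returns false
// instead of throwing when the text is not a valid version.
bool parsed = Version.TryParse(fileText.Trim(), out var version);

if (parsed)
{
    Console.WriteLine($"Major: {version.Major}");
    Console.WriteLine($"Minor: {version.Minor}");
    Console.WriteLine($"Build: {version.Build}");
    Console.WriteLine($"Revision: {version.Revision}");
}
else
{
    Console.WriteLine("The file does not contain a valid version number.");
}
```

The else branch is where unexpected file content would show up, which is the likely cause of the original problem.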
What you have works fine with the correct input, so I would suggest making sure there is nothing else in the file you're reading.
In the future, please provide error information, since we can't usually tell exactly what you expect to happen, only what we know should happen.
In light of that, I would also suggest looking into using Regex for parsing in the future. In my opinion, it provides a much more flexible solution for your needs. Here's an example of regex to use:
var regex = new Regex(@"([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)");
var match = regex.Match("19.1.354.6");
if (match.Success)
{
    Console.WriteLine("Match[1]: " + match.Groups[1].Value);
    Console.WriteLine("Match[2]: " + match.Groups[2].Value);
    Console.WriteLine("Match[3]: " + match.Groups[3].Value);
    Console.WriteLine("Match[4]: " + match.Groups[4].Value);
}
else
{
    Console.WriteLine("No match found");
}
which outputs the following:
// Match[1]: 19
// Match[2]: 1
// Match[3]: 354
// Match[4]: 6

C# - Searching strings

I can't seem to find a good solution to this issue. I've got an array of strings that are fed in from a report that I receive about lost or stolen equipment. I've been using the string.IndexOf function through the rest of the form and it works quite well. The issue is with the field that says whether the device was lost or stolen.
Example:
"Lost or Stolen? Lost"
"Lost or Stolen? Stolen"
I need to be able to read this, but when I do string.IndexOf(@"Lost") it will always find "Lost" because it's in the question.
Unfortunately I'm not able to change the form itself in any way, and due to the nature of how it's submitted I can't just write code that knocks the first 15 or so characters off the string, because that may be too few in some cases.
I would really like something in C# that would allow me to continue to search a string after the first result is found so that the logic would look like:
string my_string = "Lost or Stolen? Stolen";
searchFor(@"Stolen" in my_string)
{
    Found Stolen;
    Does it have "or " in front of it? yes;
    ignore and keep searching;
    Found Stolen again;
    return "Equipment stolen";
}
Couple of options here. You could look for the last index of a space and take the rest of the string:
string input = "Lost or Stolen? Stolen";
int lastSpaceIndex = input.LastIndexOf(' ');
string result = input.Substring(lastSpaceIndex + 1);
Console.WriteLine(result);
Or you could split it and take the last word:
string input = "Lost or Stolen? Lost";
string result = input.Split(' ').Last();
Console.WriteLine(result);
Regex is also an option, but overkill given the simpler solutions above. A nice shortcut that fits this scenario is to use the RegexOptions.RightToLeft option to get the first match starting from the right:
string result = Regex.Match(input, @"\w+", RegexOptions.RightToLeft).Value;
If I understand your requirement, you're looking for an instance of Lost or Stolen after a ?:
var q = myString.IndexOf("?");
var lost = q >= 0 && myString.IndexOf("Lost", q) > 0;
var stolen = q >= 0 && myString.IndexOf("Stolen", q) > 0;
// or
var lost = myString.LastIndexOf("Lost") > myString.IndexOf("?");
var stolen = myString.LastIndexOf("Stolen") > myString.IndexOf("?");
// don't forget
var neither = !lost && !stolen;
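The idea of only looking at the text after the question mark can be wrapped in a small helper. This is a sketch under the assumption that the answer is the only text following the '?'. GetStatus is a hypothetical name introduced for the example.

```csharp
using System;

// Hypothetical helper: classifies a report line by the text after the '?'.
string GetStatus(string line)
{
    int questionIndex = line.IndexOf('?');
    if (questionIndex < 0)
    {
        return "Unknown"; // no question mark, so there is nothing to inspect
    }

    // Take everything after the '?' and drop surrounding whitespace.
    string answer = line.Substring(questionIndex + 1).Trim();

    if (answer.Equals("Lost", StringComparison.OrdinalIgnoreCase)) return "Lost";
    if (answer.Equals("Stolen", StringComparison.OrdinalIgnoreCase)) return "Stolen";
    return "Unknown";
}

Console.WriteLine(GetStatus("Lost or Stolen? Lost"));
Console.WriteLine(GetStatus("Lost or Stolen? Stolen"));
```

Comparing the trimmed answer as a whole, instead of searching for a word inside it, avoids matching the words in the question itself.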
You can look for the string 'Lost' and if it occurs twice, then you can confirm it is 'Lost'.
It's possible in this case to use IndexOf and a substring, knowing that the string always says "Lost or Stolen?" first.
You cut off the question, then match your keyword against the remaining string.
Something like:
int questionIndex = inputValue.IndexOf("?");
string toMatch = inputValue.Substring(questionIndex + 1).Trim(); // text after the '?'
if (toMatch == "Lost")
If it works for your use case, it might be easier to use .EndsWith().
bool lost = my_string.EndsWith("Lost");

Need help about Regular Expression Syntax

I am trying to extract only this part (after the "j&q") from the link
(http://www.google.com/aclk?sa=lai=CEvAD5thCTfHPCIq5gwe2lOWKD6n_uOIB4bzDkxm8uIhRCAAQASDrxZ0GKANQgI6s1ANgybblirSk2A-gAYem9NwDyAEBqQLN5n97JxulPqoEGk_QITE_eyPbZTKIyNFl8dQhptl05oxQ2fHjgAWQTg&sig=AGiWqtwLGY6f1Gnci0e0ojoRsLBxr9joLg&adurl=http://www.mediterraholidays.com/egypt/cairo-and-nile-cruise&rct=j&q=egpyt%20package%20trips).
I used ^.*q=.*$ but that matches the whole link. I need only the part after j&q, if the link has one.
Why don't you use the System.Uri class for this:
Uri url = new Uri("http://www.google.com/aclk?sa=lai=CEvAD5thCTfHPCIq5gwe2lOWKD6n_uOIB4bzDkxm8uIhRCAAQASDrxZ0GKANQgI6s1ANgybblirSk2A-gAYem9NwDyAEBqQLN5n97JxulPqoEGk_QITE_eyPbZTKIyNFl8dQhptl05oxQ2fHjgAWQTg&sig=AGiWqtwLGY6f1Gnci0e0ojoRsLBxr9joLg&adurl=http://www.mediterraholidays.com/egypt/cairo-and-nile-cruise&rct=j&q=egpyt%20package%20trips");
var queryString = HttpUtility.ParseQueryString(url.Query);
var q = queryString["q"];
The q variable holds the value: egpyt package trips
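Since the question asks for the value only when the link has one, the same approach can be wrapped so that a missing q parameter is reported instead of returning null. This is a sketch. TryGetSearchTerm and the shortened URLs are assumptions made for the example.

```csharp
using System;
using System.Web;

// Hypothetical helper: returns true and the decoded q value when the link has one.
bool TryGetSearchTerm(string link, out string term)
{
    term = string.Empty;

    // Reject strings that are not absolute URLs.
    if (!Uri.TryCreate(link, UriKind.Absolute, out var uri))
    {
        return false;
    }

    // The indexer returns null when the key is not present in the query.
    var value = HttpUtility.ParseQueryString(uri.Query)["q"];
    if (value == null)
    {
        return false;
    }

    term = value;
    return true;
}

bool hasTerm = TryGetSearchTerm("http://www.google.com/aclk?sa=l&rct=j&q=egpyt%20package%20trips", out string found);
Console.WriteLine(hasTerm ? found : "(no q parameter)");
```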
&q=(?<data>[^&]*)
The answer needs to be at least 30 chars, so I add some joke:
“Knock, knock.”
“Who’s there?”
very long pause….
“Java.”

cutting from string in C#

My strings look like this: aaa/b/cc/dd/ee. I want to cut off the first part, without the /. How can I do it? I have many strings and they don't all have the same length. I tried to use Substring(), but what about the /?
I want to add 'aaa' to the first TreeNode, 'b' to the second, and so on. I know how to add something to a TreeView, but I don't know how to get these parts.
Maybe the Split() method is what you're after?
string value = "aaa/b/cc/dd/ee";
string[] collection = value.Split('/');
Identifies the substrings in this instance that are delimited by one or more characters specified in an array, then places the substrings into a String array.
Based on your updates related to a TreeView (ASP.Net? WinForms?) you can do this:
foreach (string text in collection)
{
    TreeNode node = new TreeNode(text);
    myTreeView.Nodes.Add(node);
}
Use Substring and IndexOf to find the location of the first /
To get the first part:
// from memory, need to test :)
string output = inputString.Substring(0, inputString.IndexOf("/"));
To just cut the first part:
// from memory, need to test :)
// Starts at the first '/'; add 1 to the index to drop the slash as well.
string output = inputString.Substring(inputString.IndexOf("/"));
You would probably want to do:
string[] parts = "aaa/b/cc/dd/ee".Split(new char[] { '/' });
Sounds like this is a job for... Regular Expressions!
One way to do it is by using string.Split to split your string into an array, and then string.Join to make whatever parts of the array you want into a new string.
For example:
var parts = input.Split('/');
var processedInput = string.Join("/", parts.Skip(1));
This is a general approach. If you only need to do very specific processing, you can be more efficient with string.IndexOf, for example:
var processedInput = input.Substring(input.IndexOf('/') + 1);
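Putting the Split and Join suggestions together, a minimal console sketch might look like the following. The indentation only mimics the nesting a TreeView would show, and the variable names are assumptions for the example.

```csharp
using System;
using System.Linq;

string input = "aaa/b/cc/dd/ee";

// Split on '/' to get every part in order.
string[] parts = input.Split('/');

// The first part on its own, and the remainder joined back together.
string firstPart = parts[0];
string rest = string.Join("/", parts.Skip(1));

Console.WriteLine($"first: {firstPart}");
Console.WriteLine($"rest: {rest}");

// Print each part one level deeper than the previous one.
for (int depth = 0; depth < parts.Length; depth++)
{
    Console.WriteLine(new string(' ', depth * 2) + parts[depth]);
}
```

In a real TreeView, the loop body would add each part as a child of the node created in the previous iteration instead of printing it.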
