Get values inside quotes from a string - c#

I have a string something like this:
<BU Name="xyz" SerialNo="3838383" impression="jdhfl87lkjh8937ljk" />
I want to extract values like this:
Name = xyz
SerialNo = 3838383
impression = jdhfl87lkjh8937ljk
How to get these values in C#?
I am using C# 3.5.

If for some reason you don't want to use an XML parser, you can use a regular expression to achieve this.
Use this regular expression:
(\w)+=\"(\w)+\"
Use this regular expression like this:
var input = @"<BU Name=""xyz"" SerialNo=""3838383"" impression=""jdhfl87lkjh8937ljk"" />";
var pattern = @"(\w)+=\""(\w)+\""";
var result = Regex.Matches(input, pattern);
foreach (var match in result.Cast<Match>())
{
Console.WriteLine(match.Value);
}
Result:
//Name="xyz"
//SerialNo="3838383"
//impression="jdhfl87lkjh8937ljk"
//Press any key to continue.
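If an XML parser is acceptable after all, the same values can be read without a regular expression. The following is a minimal sketch, assuming the input is a single well-formed XML element and that LINQ to XML (System.Xml.Linq, available from .NET 3.5) can be referenced. The AttributeReader and ReadAttributes names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AttributeReader
{
    // Parses a single XML element and returns its attributes as name/value pairs.
    public static Dictionary<string, string> ReadAttributes(string xml)
    {
        return XElement.Parse(xml)
            .Attributes()
            .ToDictionary(a => a.Name.LocalName, a => a.Value);
    }

    public static void Main()
    {
        var input = @"<BU Name=""xyz"" SerialNo=""3838383"" impression=""jdhfl87lkjh8937ljk"" />";
        foreach (var pair in ReadAttributes(input))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Unlike the regular expression above, this also copes with attribute values that contain spaces or punctuation, but it throws an XmlException if the input is not well-formed XML.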

Related

How can extract string matches using regular expression from a List?

I have a List that contains strings like these, along with other data.
HwndWrapper[App.exe;;cda6c3f4-8c87-4b12-8f3d-5322ca90eeex]
HwndWrapper[App.exe;;cadac3f4-8c87-4b12-8q3d-1qwe2ca90eec]
HwndWrapper[App.exe;;c1b6a3s4-8c87-4b12-8f3d-2qw2ca90eeev]
My list:
// Returns a list of WindowInformation objects with Handle, Caption, Class,
// Parent, Children, Siblings and process information
List<WindowInformation> windowListExtended = WindowList.GetAllWindowsExtendedInfo();
The regular expression to match is:
HwndWrapper\[App.exe;;.*?\]
Now, for every match in the list, I need to extract the matched string and run a process with each extracted string, using a foreach or something like that.
Some help please.
Update:
Thanks Altaris for the help, I just needed to convert the List to a string:
var message = string.Join(",", windowListExtended);
string pattern = @"HwndWrapper\[LogiOverlay.exe;;.*?]";
MatchCollection matches = Regex.Matches(message, pattern);
From what I understand you want to extract every match in a separate list to work with, there you go:
var someList = new List<string>{"HwndWrapper[App.exe;;cda6c3f4-8c87-4b12-8f3d-5322ca90eeex]",
"HwndWrapper[App.exe;;cadac3f4-8c87-4b12-8q3d-1qwe2ca90eec]",
"HwndWrapper[App.exe;;c1b6a3s4-8c87-4b12-8f3d-2qw2ca90eeev]"};
Regex FindHwndWrapper = new Regex(@"HwndWrapper\[App.exe;;(.*)\]");
var matches = someList.Where(s => FindHwndWrapper.IsMatch(s)).ToList();
foreach(var match in matches)
{
Console.WriteLine(match);// Use values
}
I used the System.Linq Where() method to filter the list.
Use this LINQ line if you want just the id parts, like "cda6c3f4-8c87-4b12-8f3d-5322ca90eeex":
var matches = someList.Select(s => FindHwndWrapper.Match(s).Groups[1].Value).ToList();
I am unsure of what you want exactly, I think you want to extract these
List<string> windowListExtended = new List<string>();
windowListExtended.Add("HwndWrapper[App.exe;;cda6c3f4-8c87-4b12-8f3d-5322ca90eeex]");
windowListExtended.Add("HwndWrapper[App.exe;;cadac3f4-8c87-4b12-8q3d-1qwe2ca90eec]");
windowListExtended.Add("HwndWrapper[App.exe;;c1b6a3s4-8c87-4b12-8f3d-2qw2ca90eeev]");
var myRegex = new Regex(@"HwndWrapper\[App.exe;;.*?]");
var resultList = windowListExtended.Where(x => myRegex.IsMatch(x)).Select(x => x.Split(new[] { ";;","]" }, StringSplitOptions.None)[1]).ToList();
//Now resultList contains => cda6c3f4-8c87-4b12-8f3d-5322ca90eeex, cadac3f4-8c87-4b12-8q3d-1qwe2ca90eec, c1b6a3s4-8c87-4b12-8f3d-2qw2ca90eeev
foreach (var item in resultList)
{
//Do whatever you want
}

Get string value from Placeholder C#

I have pattern string:"Hello {Name}, welcome to {Country}"
and a full value string:"Hello Scott, welcome to VietNam"
How can I extract value of {Name} and {Country}:
Name = Scott, Country = VietNam
I have seen some regular expressions that solve this problem, but can I apply fuzzy matching here? E.g. with the inverted string "welcome to VietNam, Hello Scott", must the regular expression change too?
You can use Regex:
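var matches = Regex.Matches(input, @"hello\s+?([^\s,]*)\s*|welcome\s+?to\s+?([^\s,]*)", RegexOptions.IgnoreCase);
// Regex.Matches returns one Match per alternative, so read each group from the match that captured it (requires System.Linq).
string Name = matches.Cast<Match>().Where(m => m.Groups[1].Success).Select(m => m.Groups[1].Value).FirstOrDefault();
string Country = matches.Cast<Match>().Where(m => m.Groups[2].Success).Select(m => m.Groups[2].Value).FirstOrDefault();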
Update: Changed code to work either way. Demo.
As a more general solution, you can do something like the following:
public Dictionary<string, string> GetMatches(string pattern, string source)
{
var tokens = new List<string>();
var matches = new Dictionary<string, string>();
pattern = Regex.Escape(pattern);
pattern = Regex.Replace(pattern, @"\\{.*?}", (match) =>
{
var name = match.Value.Substring(2, match.Value.Length - 3);
tokens.Add(name);
return $"(?<{name}>.*)";
});
var sourceMatches = Regex.Matches(source, pattern);
foreach (var name in tokens)
{
matches[name] = sourceMatches[0].Groups[name].Value;
}
return matches;
}
The method extracts the token names from the pattern, then replaces the tokens with the equivalent syntax for a regular expression named capture group. Next, it uses the modified pattern as a regular expression to extract the values from the source string. Finally, it uses the captured token names with the named capture groups to build a dictionary to be returned.
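To make the intermediate step concrete, here is a minimal sketch that prints the regular expression generated from the question's template and then applies it. The TemplateDemo and ToRegex names are illustrative and not part of the answer above. It relies on Regex.Escape turning "{" into "\{" while leaving "}" unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TemplateDemo
{
    // Turns "Hello {Name}" into a regex with one named capture group per {token}.
    public static string ToRegex(string template)
    {
        // After escaping, each token looks like \{Name} (only the opening brace is escaped).
        string escaped = Regex.Escape(template);
        return Regex.Replace(escaped, @"\\{(.*?)}", m => "(?<" + m.Groups[1].Value + ">.*)");
    }

    public static void Main()
    {
        string regex = ToRegex("Hello {Name}, welcome to {Country}");
        Console.WriteLine(regex); // the generated pattern, with named groups Name and Country

        Match m = Regex.Match("Hello Scott, welcome to VietNam", regex);
        Console.WriteLine(m.Groups["Name"].Value);    // Scott
        Console.WriteLine(m.Groups["Country"].Value); // VietNam
    }
}
```

Because the generated pattern keeps the literal text in its original order, the inverted string "welcome to VietNam, Hello Scott" would not match it; that case needs its own template.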
Just quick and dirty..
string pattern = "Hello Scott, welcome to VietNam";
var splitsArray = pattern.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
var Name = splitsArray[1].Replace(",", string.Empty);
var country = splitsArray[4];

c# regex replace but with different replacement value each time

I have a string like this:
<div>
<query>select * from table1</query>
</div>
<div>
<query>select * from table2</query>
</div>
This is a templating use case. Each query will be replaced by a different value (i.e. the SQL result). Is it possible to use the Regex.Replace method to do this?
The solution I'm thinking of is to use Regex.Match in the first pass, collect all the matches, and then use string.Replace in the second pass to replace the matches one by one. Is there a better way to solve this?
var source =
@"<div>
<query>select * from table1</query>
</div>
<div>
<query>select * from table2</query>
</div>";
var result = Regex.Replace(
source,
"(?<=<query>).*?(?=</query>)",
match => Sql.Execute(match.Value));
The Sql.Execute is a placeholder function for whatever logic you invoke to execute your query. Upon completion, its results will substitute the original <query>…</query> contents.
If you want the query tags to be eliminated, then use a named capture group rather than lookarounds:
var result = Regex.Replace(
source,
"<query>(?<q>.*?)</query>",
match => Sql.Execute(match.Groups["q"].Value));
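As a self-contained illustration of the evaluator overload of Regex.Replace, the sketch below swaps the Sql.Execute placeholder for a hypothetical in-memory lookup table (FakeResults). The QueryTemplate and Fill names are likewise assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QueryTemplate
{
    // Hypothetical stand-in for Sql.Execute: maps each query to a canned result.
    private static readonly Dictionary<string, string> FakeResults =
        new Dictionary<string, string>
        {
            { "select * from table1", "result 1" },
            { "select * from table2", "result 2" },
        };

    // Replaces each <query>...</query> element with the result for that query.
    public static string Fill(string source)
    {
        return Regex.Replace(
            source,
            "<query>(?<q>.*?)</query>",
            match =>
            {
                string result;
                return FakeResults.TryGetValue(match.Groups["q"].Value.Trim(), out result)
                    ? result
                    : match.Value; // leave unknown queries untouched
            });
    }

    public static void Main()
    {
        var source = "<div><query>select * from table1</query></div>"
                   + "<div><query>select * from table2</query></div>";
        Console.WriteLine(Fill(source)); // <div>result 1</div><div>result 2</div>
    }
}
```

The evaluator runs once per match, so each query gets its own replacement in a single pass over the string.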
You could use Html Agility Pack to first get the query tags and then replace their contents with whatever you want:
var html = new HtmlDocument();
html.Load(filepath);
var queries = html.DocumentNode.SelectNodes("//query");
foreach(var node in queries)
{
if(node.InnerText=="select * from table1")
{
node.InnerHtml="your result"; // InnerText has no setter in Html Agility Pack
}
}
You could also use a dictionary to save the pattern as key and the replacement as value:
var dict = new Dictionary<string, string>();
dict.Add("select * from table1","your result");
//...
var html = new HtmlDocument();
html.Load(filepath);
var queries = html.DocumentNode.SelectNodes("//query");
foreach(var node in queries)
{
if(dict.ContainsKey(node.InnerText))
{
node.InnerHtml=dict[node.InnerText]; // InnerText has no setter in Html Agility Pack
}
}
Regex is generally a poor fit for parsing HTML, but you don't need to parse HTML here; you only need what is inside the <query>xxx</query> pattern.
So the rest of the document doesn't matter: according to your question, you don't want to traverse, validate, or change it.
In this particular case, I would use a regex rather than an HTML parser:
var pattern = "<query>.+?</query>";
Then replace every match using the string Replace method.

Get email address from the url with and without escape character of @ in c#

I have a Url as shown below
http://www.mytestsite.com/TesPage.aspx?pageid=32&LangType=1033&emailAddress=myname%40gmail.com
I would like to have the email address from the url
Note
The email address may sometimes have the @ character escaped,
i.e. it may be myname%40gmail.com or myname@gmail.com
Is there any way to get the email address from a URL by matching a regex for an email pattern against it and retrieving the value?
Here is the code that I have tried:
string theRealURL = "http://www.mytestsite.com/TesPage.aspx?pageid=32&LangType=1033&emailAddress=myname%40gmail.com";
string emailPattern = @"^([\w\.\-]+)(((%40))|(@))([\w\-]+)((\.(\w){2,3})+)$";
Match match = Regex.Match(theRealURL, emailPattern);
if (match.Success)
string campaignEmail = match.Value;
Can anyone help me see what went wrong here?
Thanks
If possible, don't use a regular expression when there are domain-specific tools available.
Unless there is a reason not to reference System.Web, use
var uri = new Uri(
"http://www.mytestsite.com/TesPage.aspx?pageid=32&LangType=1033&emailAddress=myname%40gmail.com");
var email = HttpUtility.ParseQueryString(uri.Query).Get("emailAddress");
Edit: If (for some reason) you don't know the name of the parameter containing the address, use appropriate tools to get all the query values and then see what looks like an email address.
var query = HttpUtility.ParseQueryString(uri.Query);
var emails = query
.AllKeys
.Select(k => query[k])
.Where(v => Regex.IsMatch(v, emailPattern));
If you want to improve your email regex too, there are plenty of answers about that already.
Starting with Rawling's answer in response to the comment
The query string parameter can vary.. How can it possible without using the query string parameter name.
The following code will produce a list (emails) of the emails in the supplied input:
var input = "http://www.mytestsite.com/TesPage.aspx?pageid=32&LangType=1033&emailAddress=myname%40gmail.com";
var queryString = new Uri(input).Query;
var parsed = HttpUtility.ParseQueryString(queryString);
var attribute = new EmailAddressAttribute();
var emails = new List<string>();
foreach (var key in parsed.Cast<string>())
{
var value = parsed.Get(key);
if (attribute.IsValid(value))
emails.Add(value);
}
Console.WriteLine(String.Join(", ", emails)); // prints: myname@gmail.com
See also this answer for email parsing technique.

Extract parts of string

My string is like
COMMAND="HELP ME" TIMEOUT_SECONDS="30" APP_ID="SOMETHING RANDOM" COUNT="100" RETVAL="0" STDOUT="DATA I NEED" STDERR="NO ERROR" STATUS="SUCCESS"
I want to be able to extract STDOUT, STDERR and STATUS. How can I do it ?
You can try this regex:
(?<=(?:STDOUT|STDERR|STATUS)\=")([^"]+)
As a result you will get 3 results.
MatchCollection mcol = Regex.Matches(strInput, @"(?<=(?:STDOUT|STDERR|STATUS)\="")([^""]+)");
foreach(Match m in mcol)
{
System.Diagnostics.Debug.Print(m.ToString());
}
Also:
using System.Text.RegularExpressions;
Live Demo
Here, in this part of regex:
(?:STDOUT|STDERR|STATUS)
You can also list any other keys (besides the three mentioned) whose values you need.
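The lookbehind version returns only the values, so you cannot tell which key each one belongs to. A minimal sketch of a variant that captures the key as well, using named groups and a caller-supplied key list, might look like this. The KeyValueExtractor and Extract names are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeyValueExtractor
{
    // Returns the values of the requested keys from KEY="VALUE" pairs.
    public static Dictionary<string, string> Extract(string input, params string[] keys)
    {
        // Escape each key so it is matched literally inside the alternation.
        string alternation = string.Join("|", keys.Select(k => Regex.Escape(k)).ToArray());
        string pattern = @"\b(?<key>" + alternation + @")=""(?<value>[^""]*)""";

        var result = new Dictionary<string, string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            result[m.Groups["key"].Value] = m.Groups["value"].Value;
        }
        return result;
    }

    public static void Main()
    {
        var input = @"COMMAND=""HELP ME"" STDOUT=""DATA I NEED"" STDERR=""NO ERROR"" STATUS=""SUCCESS""";
        var values = Extract(input, "STDOUT", "STDERR", "STATUS");
        Console.WriteLine(values["STDOUT"]); // DATA I NEED
        Console.WriteLine(values["STATUS"]); // SUCCESS
    }
}
```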
string input2 = @"COMMAND=""HELP ME"" TIMEOUT_SECONDS=""30"" APP_ID=""SOMETHING RANDOM"" COUNT=""100"" RETVAL=""0"" STDOUT=""DATA I NEED"" STDERR=""NO ERROR"" STATUS=""SUCCESS""";
var dict = Regex.Matches(input2, @"(.+?)=""(.+?)""").Cast<Match>()
.ToDictionary(m => m.Groups[1].Value.Trim(),
m => m.Groups[2].Value.Trim());
Console.WriteLine(dict["STDOUT"]);
Console.WriteLine(dict["STATUS"]);
Use the following function:
public static Dictionary<string,string> GetValues(string command)
{
Dictionary<string,string> output = new Dictionary<string,string>();
// Values may contain spaces, so match KEY="VALUE" pairs instead of splitting on spaces.
foreach(Match item in Regex.Matches(command, @"(\w+)=""([^""]*)"""))
{
output.Add(item.Groups[1].Value, item.Groups[2].Value);
}
return output;
}
When you want to get the values, use the function like this:
Dictionary<string,string> output = YourClass.GetValues(command);
string stdout = output["STDOUT"];
string stderr = output["STDERR"];
string status = output["STATUS"];
I don't have access to a compiler, so there might be an error, but the overall functionality will look something like this.
