Parsing GUID from string line - c#

My GUIDs can be stored in a string in several different forms:
1. Accessibility|5102d73a-1b0b-4461-93cd-0c024738c19e
2. 5102d73a-1b0b-4461-93cd-0c024738c19e;#5102d73a-1b0b-4461-93cd-0c024733d52d
3. |;#5102d73a-1b0b-4461-93cd-0c024738c19e;#SharePointTag|5102d73a-1b0b-4461-93cd-0c024733d52d
4. Business pages|;#5102d73a-1b0b-4461-93cd-0c024738cz13;#SharePointTag|5102d73a-1b0b-4461-93cd-0c024733d52d
Could you help me with ideas on how to parse these tags and end up with a list of Guid values? Would a regular expression help in this situation?

Looks like you're playing with Managed metadata, term store ID and term set ID :)
Just use a regular regexp (the "p" variable below):
string c1 = "Accessibility|5102d73a-1b0b-4461-93cd-0c024738c19e";
string c2 = "5102d73a-1b0b-4461-93cd-0c024738c19e;#5102d73a-1b0b-4461-93cd-0c024733d52d";
string c3 = "|;#5102d73a-1b0b-4461-93cd-0c024738c19e;#SharePointTag|5102d73a-1b0b-4461-93cd-0c024733d52d";
string c4 = "Business pages|;#5102d73a-1b0b-4461-93cd-0c024738cz13;#SharePointTag|5102d73a-1b0b-4461-93cd-0c024733d52d";
string p = @"([a-zA-Z0-9]{8}[-][a-zA-Z0-9]{4}[-][a-zA-Z0-9]{4}[-][a-zA-Z0-9]{4}[-][a-zA-Z0-9]{12})";
MatchCollection mc;
Console.WriteLine("#1");
mc = Regex.Matches(c1, p);
foreach (var id in mc)
Console.WriteLine(id);
Console.WriteLine("#2");
mc = Regex.Matches(c2, p);
foreach (var id in mc)
Console.WriteLine(id);
Console.WriteLine("#3");
mc = Regex.Matches(c3, p);
foreach (var id in mc)
Console.WriteLine(id);
Console.WriteLine("#4");
mc = Regex.Matches(c4, p);
foreach (var id in mc)
Console.WriteLine(id);
Which outputs:
#1
5102d73a-1b0b-4461-93cd-0c024738c19e
#2
5102d73a-1b0b-4461-93cd-0c024738c19e
5102d73a-1b0b-4461-93cd-0c024733d52d
#3
5102d73a-1b0b-4461-93cd-0c024738c19e
5102d73a-1b0b-4461-93cd-0c024733d52d
#4
5102d73a-1b0b-4461-93cd-0c024738cz13
5102d73a-1b0b-4461-93cd-0c024733d52d
Press any key to continue...

var possibleGuids = myString.Split("|;#".ToCharArray(),
StringSplitOptions.RemoveEmptyEntries);
Guid g;
foreach(var poss in possibleGuids)
{
if(Guid.TryParse(poss, out g))
{
// g contains a guid!
}
}
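One thing to watch for: a pattern built on [a-zA-Z0-9] also accepts non-hex text such as the ...cz13 value in the fourth sample string, which is not a valid GUID. Below is a minimal sketch, assuming you only want values that really parse as GUIDs, that pairs a hex-only pattern with Guid.TryParse and returns a List<Guid>. The names GuidExtractor and ExtractGuids are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GuidExtractor
{
    // Hex-only pattern in the 8-4-4-4-12 layout (hypothetical helper, not from the thread).
    private static readonly Regex GuidPattern = new Regex(
        @"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");

    // Returns every substring of input that matches the pattern and parses as a Guid.
    public static List<Guid> ExtractGuids(string input)
    {
        var result = new List<Guid>();
        foreach (Match m in GuidPattern.Matches(input))
        {
            Guid g;
            // TryParse is a second safety net; the hex-only pattern already filters most junk.
            if (Guid.TryParse(m.Value, out g))
            {
                result.Add(g);
            }
        }
        return result;
    }
}
```

Calling GuidExtractor.ExtractGuids on the fourth sample string should then skip the ...cz13 candidate and keep only the second GUID.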

string sContent = "your data"; // any of your four forms of input
string sPattern = @"([a-z0-9]*[-]){4}[a-z0-9]*";
MatchCollection mc = Regex.Matches(sContent, sPattern );
foreach (var sGUID in mc)
{
// Do whatever with sGUID
}

You can split the string, for example:
"first|second".Split('|')
Once you have the GUID as a string, convert it to a Guid using:
Guid guid = new Guid(myString);
For the first line:
var guid = new Guid("Accessibility|5102d73a-1b0b-4461-93cd-0c024738c19e".Split('|')[1]);
For the second line:
var myArray = "5102d73a-1b0b-4461-93cd-0c024738c19e;#5102d73a-1b0b-4461-93cd-0c024733d52d".Split(';');
var guid1 = new Guid(myArray[0]);
var guid2 = new Guid(myArray[1].Replace("#", ""));
You can handle the remaining forms the same way.

Related

adding 2 strings from 2 foreach statements into a list

I am trying to use 2 regex patterns to extract specific data from this pdf
public static void ReadPDF()
{
using (PdfReader reader = new PdfReader(@"\\cytgit\Applications\C#\EZDock\CEVA.pdf"))
{
for (int i = 1; i <= reader.NumberOfPages; i++)
{
string text = PdfTextExtractor.GetTextFromPage(reader, i);
string pattern2 = @"^\W*([\w-]+.*\n{1})Route Name:";
Regex r2 = new Regex(pattern2, RegexOptions.Multiline);
foreach (Match m in r2.Matches(text))
{
Debug.Print((m.Value.Substring(0, 13)));
}
string pattern = @"(?<=.*Initial Arrival.*(\n)).*?(?=(\r?\n)|$)";
Regex r = new Regex(pattern, RegexOptions.Multiline);
foreach (Match m in r.Matches(text))
{
List<string> stringList = m.Value.Split(' ').ToList();
Routes.Add(new Routes { CarrierArrival = DateTime.Parse(stringList[0], System.Globalization.CultureInfo.InvariantCulture), CarrierDeparture = DateTime.Parse(stringList[1], System.Globalization.CultureInfo.InvariantCulture), PlantDestination = stringList[2], DockCode = stringList[3], InitialDest = stringList[4], InitialArrival = stringList[5], FinalLocation = stringList[6], Transit = stringList[7], PickupFreq = stringList[8], DeliveryFreq = stringList[9]});
}
}
}
}
The first foreach prints the right data, and the second foreach works as well. My goal is to put what the first foreach prints into the same list that the second foreach builds (stringList), so that I can add m.Value.Substring(0, 13) to stringList and use it when creating the new Route.
Create the list before the first foreach?:
List<string> stringList = new List<string>();
foreach (Match m in r2.Matches(text))
{
stringList.Add(m.Value.Substring(0, 13));
}
string pattern = @"(?<=.*Initial Arrival.*(\n)).*?(?=(\r?\n)|$)";
Regex r = new Regex(pattern, RegexOptions.Multiline);
foreach (Match m in r.Matches(text))
{
stringList.AddRange(m.Value.Split(' '));
Routes.Add(...);
}
I guess one page of the PDF has multiple routes (name and details). The problem I see is matching the names with the corresponding information.
For each page, I would try to split the text into a list of route sections. Then, in an additional foreach loop, extract the single route name with pattern2 and the details with pattern:
for (int i = 1; i <= reader.NumberOfPages; i++)
{
string text = PdfTextExtractor.GetTextFromPage(reader, i);
string[] routeSections = SplitPageInRouteSections(text);
foreach(var routeSection in routeSections)
{
string routeName = Regex.Match(routeSection, pattern2).ToString();
string[] details = Regex.Match(routeSection, pattern).ToString().Split(' ');
Routes.Add(new Routes{ RouteName = routeName, CarrierArrival = details[0], ...});
}
}

How to split and take multiple strings from a url in c#?

I have a string looking something like this:
/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag
I want a list of strings containing "Orgrimmar", "Stormwind" and "Undercity". How can I split the string after Query, taking the text between & and +, so that we avoid getting a string like "Orgrimmar+l%C3%A4n=01&Stormwind"?
Let us assume that we don't know the names in advance. :)
Update: I still can't get it to work. I have added a list of counties that I can use for validation, but I still find this case hard. countyList is used to check that the counties/cities in the URL match a pre-existing collection.
var countyQuery = Request.Url.Query;
var counties = this._locationService.GetAllCounties();
List<string> countyList = new List<string>();
List<string> selectedCountiesList = new List<string>();
foreach (var i in counties)
{
countyList.Add(i.Name);
}
Regex r = new Regex(@"&(.+?)\+");
MatchCollection mc = r.Matches(countyQuery);
foreach (Match curMatch in mc)
{
if (countyList.Contains(curMatch.Groups[1].Value))
{
selectedCountiesList.Add(curMatch.Groups[1].Value);
}
}
return selectedCountiesList;
I changed the URL to /?Gender=&Age=&Query=&county=13&county=08&county=01&Page=1
where 13, 08, 01 and so on are the IDs of the counties.
The final solution was:
var selectedCountyQuery = Request.QueryString
//CountySearch = "county"
[QueryStringParameters.CountySearch];
List countyList = new List();
List<string> selectedCounties = new List<string>();
if (!string.IsNullOrEmpty(selectedCountyQuery))
{
var selectedCountiesArray = selectedCountyQuery.Split(new[]{ ',' });
foreach (var selectedCounty in selectedCountiesArray)
{
selectedCounties.Add(selectedCounty);
}
}
return selectedCounties;
You can get each parameter name and value with the Split() method.
Example:
var URL = "controller/method?var1=&var2=&var3=dsgdf";
var ParameterPart = URL.Split('?')[1];
var ParametersArray = ParameterPart.Split('&');
//output : ["var1=","var2=","var3=dsgdf"];
foreach(var Parameter in ParametersArray)
{
var ParameterName= Parameter.Split('=')[0];
var ParameterValue= Parameter.Split('=')[1];
}
You can use a regex and extract the matches:
Regex r = new Regex(@"&(.+?)\+");
MatchCollection mc = r.Matches(s);
Then you can iterate over your desired strings (in this case WoW cities) like this:
foreach(Match curMatch in mc)
{
Console.WriteLine(curMatch.Groups[1].Value);
}
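A caveat with the lazy pattern &(.+?)\+ on the sample URL: the first match starts at the first & (right after Gender=) and runs to the first +, so group 1 comes out as "Age=&Query=&Orgrimmar" rather than "Orgrimmar". That would explain why a check like countyList.Contains misses the first county. A possible fix, sketched here under the assumption that a name never contains &, = or +, is to exclude those characters from the capture. QueryNames and ExtractNames are illustrative names only.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QueryNames
{
    // Captures the text between '&' and '+', but refuses to cross another '&', '=' or '+'.
    // Hypothetical helper for illustration; not part of the original answers.
    public static List<string> ExtractNames(string query)
    {
        var names = new List<string>();
        foreach (Match m in Regex.Matches(query, @"&([^&=+]+)\+"))
        {
            names.Add(m.Groups[1].Value);
        }
        return names;
    }
}
```

With the sample string from the question this should give "Orgrimmar", "Stormwind" and "Undercity", because candidates like "Age" and "Query" are followed by = instead of + and are therefore rejected.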
string[] numbers = { "/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag" };
string sPattern = @"(?<=&Orgrimmar)+";
foreach (string s in numbers)
{
if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern))
{
System.Console.WriteLine(" - valid");
}
else
{
System.Console.WriteLine(" - invalid");
}
}
Output: valid
string[] numbers = { "/Gender=&Age=&Query=Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag" };
Output: invalid
To check two parameters:
string[] numbers = { "/Gender=&Age=&Query=&Orgrimmar+l%C3%A4n=01&Stormwind+l%C3%A4n=07&Undercity+l%C3%A4n=09&Pag" };
string sPattern = @"(?<=&Orgrimmar)+";
string sPattern2 = @"(?<=&Stormwind)+";
foreach (string s in numbers)
{
if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern) && System.Text.RegularExpressions.Regex.IsMatch(s, sPattern2))
...

Searching the first few characters of every word within a string in C#

I am new to programming languages. I have a requirement where I have to return a record based on a search string.
For example, take the following three records and a search string of "Cal":
University of California
Pascal Institute
California University
I've tried String.Contains, but all three are returned. If I use String.StartsWith, I get only record #3. My requirement is to return #1 and #3 in the result.
Thank you for your help.
If you're using .NET 3.5 or higher, I'd recommend using the LINQ extension methods. Check out String.Split and Enumerable.Any. Something like:
string myString = "University of California";
bool included = myString.Split(' ').Any(w => w.StartsWith("Cal"));
Split divides myString at the space characters and returns an array of strings. Any works on the array, returning true if any of the strings starts with "Cal".
If you don't want to or can't use Any, then you'll have to manually loop through the words.
string myString = "University of California";
bool included = false;
foreach (string word in myString.Split(' '))
{
if (word.StartsWith("Cal"))
{
included = true;
break;
}
}
I like this for simplicity:
if(str.StartsWith("Cal") || str.Contains(" Cal")){
//do something
}
You can try:
foreach(var str in stringInQuestion.Split(' '))
{
if(str.StartsWith("Cal"))
{
//do something
}
}
You can use Regular expressions to find the matches. Here is an example
//array of strings to check
String[] strs = {"University of California", "Pascal Institute", "California University"};
//create the regular expression to look for
Regex regex = new Regex(@"Cal\w*");
//create a list to hold the matches
List<String> myMatches = new List<String>();
//loop through the strings
foreach (String s in strs)
{ //check for a match
if (regex.Match(s).Success)
{ //add to the list
myMatches.Add(s);
}
}
//loop through the list and present the matches one at a time in a message box
foreach (String matchItem in myMatches)
{
MessageBox.Show(matchItem + " was a match");
}
string univOfCal = "University of California";
string pascalInst = "Pascal Institute";
string calUniv = "California University";
string[] arrayofStrings = new string[]
{
univOfCal, pascalInst, calUniv
};
string wordToMatch = "Cal";
foreach (string i in arrayofStrings)
{
if (i.Contains(wordToMatch)){
Console.Write(i + "\n");
}
}
Console.ReadLine();
}
var strings = new List<string> { "University of California", "Pascal Institute", "California University" };
var matches = strings.Where(s => s.Split(' ').Any(x => x.StartsWith("Cal")));
foreach (var match in matches)
{
Console.WriteLine(match);
}
Output:
University of California
California University
This is actually a good use case for regular expressions.
string[] words =
{
"University of California",
"Pascal Institute",
"California University"
};
var expr = @"\bcal";
var opts = RegexOptions.IgnoreCase;
var matches = words.Where(x =>
Regex.IsMatch(x, expr, opts)).ToArray();
The "\b" matches any word boundary (punctuation, space, etc...).

Looping through Regex Matches

This is my source string:
<box><3>
<table><1>
<chair><8>
This is my Regex pattern:
<(?<item>\w+?)><(?<count>\d+?)>
This is my Item class
class Item
{
string Name;
int count;
//(...)
}
This is my Item collection:
List<Item> OrderList = new List<Item>();
I want to populate that list with Items based on the source string.
This is my function. It's not working.
Regex ItemRegex = new Regex(@"<(?<item>\w+?)><(?<count>\d+?)>", RegexOptions.Compiled);
foreach (Match ItemMatch in ItemRegex.Matches(sourceString))
{
Item temp = new Item(ItemMatch.Groups["item"].ToString(), int.Parse(ItemMatch.Groups["count"].ToString()));
OrderList.Add(temp);
}
There might be some small mistakes, like a missing letter, in this example because it is a simplified version of what I have in my app.
The problem is that in the end I have only one Item in OrderList.
UPDATE
I got it working.
Thanks for the help.
class Program
{
static void Main(string[] args)
{
string sourceString = @"<box><3>
<table><1>
<chair><8>";
Regex ItemRegex = new Regex(@"<(?<item>\w+?)><(?<count>\d+?)>", RegexOptions.Compiled);
foreach (Match ItemMatch in ItemRegex.Matches(sourceString))
{
Console.WriteLine(ItemMatch);
}
Console.ReadLine();
}
}
Returns 3 matches for me. Your problem must be elsewhere.
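For completeness, here is a rough sketch of filling a List<Item> from the matches. The question elides the body of Item, so the constructor and properties below are assumptions, and OrderParser and Parse are made-up names.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Assumed shape of the Item class; the question only shows its two fields.
public class Item
{
    public string Name { get; private set; }
    public int Count { get; private set; }

    public Item(string name, int count)
    {
        Name = name;
        Count = count;
    }
}

public static class OrderParser
{
    // Same pattern as in the question.
    private static readonly Regex ItemRegex =
        new Regex(@"<(?<item>\w+?)><(?<count>\d+?)>");

    // Builds one Item per match; int.Parse is safe here because the group only captures digits.
    public static List<Item> Parse(string source)
    {
        var orderList = new List<Item>();
        foreach (Match m in ItemRegex.Matches(source))
        {
            orderList.Add(new Item(m.Groups["item"].Value, int.Parse(m.Groups["count"].Value)));
        }
        return orderList;
    }
}
```

If a loop like this still produces a single Item, the usual suspects are outside the regex: for example the list being recreated on each call, or the source string holding only one line when it reaches the function.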
For future reference I want to document the above code converted to using a declarative approach as a LinqPad code snippet:
var sourceString = @"<box><3>
<table><1>
<chair><8>";
var count = 0;
var ItemRegex = new Regex(@"<(?<item>[^>]+)><(?<count>[^>]*)>", RegexOptions.Compiled);
var OrderList = ItemRegex.Matches(sourceString)
.Cast<Match>()
.Select(m => new
{
Name = m.Groups["item"].ToString(),
Count = int.TryParse(m.Groups["count"].ToString(), out count) ? count : 0,
})
.ToList();
OrderList.Dump();

How do I get the name of captured groups in a C# Regex?

Is there a way to get the name of a captured group in C#?
string line = "No.123456789 04/09/2009 999";
Regex regex = new Regex(@"(?<number>[\d]{9}) (?<date>[\d]{2}/[\d]{2}/[\d]{4}) (?<code>.*)");
GroupCollection groups = regex.Match(line).Groups;
foreach (Group group in groups)
{
Console.WriteLine("Group: {0}, Value: {1}", ???, group.Value);
}
I want to get this result:
Group: [I don't know what should go here], Value: 123456789 04/09/2009 999
Group: number, Value: 123456789
Group: date, Value: 04/09/2009
Group: code, Value: 999
Use GetGroupNames to get the list of groups in an expression and then iterate over those, using the names as keys into the groups collection.
For example,
GroupCollection groups = regex.Match(line).Groups;
foreach (string groupName in regex.GetGroupNames())
{
Console.WriteLine(
"Group: {0}, Value: {1}",
groupName,
groups[groupName].Value);
}
The cleanest way to do this is by using this extension method:
public static class MyExtensionMethods
{
public static Dictionary<string, string> MatchNamedCaptures(this Regex regex, string input)
{
var namedCaptureDictionary = new Dictionary<string, string>();
GroupCollection groups = regex.Match(input).Groups;
string [] groupNames = regex.GetGroupNames();
foreach (string groupName in groupNames)
if (groups[groupName].Captures.Count > 0)
namedCaptureDictionary.Add(groupName,groups[groupName].Value);
return namedCaptureDictionary;
}
}
Once this extension method is in place, you can get names and values like this:
var regex = new Regex(@"(?<year>[\d]+)\|(?<month>[\d]+)\|(?<day>[\d]+)");
var namedCaptures = regex.MatchNamedCaptures(wikiDate);
string s = "";
foreach (var item in namedCaptures)
{
s += item.Key + ": " + item.Value + "\r\n";
}
s += namedCaptures["year"];
s += namedCaptures["month"];
s += namedCaptures["day"];
Since .NET Framework 4.7, the Group.Name property is available.
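A small sketch of what that looks like, assuming a runtime where Group.Name exists. GroupNameDemo and Describe are illustrative names only.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupNameDemo
{
    // Formats each group of the first match as "Group: <name>, Value: <value>".
    // Group.Name returns the group number as a string for unnamed groups, so group 0 is "0".
    public static List<string> Describe(Regex regex, string input)
    {
        var lines = new List<string>();
        Match match = regex.Match(input);
        foreach (Group group in match.Groups)
        {
            lines.Add(string.Format("Group: {0}, Value: {1}", group.Name, group.Value));
        }
        return lines;
    }
}
```

For the line and pattern from the question, the first entry should be the whole match under the name "0", followed by number, date and code.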
You should use GetGroupNames(); and the code will look something like this:
string line = "No.123456789 04/09/2009 999";
Regex regex =
new Regex(@"(?<number>[\d]{9}) (?<date>[\d]{2}/[\d]{2}/[\d]{4}) (?<code>.*)");
GroupCollection groups = regex.Match(line).Groups;
var grpNames = regex.GetGroupNames();
foreach (var grpName in grpNames)
{
Console.WriteLine("Group: {0}, Value: {1}", grpName, groups[grpName].Value);
}
To update the existing extension method answer by @whitneyland with one that can handle multiple matches:
public static List<Dictionary<string, string>> MatchNamedCaptures(this Regex regex, string input)
{
var namedCaptureList = new List<Dictionary<string, string>>();
var match = regex.Match(input);
do
{
Dictionary<string, string> namedCaptureDictionary = new Dictionary<string, string>();
GroupCollection groups = match.Groups;
string[] groupNames = regex.GetGroupNames();
foreach (string groupName in groupNames)
{
if (groups[groupName].Captures.Count > 0)
namedCaptureDictionary.Add(groupName, groups[groupName].Value);
}
namedCaptureList.Add(namedCaptureDictionary);
match = match.NextMatch();
}
while (match!=null && match.Success);
return namedCaptureList;
}
Usage:
Regex pickoutInfo = new Regex(@"(?<key>[^=;,]+)=(?<val>[^;,]+(,\d+)?)", RegexOptions.ExplicitCapture);
var matches = pickoutInfo.MatchNamedCaptures(_context.Database.GetConnectionString());
string server = matches.Single( a => a["key"]=="Server")["val"];
The Regex class is the key to this!
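// Group.Index is the match position in the input, not the group number, so iterate the group numbers instead.
foreach (int groupNumber in regex.GetGroupNumbers())
{
Console.WriteLine("Group: {0}, Value: {1}", regex.GroupNameFromNumber(groupNumber), match.Groups[groupNumber].Value);
}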
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.groupnamefromnumber.aspx
