Regular expression for treeview - c#

I'm trying to parse a string like this:
"#1#Process#{some process|info}{some name|some info {child info|child info}}{some name|some info}"
into several messages and create a string like this:
#1#Process#
-some process|info
-some name|some info
-child info|child info
-some name|some info
I'm trying to use a regular expression and the following code:
using System;
using System.Collections;
using System.Text.RegularExpressions;
namespace prRegEXP
{
class Program
{
static String st="";
public static void Main(string[] args)
{
Console.WriteLine("Hello World!");
// TODO: Implement Functionality Here
var pattern = @"\{(.*?)\}";
var query = "#1#Process#{some process|info}{some name|some info {child info|child info}}{some name|some info}";
FindTree (pattern, query);
Console.WriteLine(st);
Console.WriteLine();
Console.WriteLine("Press any key to continue . . . ");
Console.ReadKey(true);
}
private static void FindTree (String pattern, String query) {
var matches = Regex.Matches(query, pattern);
foreach (Match m in matches) {
st += m.Groups[1] + "\n";
if (Regex.IsMatch(m.Groups[1].ToString(), @"\{(.*?)" )) {
FindTree (@"\{(.*?)", m.Groups[1].ToString());
}
}
}
}
}
It's based on an example solution I found. I want to build a message tree that also handles nested messages (like child info|child info), and there can be many of them.
I cannot figure out how to match the child expressions and pass them to the recursive call. Does anyone have an idea or a fix?

Writing a simple regex to support N-depth recursion would be impossible(?) or at least very difficult.
A much easier solution would be to go through the string char by char and insert indentation and newlines whenever a new message is found.
Something along the lines of this should work:
private static String FindTree(String query)
{
System.Text.StringBuilder sb = new System.Text.StringBuilder();
String indent = "";
foreach (var ch in query) {
if (ch == '{') {
// A new message starts: begin a new line at the current depth.
sb.Append("\n");
sb.Append(indent);
sb.Append("- ");
indent += "\t";
} else if (ch == '}') {
// A message ends: go back up one level (guard against unbalanced braces).
if (indent.Length > 0) indent = indent.Substring(1);
} else {
sb.Append(ch);
}
}
return sb.ToString();
}
The above code is not tested, nor am I well versed in C# so it might be full of errors, but it should illustrate the basic idea.
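If you need the tree as data rather than as a formatted string, the same char-by-char idea can track the nesting depth explicitly. The following is only a sketch under that assumption; the class name MessageTree and the method Parse are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MessageTree
{
    // Returns one (Depth, Text) pair per message.
    // Depth 0 is the header that precedes the first '{'.
    public static List<(int Depth, string Text)> Parse(string query)
    {
        var nodes = new List<(int Depth, string Text)>();
        var current = new StringBuilder();
        int depth = 0;

        // Stores the text collected so far (if any) at the given depth.
        void Flush(int d)
        {
            string text = current.ToString().Trim();
            if (text.Length > 0) nodes.Add((d, text));
            current.Clear();
        }

        foreach (char ch in query)
        {
            if (ch == '{')
            {
                Flush(depth);   // text before a nested message belongs to the parent
                depth++;
            }
            else if (ch == '}')
            {
                Flush(depth);
                if (depth > 0) depth--;   // tolerate unbalanced closing braces
            }
            else
            {
                current.Append(ch);
            }
        }
        Flush(depth);   // trailing text after the last brace, if any
        return nodes;
    }
}
```

For the string in the question this should yield five nodes, with "child info|child info" at depth 2 and the other three messages at depth 1. Prefixing each node with Depth tabs and "- " gives roughly the output asked for.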

Related

What is wrong here in my C# Program?

I want to search for a particular word in a given string, and I am using the foreach keyword for that, but it's not working.
I am just a beginner at this. Please help me find what is wrong here. I don't want to use arrays.
static void Main(string[] args)
{
string str = "Hello You are welcome";
foreach (string item in str) // can we use string here?
{
if (str.Contains(are); // I am checking if the word "are" is present in the above string
Console.WriteLine("True");
)
}
string str = "Hello You are welcome";
if (str.Contains("are"))
{
Console.WriteLine("True");
}
or you mean:
string str = "Hello You are welcome";
foreach (var word in str.Split()) // split the string (by space)
{
if (word == "are")
{
Console.WriteLine("True");
}
}
Try this
static void Main(string[] args)
{
string str = "Hello You are welcome";
foreach (var item in str.Split(' ')) // split the string (by space)
{
if (item == "are")
{
Console.WriteLine("True");
}
}
}

How to use .Contains in LINQ using C#

I am trying to find a substring of states in a file. I open the file and load each line a string at a time. I would then like to check if each string contains one of the states in my substring. It is not working as intended as it keeps returning "Could not find substring" even though I know that the states are in the string. What am I doing wrong?
EDIT: I realise now what the error was; this line was completely wrong:
if (lines.Any(stringToCheck.Contains))
It should be like this:
if (stringToCheck.Any(s.Contains))
Thanks for the help guys.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace ConsoleApplication2
{
class Program
{
static void Main (string[] args)
{
string[] stringToCheck = {"Alabama","Alaska","Arizona","Arkansas","California","Colorado"};
string[] lines = File.ReadAllLines(@"C:\C# Project\sampledata.dat");
foreach (string s in lines)
{
if (lines.Any(stringToCheck.Contains))
{
Console.WriteLine("Found substring");
Console.WriteLine(s);
Console.ReadLine();
}
else
Console.WriteLine("Could not find substring");
Console.WriteLine(s);
Console.ReadLine() ;
}
}
}
}
You could use Any over the list of states to check whether the current line contains at least one of them. For example:
if (stringToCheck.Any(x => s.Contains(x)))
{
// ...
}
You could do something like this:
string[] stringToCheck = { "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado" };
//Some test lines
string[] lines = { "sadnaskjd Alabama", "sadasd Arizona", "asdasdaer" };
//A bool telling me if I found anything,
//you could skip the bool and just use else in the foreach :)
bool contains = false;
foreach ( var check in stringToCheck )
{
if ( lines.Any( l => l.Contains( check ) ) )
{
Console.WriteLine( "Found substring" );
Console.WriteLine( check );
contains = true;
}
}
if ( contains == false )
{
Console.WriteLine("Could not find substring");
Console.ReadLine();
}
Try this idea:
if (CheckKeyWordsInLine(line, stringToCheck)){
//The line contains one of the keywords
}else{
//The line does not contain any of the keywords
}
Then define the CheckKeyWordsInLine function as:
bool CheckKeyWordsInLine(string line, string[] keywords){
foreach(var key in keywords)
{
if(line.Contains(key))
{
return true;
}
}
return false;
}
You could do something like this:
if(stringToCheck.Any(e => s.Contains(e))){
//
}
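Pulling the answers above together, here is a minimal sketch that filters the lines instead of printing inside the loop. The names StateFinder and LinesContainingAny are hypothetical, chosen only for this example. The match is case-sensitive, as string.Contains is by default.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StateFinder
{
    // Returns the lines that contain at least one of the given keywords.
    public static List<string> LinesContainingAny(
        IEnumerable<string> lines, IEnumerable<string> keywords)
    {
        // Materialize once so the keyword sequence is not re-enumerated per line.
        var keywordList = keywords.ToList();
        return lines
            .Where(line => keywordList.Any(k => line.Contains(k)))
            .ToList();
    }
}
```

The important part is the direction of the check: for each line, ask whether any keyword is contained in that line, not whether any line equals a keyword.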

C# program that dumps the entire HKLM registry tree to the console?

I'm trying to write a simple console app that dumps the contents of HKLM to the console. The output should look something like:
HKEY_LOCAL_MACHINE
HKEY_LOCAL_MACHINE\BCD00000000
HKEY_LOCAL_MACHINE\BCD00000000\Description
KeyName: BCD00000000
System: 1
TreatAsSystem: 1
GuidCache: System.Byte[]
HKEY_LOCAL_MACHINE\BCD00000000\Objects
HKEY_LOCAL_MACHINE\BCD00000000\Objects\{0ce4991b-e6b3-4b16-b23c-5e0d9250e5d9}
HKEY_LOCAL_MACHINE\BCD00000000\Objects\{0ce4991b-e6b3-4b16-b23c-5e0d9250e5d9}\Description
Type: 537919488
HKEY_LOCAL_MACHINE\BCD00000000\Objects\{0ce4991b-e6b3-4b16-b23c-5e0d9250e5d9}\Elements
HKEY_LOCAL_MACHINE\BCD00000000\Objects\{0ce4991b-e6b3-4b16-b23c-5e0d9250e5d9}\Elements\16000020
Element: System.Byte[]
I haven't had much luck researching how to do this. Any help would be greatly appreciated.
You know there's already an app that dumps registry contents, right?
REG EXPORT HKLM hklm.reg
Fun part is, it exports the keys in a text format, but that text file can be imported using either REG or the registry editor.
cHao's way is the safest approach to your question. In the meantime, I was bored on this Sunday night and wrote something. Just change the Console.WriteLine call, or add a few more, to suit your needs.
class Program
{
static void Main(string[] args)
{
Registry.CurrentUser.GetSubKeyNames()
.Select(x => Registry.CurrentUser.OpenSubKey(x))
.Traverse(key =>
{
if (key != null)
{
// You will most likely hit some security exception
return key.GetSubKeyNames().Select(subKey => key.OpenSubKey(subKey));
}
return null;
})
.ForEach(key =>
{
key.GetValueNames()
.ForEach(valueName => Console.WriteLine("{0}\\{1}:{2} ({3})", key, valueName, key.GetValue(valueName), key.GetValueKind(valueName)));
});
Console.ReadLine();
}
}
public static class Extensions
{
public static IEnumerable<T> Traverse<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> fnRecurse)
{
foreach (T item in source)
{
yield return item;
IEnumerable<T> seqRecurse = fnRecurse(item);
if (seqRecurse != null)
{
foreach (T itemRecurse in Traverse(seqRecurse, fnRecurse))
{
yield return itemRecurse;
}
}
}
}
public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
{
foreach (var item in source)
{
action(item);
}
}
}
Thanks for the answer, Pierre-Alain Vigeant; I like your solution. For the most part it worked, with a couple of minor alterations for the text formatting, but I still couldn't deal with the security exception that was being thrown. It turns out LINQ is not so great for this because it does a lot of work behind the scenes. The following solution is a basic idea of how to do it:
class Program
{
static void Main(string[] args)
{
RegistryKey key = Registry.LocalMachine;
Traverse(key, 0);
key.Close();
Console.Read();
}
private static void Traverse(RegistryKey key, int indent)
{
Console.WriteLine(key.Name);
// Print this key's values once, before descending into its subkeys.
string[] valnames = key.GetValueNames();
foreach (string valname in valnames)
{
Console.WriteLine(returnIndentions(indent)+valname + ":" + key.GetValue(valname));
}
string[] names = key.GetSubKeyNames();
foreach (var subkeyname in names)
{
try
{
// indent + 1 passes the deeper level; indent++ would pass the old value.
Traverse(key.OpenSubKey(subkeyname), indent + 1);
}
catch {
// Skip keys that cannot be opened (for example, a SecurityException).
}
}
}
private static string returnIndentions(int indent)
{
string indentions = "";
for (int i = 0; i < indent; i++) {
indentions += " ";
}
return indentions;
}
}
using System;
using System.Text;
using Microsoft.Win32;
class Program
{
static void Main(string[] args)
{
using RegistryKey key = Registry.LocalMachine;
string keyName = args[0]; // eg @"SOFTWARE\Microsoft\Speech\Voices"
var sb = new StringBuilder();
var subKey = key.OpenSubKey(keyName);
Traverse(subKey);
void Traverse(RegistryKey key, int indent = 0)
{
sb.AppendLine(new string(' ', Math.Max(0, indent - 2)) + key.Name);
indent++;
string[] valnames = key.GetValueNames();
foreach (string valname in valnames)
{
sb.AppendLine(new string(' ', indent) + valname + " : " + key.GetValue(valname));
}
string[] names = key.GetSubKeyNames();
foreach (var subkeyname in names)
{
Traverse(key.OpenSubKey(subkeyname), indent + 2);
}
}
Console.WriteLine(sb.ToString());
}
}

String Matching

I have a string
String mainString="///BUY/SELL///ORDERTIME///RT///QTY///BROKERAGE///NETRATE///AMOUNTRS///RATE///SCNM///";
Now I have other strings:
String str1= "RT";
which should be matched only with RT which is substring of string mainString but not with ORDERTIME which is also substring of string mainString.
String str2= "RATE" ;
And RATE(str2) should be matched with RATE which is substring of string mainString but not with NETRATE which is also substring of string mainString.
How can we do that?
Match against "///RT///" and "///RATE///".
This might give you some clues. It is nowhere near real code quality, only a five-minute job, but it does what you need. Be warned: it smells.
using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
namespace test {
class Program {
static void Main(string[] args) {
String mainString="//BUY/SELL//ORDERTIME//RT//QTY//BROKERAGE//NETRATE//AMOUNTRS//RATE//SCNM//";
Hashtable ht = createHashTable(mainString);
if (hasValue("RA", ht)) {
Console.WriteLine("Matched RA");
} else {
Console.WriteLine("Didnt Find RA");
}
if (hasValue("RATE", ht)) {
Console.WriteLine("Matched RATE");
}
Console.Read();
}
public static Hashtable createHashTable(string strToSplit) {
Hashtable ht = new Hashtable();
int iCount = 0;
string[] words = strToSplit.Split(new Char[] { '/', '/', '/' });
foreach (string word in words) {
ht.Add(iCount++, word);
}
return ht;
}
public static bool hasValue(string strValuetoSearch, Hashtable ht) {
return ht.ContainsValue(strValuetoSearch);
}
}
}
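As far as I understand your question, you want to match a string that sits between /// delimiters.
If you are looking for str, you just have to do the following (Regex.Escape keeps any special characters in str from being read as regex syntax):
Regex.Match(mainString, "(^|///)" + Regex.Escape(str) + "(///|$)");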
I don't know whether this will work every time, but I have tried it and it works for this string. Is this OK? Please give me suggestions.
str1 = str1.Insert(0, "///");
str1=str1.Insert(str1.Length,"///");
bool Result = mainString.Contains(str1);
What about Linq to Object?
String mainString="///BUY/SELL///ORDERTIME///RT///QTY///BROKERAGE///NETRATE///AMOUNTRS///RATE///SCNM///";
String searchTerm = "RT";
String[] src = mainString.Split(new[] { "///" }, StringSplitOptions.RemoveEmptyEntries);
var match = from word in src where
word.ToLowerInvariant() == searchTerm.ToLowerInvariant()
select word;
I don't have VS near me so I can't test it, but it should be something similar to this.
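For completeness, here is a small sketch of the split-and-compare idea as a reusable method. TokenMatcher and ContainsToken are hypothetical names used only for illustration. It assumes "///" is always the delimiter and that the comparison should be case-sensitive.

```csharp
using System;
using System.Linq;

public static class TokenMatcher
{
    // True only if 'token' equals one whole segment between "///" delimiters.
    public static bool ContainsToken(string mainString, string token)
    {
        string[] tokens = mainString.Split(
            new[] { "///" }, StringSplitOptions.RemoveEmptyEntries);
        return tokens.Contains(token, StringComparer.Ordinal);
    }
}
```

With the mainString from the question, "RT" and "RATE" match, while "TIME" does not, because ORDERTIME and NETRATE are compared as whole segments. Note that "BUY/SELL" stays a single segment, since it contains only one slash.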

C# Sanitize File Name

I recently have been moving a bunch of MP3s from various locations into a repository. I had been constructing the new file names using the ID3 tags (thanks, TagLib-Sharp!), and I noticed that I was getting a System.NotSupportedException:
"The given path's format is not supported."
This was generated by either File.Copy() or Directory.CreateDirectory().
It didn't take long to realize that my file names needed to be sanitized. So I did the obvious thing:
public static string SanitizePath_(string path, char replaceChar)
{
string dir = Path.GetDirectoryName(path);
foreach (char c in Path.GetInvalidPathChars())
dir = dir.Replace(c, replaceChar);
string name = Path.GetFileName(path);
foreach (char c in Path.GetInvalidFileNameChars())
name = name.Replace(c, replaceChar);
return dir + name;
}
To my surprise, I continued to get exceptions. It turned out that ':' is not in the set of Path.GetInvalidPathChars(), because it is valid in a path root. I suppose that makes sense, but this has to be a pretty common problem. Does anyone have some short code that sanitizes a path? The most thorough version I've come up with is this, but it feels like it is probably overkill.
// replaces invalid characters with replaceChar
public static string SanitizePath(string path, char replaceChar)
{
// construct a list of characters that can't show up in filenames.
// need to do this because ":" is not in InvalidPathChars
if (_BadChars == null)
{
_BadChars = new List<char>(Path.GetInvalidFileNameChars());
_BadChars.AddRange(Path.GetInvalidPathChars());
_BadChars = Utility.GetUnique<char>(_BadChars);
}
// remove root
string root = Path.GetPathRoot(path);
path = path.Remove(0, root.Length);
// split on the directory separator character. Need to do this
// because the separator is not valid in a filename.
List<string> parts = new List<string>(path.Split(new char[]{Path.DirectorySeparatorChar}));
// check each part to make sure it is valid.
for (int i = 0; i < parts.Count; i++)
{
string part = parts[i];
foreach (char c in _BadChars)
{
part = part.Replace(c, replaceChar);
}
parts[i] = part;
}
return root + Utility.Join(parts, Path.DirectorySeparatorChar.ToString());
}
Any improvements to make this function faster and less baroque would be much appreciated.
To clean up a file name you could do this
private static string MakeValidFileName( string name )
{
string invalidChars = System.Text.RegularExpressions.Regex.Escape( new string( System.IO.Path.GetInvalidFileNameChars() ) );
string invalidRegStr = string.Format( @"([{0}]*\.+$)|([{0}]+)", invalidChars );
return System.Text.RegularExpressions.Regex.Replace( name, invalidRegStr, "_" );
}
A shorter solution:
var invalids = System.IO.Path.GetInvalidFileNameChars();
var newName = String.Join("_", origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries) ).TrimEnd('.');
Based on Andre's excellent answer but taking into account Spud's comment on reserved words, I made this version:
/// <summary>
/// Strip illegal chars and reserved words from a candidate filename (should not include the directory path)
/// </summary>
/// <remarks>
/// http://stackoverflow.com/questions/309485/c-sharp-sanitize-file-name
/// </remarks>
public static string CoerceValidFileName(string filename)
{
var invalidChars = Regex.Escape(new string(Path.GetInvalidFileNameChars()));
var invalidReStr = string.Format(@"[{0}]+", invalidChars);
var reservedWords = new []
{
"CON", "PRN", "AUX", "CLOCK$", "NUL", "COM0", "COM1", "COM2", "COM3", "COM4",
"COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
"LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
};
var sanitisedNamePart = Regex.Replace(filename, invalidReStr, "_");
foreach (var reservedWord in reservedWords)
{
var reservedWordPattern = string.Format("^{0}\\.", reservedWord);
sanitisedNamePart = Regex.Replace(sanitisedNamePart, reservedWordPattern, "_reservedWord_.", RegexOptions.IgnoreCase);
}
return sanitisedNamePart;
}
And these are my unit tests
[Test]
public void CoerceValidFileName_SimpleValid()
{
var filename = @"thisIsValid.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual(filename, result);
}
[Test]
public void CoerceValidFileName_SimpleInvalid()
{
var filename = @"thisIsNotValid\3\\_3.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("thisIsNotValid_3__3.txt", result);
}
[Test]
public void CoerceValidFileName_InvalidExtension()
{
var filename = @"thisIsNotValid.t\xt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("thisIsNotValid.t_xt", result);
}
[Test]
public void CoerceValidFileName_KeywordInvalid()
{
var filename = "aUx.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("_reservedWord_.txt", result);
}
[Test]
public void CoerceValidFileName_KeywordValid()
{
var filename = "auxillary.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("auxillary.txt", result);
}
string clean = String.Concat(dirty.Split(Path.GetInvalidFileNameChars()));
There are a lot of working solutions here. Just for the sake of completeness, here's an approach that doesn't use regex, but uses LINQ:
var invalids = Path.GetInvalidFileNameChars();
filename = invalids.Aggregate(filename, (current, c) => current.Replace(c, '_'));
Also, it's a very short solution ;)
I'm using the System.IO.Path.GetInvalidFileNameChars() method to check invalid characters and I've got no problems.
I'm using the following code:
foreach( char invalidchar in System.IO.Path.GetInvalidFileNameChars())
{
filename = filename.Replace(invalidchar, '_');
}
I wanted to retain the characters in some way, not just simply replace the character with an underscore.
One way I thought was to replace the characters with similar looking characters which are (in my situation), unlikely to be used as regular characters. So I took the list of invalid characters and found look-a-likes.
The following are functions to encode and decode with the look-a-likes.
This code does not include a complete listing for all System.IO.Path.GetInvalidFileNameChars() characters. So it is up to you to extend or utilize the underscore replacement for any remaining characters.
private static Dictionary<string, string> EncodeMapping()
{
//-- Following characters are invalid for windows file and folder names.
//-- \/:*?"<>|
Dictionary<string, string> dic = new Dictionary<string, string>();
dic.Add(@"\", "Ì"); // U+00CC
dic.Add("/", "Í"); // U+00CD
dic.Add(":", "¦"); // U+00A6
dic.Add("*", "¤"); // U+00A4
dic.Add("?", "¿"); // U+00BF
dic.Add(@"""", "ˮ"); // U+02EE
dic.Add("<", "«"); // U+00AB
dic.Add(">", "»"); // U+00BB
dic.Add("|", "│"); // U+2502
return dic;
}
public static string Escape(string name)
{
foreach (KeyValuePair<string, string> replace in EncodeMapping())
{
name = name.Replace(replace.Key, replace.Value);
}
//-- handle dot at the end
if (name.EndsWith(".")) name = name.CropRight(1) + "°";
return name;
}
public static string UnEscape(string name)
{
foreach (KeyValuePair<string, string> replace in EncodeMapping())
{
name = name.Replace(replace.Value, replace.Key);
}
//-- handle dot at the end
if (name.EndsWith("°")) name = name.CropRight(1) + ".";
return name;
}
You can select your own look-a-likes. I used the Character Map app in windows to select mine %windir%\system32\charmap.exe
As I make adjustments through discovery, I will update this code.
I think the problem is that you first call Path.GetDirectoryName on the bad string. If this has non-filename characters in it, .Net can't tell which parts of the string are directories and throws. You have to do string comparisons.
Assuming it's only the filename that is bad, not the entire path, try this:
public static string SanitizePath(string path, char replaceChar)
{
int filenamePos = path.LastIndexOf(Path.DirectorySeparatorChar) + 1;
var sb = new System.Text.StringBuilder();
sb.Append(path.Substring(0, filenamePos));
for (int i = filenamePos; i < path.Length; i++)
{
char filenameChar = path[i];
foreach (char c in Path.GetInvalidFileNameChars())
if (filenameChar.Equals(c))
{
filenameChar = replaceChar;
break;
}
sb.Append(filenameChar);
}
return sb.ToString();
}
I have had success with this in the past.
Nice, short and static :-)
public static string returnSafeString(string s)
{
foreach (char character in Path.GetInvalidFileNameChars())
{
s = s.Replace(character.ToString(),string.Empty);
}
foreach (char character in Path.GetInvalidPathChars())
{
s = s.Replace(character.ToString(), string.Empty);
}
return (s);
}
Here's an efficient lazy loading extension method based on Andre's code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace LT
{
public static class Utility
{
static string invalidRegStr;
public static string MakeValidFileName(this string name)
{
if (invalidRegStr == null)
{
var invalidChars = System.Text.RegularExpressions.Regex.Escape(new string(System.IO.Path.GetInvalidFileNameChars()));
invalidRegStr = string.Format(@"([{0}]*\.+$)|([{0}]+)", invalidChars);
}
return System.Text.RegularExpressions.Regex.Replace(name, invalidRegStr, "_");
}
}
}
Your code would be cleaner if you appended the directory and filename together and sanitized that rather than sanitizing them independently. As for sanitizing away the :, just take the 2nd character in the string. If it is equal to "replacechar", replace it with a colon. Since this app is for your own use, such a solution should be perfectly sufficient.
using System;
using System.IO;
using System.Linq;
using System.Text;
public class Program
{
public static void Main()
{
try
{
var badString = "ABC\\DEF/GHI<JKL>MNO:PQR\"STU\tVWX|YZA*BCD?EFG";
Console.WriteLine(badString);
Console.WriteLine(SanitizeFileName(badString, '.'));
Console.WriteLine(SanitizeFileName(badString));
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
private static string SanitizeFileName(string fileName, char? replacement = null)
{
if (fileName == null) { return null; }
if (fileName.Length == 0) { return ""; }
var sb = new StringBuilder();
var badChars = Path.GetInvalidFileNameChars().ToList();
foreach (var @char in fileName)
{
if (badChars.Contains(@char))
{
if (replacement.HasValue)
{
sb.Append(replacement.Value);
}
continue;
}
sb.Append(@char);
}
return sb.ToString();
}
}
Based on @fiat's and @Andre's approaches, I'd like to share my solution too.
Main differences:
it's an extension method
the regex is compiled once, on first use, to save time over many executions
reserved words are preserved
public static class StringPathExtensions
{
private static Regex _invalidPathPartsRegex;
static StringPathExtensions()
{
var invalidReg = System.Text.RegularExpressions.Regex.Escape(new string(Path.GetInvalidFileNameChars()));
_invalidPathPartsRegex = new Regex($"(?<reserved>^(CON|PRN|AUX|CLOCK\\$|NUL|COM0|COM1|COM2|COM3|COM4|COM5|COM6|COM7|COM8|COM9|LPT0|LPT1|LPT2|LPT3|LPT4|LPT5|LPT6|LPT7|LPT8|LPT9))|(?<invalid>[{invalidReg}:]+|\\.$)", RegexOptions.Compiled);
}
public static string SanitizeFileName(this string path)
{
return _invalidPathPartsRegex.Replace(path, m =>
{
if (!string.IsNullOrWhiteSpace(m.Groups["reserved"].Value))
return string.Concat("_", m.Groups["reserved"].Value);
return "_";
});
}
}
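One caveat with all of the answers above: Path.GetInvalidFileNameChars() is platform-dependent, so the same code can produce different results on different operating systems. If the files must be valid on Windows regardless of where the code runs, one option is to hard-code the character set. This is a sketch under that assumption; FileNameSanitizer and Sanitize are hypothetical names, and the list below is the printable set quoted earlier in this thread (\/:*?"<>|) plus control characters. It does not handle reserved names such as CON or AUX.

```csharp
using System;
using System.Text;

public static class FileNameSanitizer
{
    // Assumption: the printable characters Windows rejects in file names.
    private static readonly char[] WindowsInvalid =
        { '\\', '/', ':', '*', '?', '"', '<', '>', '|' };

    // Replaces invalid characters in a file name (not a full path).
    public static string Sanitize(string fileName, char replacement = '_')
    {
        if (string.IsNullOrEmpty(fileName)) return fileName;

        var sb = new StringBuilder(fileName.Length);
        foreach (char c in fileName)
        {
            // Control characters (below 32) are treated as invalid too.
            bool bad = c < 32 || Array.IndexOf(WindowsInvalid, c) >= 0;
            sb.Append(bad ? replacement : c);
        }

        // Windows does not accept names that end in a dot or a space.
        return sb.ToString().TrimEnd('.', ' ');
    }
}
```

A name that consists only of dots or spaces comes back empty, so callers may want to substitute a default name in that case.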
