C# preg_replace? - c#

What is the equivalent of PHP's preg_replace in C#?
I have an array of strings that I would like to replace with another array of strings. Here is an example in PHP. How can I do something like that in C# without using .Replace("old","new")?
$patterns[0] = '/=C0/';
$patterns[1] = '/=E9/';
$patterns[2] = '/=C9/';
$replacements[0] = 'à';
$replacements[1] = 'é';
$replacements[2] = 'é';
return preg_replace($patterns, $replacements, $text);

Real men use regular expressions, but here is an extension method that adds it to String if you want it:
using System;
using System.Text.RegularExpressions;

public static class ExtensionMethods
{
    // Note: .NET patterns have no delimiters, so PHP's '/=C0/' becomes "=C0" here.
    public static String PregReplace(this String input, string[] pattern, string[] replacements)
    {
        if (replacements.Length != pattern.Length)
            throw new ArgumentException("Replacement and Pattern Arrays must be balanced");
        // Apply each pattern in order; later patterns see the output of earlier ones.
        for (var i = 0; i < pattern.Length; i++)
        {
            input = Regex.Replace(input, pattern[i], replacements[i]);
        }
        return input;
    }
}
You use it like this:
class Program
{
    static void Main(string[] args)
    {
        String[] pattern = new String[4];
        String[] replacement = new String[4];
        pattern[0] = "Quick";
        pattern[1] = "Fox";
        pattern[2] = "Jumped";
        pattern[3] = "Lazy";
        replacement[0] = "Slow";
        replacement[1] = "Turtle";
        replacement[2] = "Crawled";
        replacement[3] = "Dead";
        String DemoText = "The Quick Brown Fox Jumped Over the Lazy Dog";
        Console.WriteLine(DemoText.PregReplace(pattern, replacement));
    }
}
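One thing to keep in mind with the loop above is that each replacement sees the output of the previous one. If that is not wanted, a single-pass variant is possible. The following is only a sketch, not part of the answer above: the class name SinglePassReplace and the method ReplaceAll are made up for illustration, and it assumes the "patterns" are literal strings rather than real regular expressions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SinglePassReplace
{
    // Hypothetical helper: replaces every key of `map` found in `input`
    // with its value, scanning the input only once.
    public static string ReplaceAll(string input, IDictionary<string, string> map)
    {
        if (map.Count == 0) return input;

        // Escape the keys so they are matched literally, and try longer keys
        // first so that a short key cannot shadow a longer one.
        string pattern = string.Join("|",
            map.Keys.OrderByDescending(k => k.Length)
                    .Select(k => Regex.Escape(k))
                    .ToArray());

        // The MatchEvaluator looks up the replacement for whichever key matched.
        return Regex.Replace(input, pattern, m => map[m.Value]);
    }
}
```

With a map of "=C0" to "à" and "=E9" to "é", SinglePassReplace.ReplaceAll("Caf=E9 =C0 la carte", map) gives "Café à la carte", and text produced by one replacement is never rewritten by another.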

You can use LINQ (in .NET 3.5 and C# 3) to ease applying functions to members of a collection. For example, Aggregate() folds each old/new pair over the text in turn:
string result = Enumerable.Range(0, patterns.Length).Aggregate(text, (s, i) => s.Replace(patterns[i], replacements[i]));
You don't need regexp support, you just want an easy way to iterate over the arrays.

public static class StringManipulation
{
    public static string PregReplace(string input, string[] pattern, string[] replacements)
    {
        if (replacements.Length != pattern.Length)
            throw new ArgumentException("Replacement and Pattern Arrays must be balanced");
        for (int i = 0; i < pattern.Length; i++)
        {
            input = Regex.Replace(input, pattern[i], replacements[i]);
        }
        return input;
    }
}
Here is what I will use. It is Jonathan Holland's code, but as a plain static method for C# 2.0 instead of a C# 3 extension method :)
Thx all.

You are looking for the System.Text.RegularExpressions namespace:
using System.Text.RegularExpressions;
Regex r = new Regex("=C0");
string output = r.Replace(text, "à");
To get PHP's array behaviour the way you have it, you need multiple instances of Regex, one per pattern.
However, in your example you'd be much better served by .Replace(old, new); it's much faster than compiling state machines.
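If the same set of patterns is applied to many strings, the "multiple instances of Regex" idea can be wrapped so that each pattern is only constructed once. This is a rough sketch under that assumption; MultiReplacer and Apply are hypothetical names, not an existing .NET API.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: builds one Regex per pattern once and reuses them.
public sealed class MultiReplacer
{
    private readonly Regex[] _regexes;
    private readonly string[] _replacements;

    public MultiReplacer(string[] patterns, string[] replacements)
    {
        if (patterns.Length != replacements.Length)
            throw new ArgumentException("patterns and replacements must have the same length");

        _regexes = new Regex[patterns.Length];
        for (int i = 0; i < patterns.Length; i++)
            _regexes[i] = new Regex(patterns[i]);

        // Copy the array so later changes by the caller do not affect us.
        _replacements = (string[])replacements.Clone();
    }

    // Applies every pattern in order, like PHP's preg_replace with arrays.
    public string Apply(string input)
    {
        for (int i = 0; i < _regexes.Length; i++)
            input = _regexes[i].Replace(input, _replacements[i]);
        return input;
    }
}
```

Usage would look like new MultiReplacer(new[] { "=C0", "=E9" }, new[] { "à", "é" }).Apply(text). For purely literal tokens like these, plain String.Replace is still likely to be the simpler choice.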

Edit: Ugh, I just realized this question was for 2.0, but I'll leave this in case you do have access to 3.5.
Just another take on the LINQ thing. I used List<Char> instead of Char[], but that's just to make it look a little cleaner: arrays have no instance IndexOf method (only the static Array.IndexOf), but List does. Why did I need this? From what I can tell, the only correlation between the replacement list and the list of characters to be replaced is the index.
So with that in mind, you can do this with Char[] just fine, but where you see the IndexOf method you have to add a .ToList() before it.
Like this: someArray.ToList().IndexOf
String text;
List<Char> patternsToReplace;
List<Char> patternsToUse;
patternsToReplace = new List<Char>();
patternsToReplace.Add('a');
patternsToReplace.Add('c');
patternsToUse = new List<Char>();
patternsToUse.Add('X');
patternsToUse.Add('Z');
text = "This is a thing to replace stuff with";
var allAsAndCs = text.ToCharArray()
    .Select
    (
        currentItem => patternsToReplace.Contains(currentItem)
            ? patternsToUse[patternsToReplace.IndexOf(currentItem)]
            : currentItem
    )
    .ToArray();
text = new String(allAsAndCs);
This converts the text to a character array and selects through each character. If the current character is not in the list of characters to replace, it is sent back as is. If it is in that list, the character at the same index of the replacement list is returned instead. The last step creates a string from the character array.
The snippet needs these namespaces:
using System;
using System.Collections.Generic;
using System.Linq;
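The index pairing between the two lists can also be expressed as a dictionary, which avoids the Contains/IndexOf scan for every character. This is a sketch of that variation rather than the code above; CharMapper and Map are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CharMapper
{
    // Hypothetical helper: replaces each character that has an entry in
    // `map` with its mapped value and leaves all other characters alone.
    public static string Map(string text, IDictionary<char, char> map)
    {
        char[] mapped = text
            .Select(c =>
            {
                char replacement;
                return map.TryGetValue(c, out replacement) ? replacement : c;
            })
            .ToArray();
        return new string(mapped);
    }
}
```

With 'a' mapped to 'X' and 'c' mapped to 'Z', the sample text becomes "This is X thing to replXZe stuff with".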

Related

Finding longest word in string

Ok, so I know that questions LIKE this have been asked a lot on here, but I can't seem to make solutions work.
I am trying to take a string from a file and find the longest word in that string.
Simples.
I think the issue is down to whether I am calling my methods on a string[] or char[], currently stringOfWords returns a char[].
I am trying to then order by descending length and get the first value but am getting an ArgumentNullException on the OrderByDescending method.
Any input much appreciated.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading.Tasks;
namespace TextExercises
{
    class Program
    {
        static void Main(string[] args)
        {
            var fileText = File.ReadAllText(@"C:\Users\RichardsPC\Documents\TestText.txt");
            var stringOfWords = fileText.ToArray();
            Console.WriteLine("Text in file: " + fileText);
            Console.WriteLine("Words in text: " + fileText.Split(' ').Length);
            // This is where I am trying to solve the problem
            var finalValue = stringOfWords.OrderByDescending(n => n.length).First();
            Console.WriteLine("Largest word is: " + finalValue);
        }
    }
}
Don't split the string, use a Regex
If you care about performance, you don't want to split the string. Split has to traverse the entire string, create a new string for every item it finds and put them all into an array, so it costs O(n) time plus the allocations; ordering the result then takes (at least) another O(n log n) steps.
You can use a Regex instead, which iterates over the string once and keeps only the longest word seen so far:
// \w+ matches one word at a time; there is no trailing \s, so the last word in the text is not missed
var regex = new Regex(@"\w+", RegexOptions.Compiled);
var match = regex.Match(fileText);
var currentLargestString = "";
while (match.Success)
{
    if (match.Value.Length > currentLargestString.Length)
    {
        currentLargestString = match.Value;
    }
    match = match.NextMatch();
}
The nice thing about this is that you don't need to break the string up all at once to do the analysis, and if you need to load the file incrementally it is a fairly easy change: persist the current longest word in an object and run the same loop against each string you load.
If you're set on using an array, don't order it, just iterate over it
You don't need an order by; you're just looking for the largest item. The computational complexity of an order by is in most cases O(n log n), while iterating over the list once is O(n).
var largest = "";
// strArr is the string[] of words, e.g. fileText.Split(' ')
foreach (var item in strArr)
{
    if (item.Length > largest.Length)
        largest = item;
}
ToArray() in this case returns char[], which is an array of individual characters. Instead you need an array of individual words. You can get it like this:
string[] stringOfWords = fileText.Split(' ');
And you have a typo in your lambda expression (the property is Length, with an uppercase L):
n => n.Length
Try this:
var fileText = File.ReadAllText(@"C:\Users\RichardsPC\Documents\TestText.txt");
var words = fileText.Split(' ');
var finalValue = words.OrderByDescending(n => n.Length).First();
Console.WriteLine("Longest word: " + finalValue);
As suggested in the other answer, you need to split your string.
string[] stringOfWords = fileText.Split(new Char[] { ',', ' ' });
// all is well, now let's loop over it and see which is the biggest
int biggest = 0;
int biggestIndex = 0;
for (int i = 0; i < stringOfWords.Length; i++) {
    if (biggest < stringOfWords[i].Length) {
        biggest = stringOfWords[i].Length;
        biggestIndex = i;
    }
}
return stringOfWords[biggestIndex];
What we're doing here is splitting the string on spaces (' ') or commas; you can add as many delimiters as you like. Each word then gets its own slot in the array.
From there, we iterate over the array. If we encounter a word that's longer than the current longest word, we record its length and index.
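Putting the split and the single loop together, a self-contained version might look like the sketch below. LongestWordFinder, Find and the particular separator list are assumptions made for the example; on a tie the first word of that length is returned.

```csharp
using System;

public static class LongestWordFinder
{
    // Hypothetical helper: returns the longest word in `text`,
    // or an empty string when there are no words at all.
    public static string Find(string text)
    {
        // Assumed delimiters; extend the list as needed.
        char[] separators = { ' ', '\t', '\r', '\n', ',', '.' };

        // RemoveEmptyEntries drops the empty strings that consecutive
        // delimiters would otherwise produce.
        string[] words = text.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        string longest = "";
        foreach (string word in words)
        {
            if (word.Length > longest.Length)
                longest = word;
        }
        return longest;
    }
}
```

For example, LongestWordFinder.Find(File.ReadAllText(path)) could replace the OrderByDescending call in the question.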

LINQ or REGEX to extract certain text from a string

I have a string in my C# model populated with this string:
"[{\"ta_id\":97497,\"partner_id\":\"229547\",\"partner_url\":\"http://partner.com/deeplink/to/229547\"},{\"ta_id\":97832,\"partner_id\":\"id34234\",\"partner_url\":\"http://partner.com/deeplink/to/id34234\"}]"
Is there a way, using LINQ or RegEx, that I could parse out the partner_id's - so I ended up with a list object with:
229547
id34234
Thanks for your help, Mark
I have never used any JSON parser but if it comes to Regex you could try something like this:
private static void regexString()
{
    string myString = "[{\"ta_id\":97497,\"partner_id\":\"229547\",\"partner_url\":\"http://partner.com/deeplink/to/229547\"},{\"ta_id\":97832,\"partner_id\":\"id34234\",\"partner_url\":\"http://partner.com/deeplink/to/id34234\"}]";
    string[] stringList = Regex.Split(myString, "},{");
    for (int i = 0; i < stringList.Length; i++)
    {
        stringList[i] = Regex.Split(Regex.Split(stringList[i], "partner_id\\\":\\\"")[1], "\\\",\\\"partner_url\\\"")[0];
    }
}
Also there is a nice website to help you with creating your own regex patterns in the future, check it out:
gskinner.com
And a nice and short tutorial:
www.codeproject.com
Assuming your link always ends with the partner id:
string Name = "[{\"ta_id\":97497,\"partner_id\":\"229547\",\"partner_url\":\"http://partner.com/deeplink/to/229547\"},{\"ta_id\":97832,\"partner_id\":\"id34234\",\"partner_url\":\"http://partner.com/deeplink/to/id34234\"}]";
string[] splittedString = Name.Split('}');
List<string> allIds = new List<string>();
foreach (var i in splittedString)
{
    if (!i.Contains("/")) continue; // skip the trailing "]" segment, which has no link
    var ids = i.Split('/');
    string id = ids[ids.Length - 1].TrimEnd('"'); // drop the quote that closes the URL
    allIds.Add(id);
}
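If that is the general format of the string then this regex should work
(?i)(?<=(partner_id).{5})\w+
From your string this will get
229547 and id34234
(?i) = case insensitivity
(?<=(partner_id).{5}) = positive lookbehind for partner_id followed by any 5 characters, which in this case will be \":\"
\w+ = one or more word characters (letters, digits or underscore)
Note that the 5 assumes the backslashes are literally part of the text; if the string holds plain quotes (":"), only 3 characters separate partner_id from the value, so use .{3} instead.
Hope this helped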
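An alternative that does not depend on counting characters is to capture the value between the quotes directly. The sketch below assumes the string contains plain double quotes (the backslashes in the question being C# escaping only) and that ids never contain a quote; PartnerIdExtractor and Extract are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PartnerIdExtractor
{
    // Hypothetical helper: collects the value of every "partner_id" key.
    public static List<string> Extract(string json)
    {
        var ids = new List<string>();

        // Pattern: "partner_id" : "<anything except a quote>"
        // Group 1 holds the text between the value's quotes.
        foreach (Match m in Regex.Matches(json, "\"partner_id\"\\s*:\\s*\"([^\"]*)\""))
        {
            ids.Add(m.Groups[1].Value);
        }
        return ids;
    }
}
```

This is still pattern matching on JSON text, so it is only as reliable as the assumption that the input keeps this shape.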
Since this is JSON, you probably shouldn't bother trying to get a regex working. Instead, you can parse the JSON and then use LINQ.
using System.Collections.Generic;
using System.Linq;
using System.Web.Script.Serialization; // (in System.Web.Extensions.dll)
...
string s = "[{\"ta_id\":97497,\"partner_id\":\"229547\",\"partner_url\":\"http://partner.com/deeplink/to/229547\"},{\"ta_id\":97832,\"partner_id\":\"id34234\",\"partner_url\":\"http://partner.com/deeplink/to/id34234\"}]";
JavaScriptSerializer j = new JavaScriptSerializer();
object[] objects = (object[])j.DeserializeObject(s);
string[] ids = objects.Cast<Dictionary<string, object>>()
.Select(dict => (string)dict["partner_id"])
.ToArray();
It's a little messy to deserialize it to an object, because you don't have any type information. If you're not afraid of making a small class to deserialize into, you can do something like this:
class Foo
{
    public string partner_id { get; set; }
}
...
JavaScriptSerializer j = new JavaScriptSerializer();
string[] ids = j.Deserialize<Foo[]>(s).Select(x => x.partner_id).ToArray();
Note that there are other options for deserializing JSON. I simply chose the most general-purpose one that's built in.
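As one example of those other options, newer runtimes (.NET Core 3.0 and later, or the System.Text.Json package) ship a JSON reader that can walk the array without defining a class. This is a sketch of that route, not what the answer above used; PartnerIdJson and Extract are hypothetical names, and it assumes every element has a string-valued partner_id.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class PartnerIdJson
{
    // Hypothetical helper: parses the JSON array and reads "partner_id"
    // from each element.
    public static List<string> Extract(string json)
    {
        var ids = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            foreach (JsonElement item in doc.RootElement.EnumerateArray())
            {
                // GetProperty throws if the key is missing; TryGetProperty
                // would be the more forgiving choice for untrusted input.
                ids.Add(item.GetProperty("partner_id").GetString());
            }
        }
        return ids;
    }
}
```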

C# Best way to retrieve strings that's in quotation mark?

Suppose I am given the following text (in a string array):
engine.STEPCONTROL("00000000","02000001","02000043","02000002","02000007","02000003","02000008","02000004","02000009","02000005","02000010","02000006","02000011");
if("02000001" == 1){
dimlevel = 1;
}
if("02000001" == 2){
dimlevel = 3;
}
I'd like to extract the strings that are between the quotation marks and put them in a separate string array. For instance, string[] extracted would contain 00000000, 02000001, 02000043....
What is the best approach for this? Should I use a regular expression to somehow parse those lines and split them?
Personally I don't think a regular expression is necessary. If you can be sure that the input string is always as described and will not have any escape sequences in it or vary in any other way, you could use something like this:
public static string[] ExtractNumbers(string[] originalCodeLines)
{
    List<string> extractedNumbers = new List<string>();
    // Only the first line is examined; splitting on '"' puts the quoted
    // values at the odd positions of the resulting array.
    string[] codeLineElements = originalCodeLines[0].Split('"');
    foreach (string element in codeLineElements)
    {
        int result = 0;
        // Keep only the elements that parse as numbers.
        if (int.TryParse(element, out result))
        {
            extractedNumbers.Add(element);
        }
    }
    return extractedNumbers.ToArray();
}
It's not necessarily the most efficient implementation, but it's quite short and it's easy to see what it does.
That could be
string data = "\"00000000\",\"02000001\",\"02000043\"".Replace("\"", string.Empty);
string[] myArray = data.Split(',');
or in one line
string[] data = "\"00000000\",\"02000001\",\"02000043\"".Replace("\"", string.Empty).Split(',');
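For comparison, the regular expression route the question asks about can be quite short as well. The following sketch collects everything between pairs of double quotes on every line; it assumes the quoted values never contain escaped quotes, and QuotedStrings and Extract are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedStrings
{
    // Hypothetical helper: returns the contents of every "..." pair
    // found in the given lines, in order of appearance.
    public static string[] Extract(IEnumerable<string> lines)
    {
        var found = new List<string>();
        foreach (string line in lines)
        {
            // Group 1 is the text between an opening and a closing quote.
            foreach (Match m in Regex.Matches(line, "\"([^\"]*)\""))
            {
                found.Add(m.Groups[1].Value);
            }
        }
        return found.ToArray();
    }
}
```

Unlike the Split('"') approach, this looks at all lines, so the "02000001" in the if statements is returned again for each occurrence.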

C#/.NET: Reformatting a very long string

I need to read a string, character by character, and build a new string as the output.
What's the best approach to do this in C#?
Use a StringBuilder? Use some writer/stream?
Note that there will be no I/O operations--this is strictly an in-memory transformation.
If the size of the string cannot be determined at compile time and it may also be relatively large, you should use a StringBuilder for concatenation as it acts like a mutable string.
var input = SomeLongString;
// may as well initialize the capacity too,
// as the output length will be 1 to 1 with the unprocessed input.
var sb = new StringBuilder( input.Length );
foreach( char c in input )
{
    sb.Append( Process( c ) );
}
string result = sb.ToString();
If it's just one string, you can use a collection to hold your characters and then create the string using the constructor that takes a char array:
IEnumerable<char> myChars = ...;
string result = new string(myChars.ToArray());
Using LINQ, with the help of a method ProcessChar(char c) that transforms each character to its output value, this could be just a query transformation (string has no constructor that takes an IEnumerable<char>, so the query is materialized with ToArray() first):
string result = new string(sourceString.Select(c => ProcessChar(c)).ToArray());
This allocates an intermediate char array, so it is not necessarily as efficient as a StringBuilder, but it is much more readable in my opinion.
StringBuilder is usually a pretty good bet. I've written lots of JavaScript in web pages using it.
A StringBuilder is a good idea for building your new string, because you can efficiently append new values to it. As for reading the characters from the input string, a StringReader would be a sufficient choice.
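The StringReader plus StringBuilder combination could be sketched as below. CharTransformer, Transform and the process delegate are hypothetical names for illustration; a plain foreach over the string would do the same job, so the reader mainly helps if the code should later accept any TextReader.

```csharp
using System;
using System.IO;
using System.Text;

public static class CharTransformer
{
    // Hypothetical helper: reads `input` one character at a time and
    // appends the processed character to the output.
    public static string Transform(string input, Func<char, char> process)
    {
        // Output length is 1 to 1 with the input, so size the buffer up front.
        var sb = new StringBuilder(input.Length);
        using (var reader = new StringReader(input))
        {
            int next;
            // Read() returns -1 once the end of the string is reached.
            while ((next = reader.Read()) != -1)
            {
                sb.Append(process((char)next));
            }
        }
        return sb.ToString();
    }
}
```

A call such as CharTransformer.Transform(text, char.ToUpperInvariant) upper-cases the text character by character.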
// Written for LINQPad: Dump() is a LINQPad extension method that prints the value.
void Main()
{
    string myLongString = "lf;kajsd;lfkjal;dfkja;lkdfja;lkdjf;alkjdfa";
    var transformedTString = string.Join(string.Empty, myLongString.ToCharArray().Where(x => x != ';'));
    transformedTString.Dump();
}
If you have more complicated logic, you can move your validation to a separate predicate method
void Main()
{
    string myLongString = "lf;kajsd;lfkjal;dfkja;lkdfja;lkdjf;alkjdfa";
    var transformedTString = string.Join(string.Empty, myLongString.ToCharArray().Where(MyPredicate));
    transformedTString.Dump();
}
public bool MyPredicate(char c)
{
    return c != ';';
}
What's the difference between the string you read and the output string? I mean, why do you have to read it char by char?
I use this method for reading a string
string str = "some stuff";
string newStr = ToNewString(str);
string ToNewString(string arg)
{
    string r = string.Empty;
    foreach (char c in arg)
        r += DoWork(c);
    return r;
}
char DoWork(char arg)
{
    // What do you want to do here?
    return arg; // placeholder: return the transformed character
}

C# Regex Split To Java Pattern split

I have to port some C# code to Java and I am having some trouble converting a string splitting command.
While the actual regex is still correct, when splitting in C# the regex tokens are part of the resulting string[], but in Java the regex tokens are removed.
What is the easiest way to keep the split-on tokens?
Here is an example of C# code that works the way I want it:
using System;
using System.Text.RegularExpressions;
class Program
{
    static void Main()
    {
        String[] values = Regex.Split("5+10", @"([\+\-\*\(\)\^\\/])");
        foreach (String value in values)
            Console.WriteLine(value);
    }
}
Produces:
5
+
10
I don't know how C# does it, but to accomplish it in Java, you'll have to approximate it. Look at how this code does it:
public String[] split(String text) {
    if (text == null) {
        text = "";
    }
    int last_match = 0;
    LinkedList<String> splitted = new LinkedList<String>();
    Matcher m = this.pattern.matcher(text);
    // Iterate through each match
    while (m.find()) {
        // Text since last match
        splitted.add(text.substring(last_match, m.start()));
        // The delimiter itself
        if (this.keep_delimiters) {
            splitted.add(m.group());
        }
        last_match = m.end();
    }
    // Trailing text
    splitted.add(text.substring(last_match));
    return splitted.toArray(new String[splitted.size()]);
}
This is because you are capturing the split token. C# takes this as a hint that you wish to retain the token itself as a member of the resulting array. Java does not support this.
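The effect of the capture group on the C# side can be seen by comparing it with a non-capturing group. The snippet below is a small illustration of that difference only; SplitDemo and its two methods are hypothetical names, and the character class is a simplified version of the one in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Capturing group: Regex.Split includes the captured operator
    // in the resulting array.
    public static string[] KeepDelimiters(string expr)
    {
        return Regex.Split(expr, @"([+\-*/^()])");
    }

    // Non-capturing group (?:...): the operators are discarded,
    // which is what a plain Java split produces.
    public static string[] DropDelimiters(string expr)
    {
        return Regex.Split(expr, @"(?:[+\-*/^()])");
    }
}
```

For "5+10", KeepDelimiters returns 5, + and 10, while DropDelimiters returns only 5 and 10.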
