Unable to collect substring from a string - c#

I am extracting a substring from a string that comes from a Word file, but I get an index-out-of-range error even though the starting and ending indices of the substring are both less than the length of the string.
for(int i=0;i<y.Length-1;i++)
{
if (Regex.IsMatch(y[i], @"^[A]"))
{
NumberOfWords= y[i].Split(' ').Length;
if (NumberOfWords > 5)
{
int le = y[i].Length;
int indA = y[i].IndexOf("A");
int indB = y[i].IndexOf("B");
int indC = y[i].IndexOf("C");
int indD = y[i].IndexOf("D");
//if (indD > 1 && indC > 1)
// breakop2 = breakop2 + '\n' + '\n' + y[i].Substring(indC, indD);
if (indC > 1 && indB > 1)
breakop1 = breakop1 + '\n' + y[i].Substring(indB, indC);
if (indB > 1)
sr = y[i].Substring(indA, indB);
else
sr = y[i];
breakop = breakop +'\n'+'\n'+ sr;
Acount++;
//textBox1.Text = s[i];
check1 = check1 + '\n' + '\n' + y[i];
//i++;
}
}
}

String.Substring(int, int) doesn't take a start index and an end index (as it does in Java); it takes a start index and a length. So perhaps you want:
sr = y[i].Substring(indA, indB - indA);
But you should also check that indB is greater than indA. (You need to work out how you want this to behave if B comes before A, basically.)
You'd also need to apply the same behaviour for the Substring(indB, indC).
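A minimal sketch of the index-to-length conversion described above. The helper name Between and the sample line are made up for illustration and are not part of the original code; returning null for missing or out-of-order indices is just one possible choice.

```csharp
using System;

static class SubstringDemo
{
    // Hypothetical helper: returns the text from index start up to (not including) index end,
    // or null when the indices are missing or out of order.
    static string Between(string s, int start, int end)
    {
        if (start < 0 || end < start || end > s.Length)
            return null;
        // Substring takes (startIndex, length), so convert the end index to a length.
        return s.Substring(start, end - start);
    }

    static void Main()
    {
        string line = "A first B second C third"; // made-up sample line
        int indA = line.IndexOf("A");
        int indB = line.IndexOf("B");
        Console.WriteLine(Between(line, indA, indB)); // prints "A first "
    }
}
```

The same helper would cover the Substring(indB, indC) case, and the null result gives you a place to decide what should happen when B comes before A.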

The String.Substring method takes a starting index and a length. You are passing in two indices.

Related

Missing characters after string concatenate

I have a problem where the letter at a given position (e.g. 39) is replaced by the text I want to insert. What I want is to insert the text at position 39 rather than replace the character there. Could anyone please guide me on this?
string description = variables[1]["value"].ToString();// where I get the text
int nInterval = 39;// for every 39 characters in the text I would have a newline
string res = String.Concat(description.Select((c, z) => z > 0 && (z % nInterval) == 0 ? Environment.NewLine +"Hello"+(((z/ nInterval)*18)+83).ToString()+"world": c.ToString()));
file_lines = file_lines.Replace("<<<terms_conditions>>>",resterms); //file_lines is where I read the text file
Original text
Present this redemption slip to receive: One
After String.Concat
Present this redemption slip to receive\r\n\u001bHello101world
One //: is gone
I also have an issue where I want to insert a newline wherever the text contains a *. If anybody is able to help, that would be great.
Edit:
What I want to achieve is something like this
Input
*Item is considered paid if unsealed.*No replacement or compensation will be given for any expired coupons.
So I need to insert a newline at every 39th character and also before each *, so that it becomes
Output
*Item is considered paid if unsealed.
*No replacement or compensation will be
given for any expired coupons.
Try String.Insert(Int32, String) Method
Insert \n where you need new line.
If I understood your question properly, you want a newline after every 39 characters. You can use string.Insert(Int32, String) method for that.
And use String.Replace(String, String) for your * problem.
The code snippet below does that using a simple for loop.
string sampleStr = "Lorem Ipsum* is simply..";
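// Step by 39 plus the newline length, because each insert shifts the remaining text.
for (int i = 39; i < sampleStr.Length; i += 39 + Environment.NewLine.Length){
sampleStr = sampleStr.Insert(i, Environment.NewLine);
}
//sampleStr = sampleStr.Replace("*", Environment.NewLine);
int[] indexes = Enumerable.Range(0, sampleStr.Length).Where(x => sampleStr[x] == '*').ToArray();
// Walk the indexes from last to first so earlier positions stay valid after each insert.
for (int i = indexes.Length - 1; i >= 0; i--)
{
int position = indexes[i];
if (position > 0) sampleStr = sampleStr.Insert(position, Environment.NewLine);
}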
If you want to do both together
int[] indexes = Enumerable.Range(0, sampleStr.Length).Where(x => sampleStr[x] == '*' || x % 39 == 0).ToArray();
int j = 0;
foreach (var position in indexes)
{
if (position > 0)
{
sampleStr = sampleStr.Insert(position + j, Environment.NewLine);
j += Environment.NewLine.Length; // offset later positions by the length of the inserted newline
}
}
Without debating the method chosen to achieve the desired result, the problem with the code is that at the 39th character it adds some text, but the character itself has been forgotten.
Changing the following line should give the expected output.
string res = String.Concat(description.Select((c, z) => z > 0 && (z % nInterval) == 0 ? Environment.NewLine + "Hello" + (((z / nInterval) * 18) + 83).ToString() + "world" + c.ToString() : c.ToString()));
<== UPDATED ANSWER BASED ON CLARIFICATION IN QUESTION ==>
This will do what you want, I believe. See comments in line.
var description = "*Item is considered paid if unsealed.*No replacement or compensation will be given for any expired coupons.";
var nInterval = 39; // for every 39 characters in the text I would have a newline
var newline = "\r\n"; // for clarity in the LINQ statement. Can be set to Environment.NewLine if desired.
var z = 0; // we'll handle the count manually.
var res = string.Concat(
description.Select(
(c) => (++z == nInterval || c == '*') // increment z and check if we've hit the boundary OR if we've hit a *
&& ((z = 0)==0) // resetting the count - this only happens if the first condition was true
? newline + (c == ' ' ? string.Empty : c.ToString()) // if the first character of a newline is a space, we don't need it
: c.ToString()
));
Output:
*Item is considered paid if unsealed.
*No replacement or compensation will be
given for any expired coupons.

How to parse below string in C#?

Could someone please help me parse the sample string below? I'm having difficulty splitting the data, and a carriage return needs to be added at the end of every event.
sample string:
L,030216,182748,00,FF,I,00,030216,182749,00,FF,I,00,030216,182750,00,FF,I,00
batch of events
expected output:
L,030216,182748,00,FF,I,00 - 1st Event
L,030216,182749,00,FF,I,00 - 2nd Event
L,030216,182750,00,FF,I,00 - 3rd Event
Seems like an easy problem. Something as easy as this should do it:
string line = "L,030216,182748,00,FF,I,00,030216,182749,00,FF,I,00,030216,182750,00,FF,I,00";
string[] array = line.Split(',');
StringBuilder sb = new StringBuilder();
for(int i=0; i<array.Length-1;i+=6)
{
sb.AppendLine(string.Format("{0},{1} - {2} event",array[0],string.Join(",",array.Skip(i+1).Take(6)), "number"));
}
output (sb.ToString()):
L,030216,182748,00,FF,I,00 - number event
L,030216,182749,00,FF,I,00 - number event
L,030216,182750,00,FF,I,00 - number event
All you have to do is work on the function that increments the ordinals (1st, 2nd, etc), but that's easy to get.
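For the ordinals, something along these lines might work. This is only a sketch: the helper name ToOrdinal is made up, and it assumes positive English ordinals, with 11, 12 and 13 handled as the usual "th" exceptions.

```csharp
using System;

static class OrdinalDemo
{
    // Hypothetical helper: 1 -> "1st", 2 -> "2nd", 3 -> "3rd", 11 -> "11th", 22 -> "22nd".
    // Assumes n is positive.
    static string ToOrdinal(int n)
    {
        int lastTwo = n % 100;
        if (lastTwo >= 11 && lastTwo <= 13)
            return n + "th"; // 11th, 12th, 13th are exceptions to the last-digit rule
        switch (n % 10)
        {
            case 1: return n + "st";
            case 2: return n + "nd";
            case 3: return n + "rd";
            default: return n + "th";
        }
    }

    static void Main()
    {
        for (int i = 1; i <= 4; i++)
            Console.WriteLine(ToOrdinal(i) + " Event"); // 1st Event, 2nd Event, 3rd Event, 4th Event
    }
}
```

In the loop above, the "number" placeholder could then be replaced by ToOrdinal(i / 6 + 1).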
This should do the trick, given there are no more L's inside your string, and the comma place is always the sixth starting from the beginning of the batch number.
class Program
{
static void Main(string[] args)
{
String batchOfevents = "L,030216,182748,00,FF,I,00,030216,182749,00,FF,I,00,030216,182750,00,FF,I,00,030216,182751,00,FF,I,00,030216,182752,00,FF,I,00,030216,182753,00,FF,I,00";
// take out the "L," to start processing by finding the index of the correct comma to slice.
batchOfevents = batchOfevents.Substring(2);
String output = "";
int index = 0;
int counter = 0;
while (GetNthIndex(batchOfevents, ',', 6) != -1)
{
counter++;
if (counter == 1){
index = GetNthIndex(batchOfevents, ',', 6);
output += "L, " + batchOfevents.Substring(0, index) + " - 1st event\n";
batchOfevents = batchOfevents.Substring(index + 1);
} else if (counter == 2) {
index = GetNthIndex(batchOfevents, ',', 6);
output += "L, " + batchOfevents.Substring(0, index) + " - 2nd event\n";
batchOfevents = batchOfevents.Substring(index + 1);
}
else if (counter == 3)
{
index = GetNthIndex(batchOfevents, ',', 6);
output += "L, " + batchOfevents.Substring(0, index) + " - 3rd event\n";
batchOfevents = batchOfevents.Substring(index + 1);
} else {
index = GetNthIndex(batchOfevents, ',', 6);
output += "L, " + batchOfevents.Substring(0, index) + " - " + counter + "th event\n";
batchOfevents = batchOfevents.Substring(index + 1);
}
}
output += "L, " + batchOfevents + " - " + (counter+1) + "th event\n";
Console.WriteLine(output);
}
public static int GetNthIndex(string s, char t, int n)
{
int count = 0;
for (int i = 0; i < s.Length; i++)
{
if (s[i] == t)
{
count++;
if (count == n)
{
return i;
}
}
}
return -1;
}
}
Now the output will be in the format you asked for, and the original string has been decomposed.
NOTE: the GetNthIndex method was taken from this old post.
If you want to split the string into multiple strings, you need a set of rules
that can be implemented. In your case I would start by splitting the complete
string on the comma, and then go through the elements in a loop.
All the strings in the loop will be appended to a StringBuilder. If your rule set
says you need a new line, just add it via yourBuilder.Append("\r\n") or use AppendLine.
EDIT
Using this method, you can also easily add new characters, such as the leading L or the trailing "rd Event".
1. Look for the start index of 00,FF,I,00 in the entire string.
2. Extract a substring from index 0 up to that index plus 10, which is the length of the characters in step 1.
3. Loop through the string again, each time starting from the index where you left off in step 2.
4. Add a newline character each time.
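The steps above can be sketched roughly as follows. This assumes every event ends with the marker 00,FF,I,00 and that only the first event carries the leading "L,"; the variable names are made up for illustration.

```csharp
using System;
using System.Text;

static class MarkerSplitDemo
{
    static void Main()
    {
        string batch = "L,030216,182748,00,FF,I,00,030216,182749,00,FF,I,00,030216,182750,00,FF,I,00";
        const string marker = "00,FF,I,00"; // assumption: this sequence ends every event
        var sb = new StringBuilder();
        int start = 2; // skip the leading "L,"
        while (start < batch.Length)
        {
            // Find the next end-of-event marker, searching from where the last event stopped.
            int idx = batch.IndexOf(marker, start, StringComparison.Ordinal);
            if (idx < 0) break;
            int end = idx + marker.Length;
            // Substring takes a start index and a length, hence end - start.
            sb.Append("L,").AppendLine(batch.Substring(start, end - start));
            start = end + 1; // skip the comma that separates events
        }
        Console.Write(sb.ToString());
    }
}
```

If the marker values can vary between events, splitting on the comma and grouping six fields at a time is probably the safer approach.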
Try the following:
string stream = "L,030216,182748,00,FF,I,00, 030216,182749,00,FF,I,00, 030216,182750,00,FF,I,00";
string[] lines = SplitLines(stream, "L", "I", ",");
Here the SplitLines function is implemented to detect variable-length events within the arbitrary-formatted stream:
string stream = "A;030216;182748 ;00;FF;AA;01; 030216;182749;AA;02";
string[] lines = SplitLines(stream, "A", "AA", ";");
Split-rules are:
- all elements of input stream are separated by separator(, for example).
- each event is bounded by the special markers(L and I for example)
- the end marker is the second-to-last element of each event
static string[] SplitLines(string stream, string startSeq, string endLine, string separator) {
string[] elements = stream.Split(new string[] { separator }, StringSplitOptions.RemoveEmptyEntries);
int pos = 0;
List<string> line = new List<string>();
List<string> lines = new List<string>();
State state = State.SeqStart;
while(pos < elements.Length) {
string current = elements[pos].Trim();
switch(state) {
case State.SeqStart:
if(current == startSeq)
state = State.LineStart;
else
pos++; // skip elements until the start marker is found, otherwise the loop never advances
continue;
case State.LineStart:
if(++pos < elements.Length) {
line.Add(startSeq);
state = State.Line;
}
continue;
case State.Line:
if(current == endLine)
state = State.LineEnd;
else
line.Add(current);
pos++;
continue;
case State.LineEnd:
line.Add(endLine);
line.Add(current);
lines.Add(string.Join(separator, line));
line.Clear();
state = State.LineStart;
continue;
}
}
return lines.ToArray();
}
enum State { SeqStart, LineStart, Line, LineEnd };

split text in a text file

How can I split a text file that contains sentences of various lengths? When I click button1 on my form, I want to read the text file and extract the words that sit between a starting and an ending ' character and that contain a # or @ symbol. I also want to know which line each word is on, and to write the words to a text file.
Example, lets say I have a text like
abc'123'#def'456''#ghi'
abc'123'#def'#456''#ghi'123456'
output:
1st sentence #ghi
2nd sentence #456 #ghi
PS: #def is not in start and end of ' character so not in the output
I tried the Split function but couldn't make it work, and it turned into a mess :( How can I do this? I would be grateful if someone who knows could help.
Thanks.
Here s is your input string, and str is the quoted text, which should have # or @ as its first character.
int start = s.IndexOf("'");
int end = s.IndexOf("'", start + 1);
string str = s.Substring(start + 1, end - start - 1); // Substring takes a start index and a length
if (str.Length > 0 && (str[0] == '#' || str[0] == '@'))
// proceed
As far as this example is concerned here is a sample code that works
string sen1="abc'123'#def'456''#ghi'";
string sen2 = "abc'123'#def'#456''#ghi'123456'";
string[] NewSen = Regex.Split(sen1, "''");
string YourFirstOP = NewSen[1].TrimEnd('\''); // gets #ghi (the trailing ' is trimmed)
NewSen = Regex.Split(sen2, "''");
string[] A1 = Regex.Split(NewSen[0], "'");
string[] A2 = Regex.Split(NewSen[1], "'");
string YourSecondOP = A1[A1.Length - 1] + " " + A2[A2.Length - 3]; // gets #456 #ghi
But that's just for this example.
Hope this helps
Try this,
string testString = @"abc'123'#def'456''#ghi'abc'123'#def'#456''#ghi'123456'";
List<string> output = new List<string>();
int startIndex = 0;
int endIndex = 0;
while (startIndex >= 0 && endIndex >= 0)
{
startIndex = testString.IndexOf("'", endIndex + 1);
endIndex = testString.IndexOf("'", startIndex + 1);
if (startIndex >= 0 && endIndex >= 0)
{
string str = testString.Substring(startIndex + 1, (endIndex - startIndex) - 1);
int indexOfSpecialChar = str.IndexOf("#");
if (indexOfSpecialChar < 0)
{
indexOfSpecialChar = str.IndexOf("@");
}
if (indexOfSpecialChar >= 0)
{
output.Add(str.Substring(indexOfSpecialChar));
}
}
}
string [] Mass = s.Split('\'');
if (Mass.Length > 1)
for (int i = 1; i < (Mass.Length - 1); i += 2)
{
if (Mass[i].Contains("#") || Mass[i].Contains("@"))
{
// proceed
}
}

Find string in Haystack and display that particular paragraph where the string was found

I have an sql resultset which is retrieved after searching through the database using the LIKE keyword. I want to display the result on a page but without showing the whole text. Just the paragraph where the result was found. Maybe even put that particular word in bold. Anyone with an idea of how best I can implement this?
Get the text into a string.
Split on your paragraph character (line break?) - text.Split('\n')
Iterate over each paragraph
Get the index(es) of your keyword - text.IndexOf("keyword")
Then perform some logic to cut number of characters at the start and end
Insert bold tag with for example a string replace - text = text.Replace("keyword", "<b>keyword</b>")
[Edit - added code sample]
public List<string> HighLightedParagraphs(string word, string text)
{
int charBeforeAndAfter = 100;
List<string> matchParagraphs = new List<string>();
Regex wordMatch = new Regex(@"\b" + Regex.Escape(word) + @"\b", RegexOptions.IgnoreCase); // escape the word so regex metacharacters in it are matched literally
foreach (string paragraph in text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries))
{
int startIdx = -1;
int length = -1;
foreach (Match match in wordMatch.Matches(paragraph))
{
int wordIdx = match.Index;
if (wordIdx >= startIdx && wordIdx <= startIdx + length) continue;
startIdx = wordIdx > charBeforeAndAfter ? wordIdx - charBeforeAndAfter : 0;
length = wordIdx + match.Length + charBeforeAndAfter < paragraph.Length
? wordIdx - startIdx + match.Length + charBeforeAndAfter // context before, the match itself, context after
: paragraph.Length - startIdx;
string extract = wordMatch.Replace(paragraph.Substring(startIdx, length), "<b>" + match.Value + "</b>");
matchParagraphs.Add("..." + extract + "...");
}
}
return matchParagraphs;
}

What is the most efficient way to detect if a string contains a number of consecutive duplicate characters in C#?

For example, a user entered "I love this post!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
the consecutive duplicate exclamation mark "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" should be detected.
The following regular expression would detect repeating chars. You could up the number or limit this to specific characters to make it more robust.
int threshold = 3;
string stringToMatch = "thisstringrepeatsss";
string pattern = "(.)\\1{" + (threshold - 1) + ",}"; // any character followed by at least threshold - 1 copies of itself
Regex r = new Regex(pattern);
Match m = r.Match(stringToMatch);
while(m.Success)
{
Console.WriteLine("character passes threshold " + m.ToString());
m = m.NextMatch();
}
Here's an example of a function that searches for a sequence of consecutive chars of a specified length and also ignores white space characters:
public static bool HasConsecutiveChars(string source, int sequenceLength)
{
if (string.IsNullOrEmpty(source))
return false;
if (source.Length == 1)
return false;
int charCount = 1;
for (int i = 0; i < source.Length - 1; i++)
{
char c = source[i];
if (Char.IsWhiteSpace(c))
continue;
if (c == source[i+1])
{
charCount++;
if (charCount >= sequenceLength)
return true;
}
else
charCount = 1;
}
return false;
}
Edit fixed range bug :/
Can be done in O(n) easily: for each character, if the previous character is the same as the current, increment a temporary count. If it's different, reset your temporary count. At each step, update your global if needed.
For abbccc you get:
a => temp = 1, global = 1
b => temp = 1, global = 1
b => temp = 2, global = 2
c => temp = 1, global = 2
c => temp = 2, global = 2
c => temp = 3, global = 3
=> c appears three times. Extend it to get the position, then you should be able to print the "ccc" substring.
You can extend this to give you the starting position fairly easily, I'll leave that to you.
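One way to extend the counting idea above so that it also tracks the start position is sketched below. The name LongestRun and its variables are made up for illustration; it keeps the start of the current run instead of a bare counter, so the run length is always i - runStart + 1.

```csharp
using System;

static class RunDemo
{
    // Returns the longest run of one repeated character, or "" for empty input.
    static string LongestRun(string s)
    {
        if (string.IsNullOrEmpty(s)) return "";
        int bestStart = 0, bestLen = 1; // the "global" best so far
        int runStart = 0;               // where the current run began
        for (int i = 1; i < s.Length; i++)
        {
            if (s[i] != s[i - 1])
                runStart = i;           // character changed: a new run starts here
            int runLen = i - runStart + 1;
            if (runLen > bestLen)
            {
                bestLen = runLen;
                bestStart = runStart;
            }
        }
        return s.Substring(bestStart, bestLen);
    }

    static void Main()
    {
        Console.WriteLine(LongestRun("abbccc")); // prints "ccc"
    }
}
```

This is still a single pass over the string, so it stays O(n).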
Here is a quick solution I crafted with some extra duplicates thrown in for good measure. As others pointed out in the comments, some duplicates are going to be completely legitimate, so you may want to narrow your criteria to punctuation instead of mere characters.
string input = "I loove this post!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!aa";
int index = -1;
int count =1;
List<string> dupes = new List<string>();
for (int i = 0; i < input.Length-1; i++)
{
if (input[i] == input[i + 1])
{
if (index == -1)
index = i;
count++;
}
else if (index > -1)
{
dupes.Add(input.Substring(index, count));
index = -1;
count = 1;
}
}
if (index > -1)
{
dupes.Add(input.Substring(index, count));
}
A better way, in my opinion, is to create an array in which each element counts one pair of identical adjacent characters: the first for aa, the second for bb, then cc, dd, and so on. Every element starts at 0.
To solve the problem, loop over the string and update the array values.
You can then analyze the array for whatever you need.
Example: for the string bbaaaccccdab, the result array would be { 2, 1, 3 }, because 'aa' is found two times, 'bb' is found once (at the start of the string), and 'cc' is found three times.
Why is 'cc' found three times? Because of 'cc'cc, c'cc'c and cc'cc'.
Use LINQ! (For everything, not just this)
string test = "aabb";
return test.Where((item, index) => index > 0 && item.Equals(test.ElementAt(index - 1)));
// yields 'a' and 'b': each character that is equal to the one before it
OR
string test = "aabb";
return test.Where((item, index) => index > 0 && item.Equals(test.ElementAt(index - 1))).Any();
// returns true
