Unable to extract a substring from a string - c#

I have a long string and I want to pass it to another function in chunks of 250 characters at a time. I have written this code:
var cStart = 0;
var phase = 250;
var cEnd = cStart + phase;
var count = 0;
while (count < 10000)
{
string fileInStringTemp = "";
fileInStringTemp = fileInString.Substring(cStart, cEnd);
var lngth = fileInStringTemp.Length;
//Do Some Work
cStart += phase;
cEnd += phase;
count++;
}
In the first iteration of the loop the value of lngth is 250, which is fine. In the next iteration I also expect 250, because I am extracting the substring from characters 250 to 500, but surprisingly the value of lngth in the second iteration is 500.
Why is that? I also tried re-initializing the string variable on every iteration so that it starts empty, but that made no difference.

Substring's second parameter is the length you want, not the stop index.
public string Substring(
int startIndex,
int length
)
So, all you need to do is change your code to have the start index and length (phase)
fileInString.Substring(cStart, phase)
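A minimal sketch of the whole chunking loop, assuming fileInString holds the text and that the final chunk may be shorter than 250 characters. The Chunker and Chunks names are hypothetical. Math.Min clamps the requested length so the last call does not ask Substring for more characters than remain, which would otherwise throw an ArgumentOutOfRangeException.

```csharp
using System;
using System.Collections.Generic;

static class Chunker
{
    // Yields consecutive pieces of 'text', each at most 'size' characters long.
    public static IEnumerable<string> Chunks(string text, int size)
    {
        for (int start = 0; start < text.Length; start += size)
        {
            // The second argument of Substring is a length, clamped to what is left.
            int length = Math.Min(size, text.Length - start);
            yield return text.Substring(start, length);
        }
    }

    static void Main()
    {
        string fileInString = new string('x', 600); // stand-in for the real input
        foreach (string chunk in Chunks(fileInString, 250))
        {
            Console.WriteLine(chunk.Length); // prints 250, 250, 100
        }
    }
}
```

Driving the loop by the string length rather than a fixed count of 10000 also means it stops as soon as the input is exhausted.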

Here is the MSDN link about how to work with Substring:
https://msdn.microsoft.com/en-us/library/aka44szs(v=vs.110).aspx
According to MSDN, the first parameter of the Substring method is startIndex, defined as "the zero-based starting character position of a substring", and the second parameter is length, defined as "the number of characters in the substring".
So you should try this:
var cStart = 0;
var phase = 250;
var count = 0;
while (count < 10000)
{
string fileInStringTemp = "";
fileInStringTemp = fileInString.Substring(cStart, phase);
var lngth = fileInStringTemp.Length;
//Do Some Work
count++;
cStart = phase * count; // next chunk starts right after the previous one
}

Try changing
fileInStringTemp = fileInString.Substring(cStart, cEnd);
to
fileInStringTemp = fileInString.Substring(cStart, phase);

The 2nd parameter of the Substring() method is the length of the substring to return. (You should be able to always use 250 and just keep shifting your starting point, the 1st parameter, until you are done.)

Substring has the parameters (startIndex, length), so you should not increment the end value. Change the call to Substring(cStart, phase).

Related

How to swap huge string

I have a huge string, for example (this is just a part of it; the rest looks the same with slightly different values):
string numbers = "48.7465504247904 9.16364437205161 48.7465666577545 9.16367275419435 48.746927738083 9.16430761814855 48.7471066512883 9.16462219521963 48.7471147950429";
I have to swap whole numbers with each other. For example, the output should be:
9.16364437205161 48.7465504247904
That is, I need to swap the first and the second part.
I tried to split the string and then replace the old value with the new one:
string numbers = "48.7465504247904 9.16364437205161 48.7465666577545 9.16367275419435 48.746927738083 9.16430761814855 48.7471066512883 9.16462219521963 48.7471147950429";
string output = "";
double first = 0;
double second = 0;
for (int i = 0; i < numbers.Length; i++)
{
numbers.Split(' ');
first = numbers[0];
second = numbers[1];
}
output = numbers.Replace(numbers[1], numbers[0]);
Console.WriteLine(output);
But after the loop my variable first always has the value 52.
Right now my output is: 44.7465504247904 9.16364437205161. It changed the first part, and it somehow subtracts 4.
You're not assigning anything to the value coming back from .Split and, if I read this right, you're also iterating each character in the numbers array for unclear reasons.
Using .Split is all you need ... well, and System.Linq
using System.Linq;
// ...
string SwapNumbers(string numbers) {
return string.Join(" ", numbers.Split(' ').Reverse());
}
The above assumes you want to reverse the whole series of numbers. It absolutely does not swap 1,2 then swap 3,4 etc. If that's what you're looking for, it's a bit more involved and I'll add that in a second for funsies.
string SwapAlternateNumbers(string numbersInput) {
var wholeSeries = numbersInput.Split(' ').ToList();
// odd number of inputs
if (wholeSeries.Count % 2 != 0) {
throw new InvalidOperationException("I'm not handling this use case for you.");
}
var result = new StringBuilder();
for(var i = 0; i < wholeSeries.Count - 1; i += 2) {
// append the _second_ number
result.Append(wholeSeries[i+1]).Append(" ");
// append the _first_ number
result.Append(wholeSeries[i]).Append(" ");
}
// assuming you want the whole thing as a string
return result.ToString();
}
Edit: converted back to input and output string. Sorry about the enumerables; that's a difficult habit to break.
Here is a complete example:
public static void Main()
{
string nums = "48.7465504247904 9.16364437205161 48.7465504247904 9.16364437205161";
var numbers = nums.Split(' ');
var swapped = numbers.Reverse();
Console.WriteLine("Hello World {"+string.Join(" ",swapped.ToArray())+"}");
}
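A minimal sketch of the pairwise swap (1 with 2, 3 with 4, and so on) that avoids the trailing space a StringBuilder loop leaves behind. The PairSwap and SwapPairs names are hypothetical, and leaving an unpaired last value where it is, rather than throwing, is an assumption made here because the sample input has an odd number of values.

```csharp
using System;

static class PairSwap
{
    static string SwapPairs(string input)
    {
        string[] parts = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        // Swap neighbours in place; a trailing unpaired value is left where it is.
        for (int i = 0; i + 1 < parts.Length; i += 2)
        {
            string tmp = parts[i];
            parts[i] = parts[i + 1];
            parts[i + 1] = tmp;
        }
        return string.Join(" ", parts);
    }

    static void Main()
    {
        Console.WriteLine(SwapPairs("48.7465504247904 9.16364437205161 48.7465666577545"));
        // prints "9.16364437205161 48.7465504247904 48.7465666577545"
    }
}
```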

count the number of points at the end of a string

I need to count the number of points at the END of string.
Points in the middle of the string are not relevant and should not be counted.
How can this be done?
string sample = "This.is.a.sample.string.....";
For the example above the correct answer would be 5, because there are 5 points at the end of the string.
For performance reasons I would prefer a fast solution. I don't know whether a regular expression such as
\.*$
should be used in such a case.
Start from the end of the string and go back char by char until it's not a dot:
string sample = "This.is.a.sample.string.....";
int count = 0;
for (int i = sample.Length - 1; i >= 0; i--)
{
if (sample[i] != '.') break;
count++;
}
Using Linq:
var test = "this.is.a.test........";
var count = test.ToCharArray().Reverse().TakeWhile(q => q == '.').Count();
Convert string to array, reverse, then take while character = '.'. Count result.
A simple solution using an extension method.
var test = "this.is.a.test........";
Console.WriteLine(test.CountTrailingDots());
public static int CountTrailingDots(this string value)
{
return value.Length - value.TrimEnd('.').Length;
}
Using Regex:
int points = Regex.Match("This.is.a.sample.string....", @"^[\w\W]*?([.]*)$").Groups[1].Value.Length;
Description:
* = Matches as many characters as possible (greedy).
*? = Matches as few characters as possible (lazy).
It could be something like this:
string sample = "This.is.a.sample.string.....";
int count = 0;
if(sample.EndsWith("."))
count = sample.Substring(sample.TrimEnd('.').Length).Length;

How can i parse text from string and add it to a List<string> using indexof and substring?

The code:
int index = 0;
List<string> Names = new List<string>();
while (index != -1)
{
string firstTag = "a title";
string endTag = "href";
string forums = webBrowser1.DocumentText;
index = forums.IndexOf(firstTag);
int index1 = forums.IndexOf(endTag, index);
string Count = forums.Substring(index + 9, ((index1 - 35) - index));
Names.Add(forumsCount);
}
In this case I want to use IndexOf and Substring.
The way I did it, I get an endless loop and a very large Names list in which every entry is the same, because the index never moves forward.
Looks like you never move forward the starting point. You need to use IndexOf(String, Int32) when getting the first index and specify where to start the search, otherwise you'll just keep getting the same result.
Something like this:
const string openingTag = "a title=\"";
const string closingTag = "\" href";
var html = " sadsffdaf a title=\"מכבי תאמכ\" href, a title=\" תאמכ\" href, a title=\"מכבי \" href";
var names = new List<string>();
var index = 0;
var previousIndex = 0;
while (index > -1)
{
index = html.IndexOf(openingTag, previousIndex);
if (index == -1)
continue;
var secondIndex = html.IndexOf(closingTag, index);
var result = html.Substring(index + openingTag.Length, secondIndex - (index + openingTag.Length));
names.Add(result);
previousIndex = index + 1;
}
EDIT: I updated code to include an example HTML string I tested against as per your comment.
I also updated the substring to get the text between the two tags. I assume this is what you want to do?
Also, in your question you're taking the first index from 'nums' and the second tag from 'forums'. I'm guessing this was a typo?
I'm not sure I can help any further without seeing the actual HTML you are parsing.
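A minimal sketch of the same IndexOf/Substring loop with two extra guards, assuming the markup looks like the sample above. The TitleExtractor and ExtractTitles names are hypothetical. The loop stops when either tag is missing, and the next search resumes after the closing tag, so the same match is never found twice.

```csharp
using System;
using System.Collections.Generic;

static class TitleExtractor
{
    static List<string> ExtractTitles(string html)
    {
        const string openingTag = "a title=\"";
        const string closingTag = "\" href";
        var names = new List<string>();
        int searchFrom = 0;
        while (true)
        {
            int open = html.IndexOf(openingTag, searchFrom, StringComparison.Ordinal);
            if (open == -1) break;              // no more opening tags
            int valueStart = open + openingTag.Length;
            int close = html.IndexOf(closingTag, valueStart, StringComparison.Ordinal);
            if (close == -1) break;             // unterminated tag: stop instead of throwing
            names.Add(html.Substring(valueStart, close - valueStart));
            searchFrom = close + closingTag.Length; // move past this match
        }
        return names;
    }

    static void Main()
    {
        var html = "x a title=\"first\" href, a title=\"second\" href";
        foreach (var name in ExtractTitles(html))
            Console.WriteLine(name); // prints "first", then "second"
    }
}
```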

erroneous character fixing of strings in c#

I have five strings like below,
ABBCCD
ABBDCD
ABBDCD
ABBECD
ABBDCD
All the strings are basically the same except for the fourth character. Only the character that appears most often in a position should be kept there. For example, here D appears 3 times in the fourth position, so the final string will be ABBDCD. I wrote the following code, but it seems inefficient in terms of time, and this function can be called millions of times. What should I do to improve the performance?
Here changedString is the string to be matched against the other 5 strings. If any position of changedString does not match the others, the most frequently occurring character is placed in changedString.
len is the length of the strings, which is the same for all strings.
for (int i = 0; i < len;i++ )
{
String findDuplicate = string.Empty + changedString[i] + overlapStr[0][i] + overlapStr[1][i] + overlapStr[2][i] +
overlapStr[3][i] + overlapStr[4][i];
char c = findDuplicate.GroupBy(x => x).OrderByDescending(x => x.Count()).First().Key;
if(c!=changedString[i])
{
if (i > 0)
{
changedString = changedString.Substring(0, i) + c +
changedString.Substring(i + 1, changedString.Length - i - 1);
}
else
{
changedString = c + changedString.Substring(i + 1, changedString.Length - 1);
}
}
//string cleanString = new string(findDuplicate.ToCharArray().Distinct().ToArray());
}
I'm not quite sure what you are going to do, but if it is about sorting strings by some n-th character, then the best way is to use Counting Sort http://en.wikipedia.org/wiki/Counting_sort It is used for sorting array of small integers and is quite fine for chars. It has linear O(n) time. The main idea is that if you know all your possible elements (looks like they can be only A-Z here) then you can create an additional array and count them. For your example it will be {0, 0, 1 ,3 , 1, 0,...} if we use 0 for 'A', 1 for 'B' and so on.
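The counting idea described above (only the counting step, not a full counting sort) can be sketched as follows. This assumes every string has the same length and contains only the characters 'A' to 'Z'; the MajorityByPosition and Majority names are hypothetical.

```csharp
using System;

static class MajorityByPosition
{
    // Assumes all rows have equal length and contain only 'A'..'Z'.
    static string Majority(string[] rows)
    {
        int len = rows[0].Length;
        var result = new char[len];
        var counts = new int[26];              // one counter per letter
        for (int i = 0; i < len; i++)
        {
            Array.Clear(counts, 0, counts.Length);
            int best = 0;
            foreach (string row in rows)
            {
                char c = row[i];
                int n = ++counts[c - 'A'];     // 0 for 'A', 1 for 'B', ...
                if (n > best) { best = n; result[i] = c; }
            }
        }
        return new string(result);
    }

    static void Main()
    {
        var rows = new[] { "ABBCCD", "ABBDCD", "ABBDCD", "ABBECD", "ABBDCD" };
        Console.WriteLine(Majority(rows)); // prints "ABBDCD"
    }
}
```

Writing into a char array and building the string once at the end also avoids the repeated Substring concatenations in the original loop.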
There is a function that might help performance-wise as it runs five times faster. The idea is to count occurrences yourself using a dictionary to convert character to a position into counting array, increment value at this position and check if it is greater than previously highest number of occurrences. If it is, current character is top and is stored as result. This repeats for each string in overlapStr and for each position within the strings. Please read comments inside code to see details.
string HighestOccurrenceByPosition(string[] overlapStr)
{
int len = overlapStr[0].Length;
// Dictionary transforms character to offset into counting array
Dictionary<char, int> char2offset = new Dictionary<char, int>();
// Counting array. Each character has an entry here
int[] counters = new int[overlapStr.Length];
// Highest occurrence characters found so far
char[] topChars = new char[len];
for (int i = 0; i < len; ++i)
{
char2offset.Clear();
// faster! char2offset = new Dictionary<char, int>();
// Highest number of occurrences at the moment
int highestCount = 0;
// Allocation of counters - as previously unseen character arrives
// it is given a slot at this offset
int lastOffset = 0;
// Current offset into "counters"
int offset = 0;
// Small optimization. As your data seems very similar, this helps
// to reduce number of expensive calls to TryGetValue
// You might need to remove this optimization if you don't have
// unused value of char in your dataset
char lastChar = (char)0;
for (int j = 0; j < overlapStr.Length; ++ j)
{
char thisChar = overlapStr[j][i];
// If this is the same character as last one
// Offset already points to correct cell in "counters"
if (lastChar != thisChar)
{
// Get offset
if (!char2offset.TryGetValue(thisChar, out offset))
{
// First time seen - allocate & initialize cell
offset = lastOffset;
counters[offset] = 0;
// Map character to this cell
char2offset[thisChar] = lastOffset++;
}
// This is now last character
lastChar = thisChar;
}
// increment and get count for character
int charCount = ++counters[offset];
// This is now highestCount.
// TopChars receives current character
if (charCount > highestCount)
{
highestCount = charCount;
topChars[i] = thisChar;
}
}
}
return new string(topChars);
}
P.S. This is certainly not the best solution. But as it is significantly faster than original I thought I should help out.

c# getting a string within another string

i have a string like this:
some_string = "A simple demo of SMS text messaging.\r\n+CMGW: 3216\r\n\r\nOK\r\n"
I'm coming from VB.NET and I need to know, in C#, how to get "3216" out of there if I know the position of CMGW.
I know that my start should be the position of CMGW + 6, but how do I make it stop as soon as it finds "\r"?
Again, my end result should be 3216.
Thank you!
Find the index of \r from the start of where you're interested in, and use the Substring overload which takes a length:
// Production code: add validation here.
// (Check for each index being -1, meaning "not found")
int cmgwIndex = text.IndexOf("CMGW: ");
// Just a helper variable; makes the code below slightly prettier
int startIndex = cmgwIndex + 6;
int crIndex = text.IndexOf("\r", startIndex);
string middlePart = text.Substring(startIndex, crIndex - startIndex);
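A minimal sketch of the validation mentioned in the comments above, assuming that returning null is an acceptable way to signal "not found". The CmgwParser and ExtractCmgw names are hypothetical.

```csharp
using System;

static class CmgwParser
{
    // Returns the text between "CMGW: " and the next '\r', or null if either is missing.
    static string ExtractCmgw(string text)
    {
        const string marker = "CMGW: ";
        int markerIndex = text.IndexOf(marker, StringComparison.Ordinal);
        if (markerIndex == -1) return null;    // marker not present
        int startIndex = markerIndex + marker.Length;
        int crIndex = text.IndexOf('\r', startIndex);
        if (crIndex == -1) return null;        // no terminating carriage return
        return text.Substring(startIndex, crIndex - startIndex);
    }

    static void Main()
    {
        string text = "A simple demo of SMS text messaging.\r\n+CMGW: 3216\r\n\r\nOK\r\n";
        Console.WriteLine(ExtractCmgw(text)); // prints "3216"
    }
}
```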
If you know the position of 3216 then you can just do the following
string inner = some_string.Substring(positionOfCmgw + 6, 4);
This code will take the substring of some_string starting at the given position and only taking 4 characters.
If you want to be more general you could do the following
int start = positionOfCmgw+6;
int endIndex = some_string.IndexOf('\r', start);
int length = endIndex - start;
string inner = some_string.Substring(start, length);
One option would be to start from your known index and read characters until you hit a non-numeric value. Not the most robust solution, but it will work if you know your input's always going to look like this (i.e., no decimal points or other non-numeric characters within the numeric part of the string).
Something like this:
public static int GetNumberAtIndex(this string text, int index)
{
if (index < 0 || index >= text.Length)
throw new ArgumentOutOfRangeException("index");
var sb = new StringBuilder();
for (int i = index; i < text.Length; ++i)
{
char c = text[i];
if (!char.IsDigit(c))
break;
sb.Append(c);
}
if (sb.Length > 0)
return int.Parse(sb.ToString());
else
throw new ArgumentException("Unable to read number at the specified index.");
}
Usage in your case would look like:
string some_string = @"A simple demo of SMS text messaging.\r\n+CMGW: 3216\r\n...";
int index = some_string.IndexOf("CMGW") + 6;
int value = some_string.GetNumberAtIndex(index);
Console.WriteLine(value);
Output:
3216
If you're looking to extract the number portion of 'CMGW: 3216' then a more reliable method would be to use regular expressions. That way you can look for the entire pattern, and not just the header.
var some_string = "A simple demo of SMS text messaging.\r\n+CMGW: 3216\r\n\r\nOK\r\n";
var match = Regex.Match(some_string, @"CMGW\: (?<number>[0-9]+)", RegexOptions.Multiline);
var number = match.Groups["number"].Value;
More general, if you don't know the start position of CMGW but the structure remains as before.
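string s = some_string; // the input text from the question
char[] separators = { '\r' };
var parts = s.Split(separators);
// Take the part containing "CMGW" and read backwards from its end up to the last space
var digits = parts.Single(part => part.Contains("CMGW")).Reverse().TakeWhile(c => c != ' ').Reverse();
string number = new string(digits.ToArray());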
