Splitting string into an array on 16th char [duplicate] - c#

This question already has answers here:
Splitting a string into chunks of a certain size
(39 answers)
Split string after certain character count
(4 answers)
Closed 8 years ago.
I have a text file with various 16 char strings both appended to one another and on separate lines. I've done this
FileInfo f = new FileInfo("d:\\test.txt");
string FilePath = ("d:\\test.txt");
string FileText = new System.IO.StreamReader(FilePath).ReadToEnd().Replace("\r\n", "");
CharCount = FileText.Length;
To remove all of the new lines and create one massively appended string. I need now to split this massive string into an array. I need to split it up on the consecutive 16th char until the end. Can anyone guide me in the right direction? I've taken a look at various methods in String such as Split and in StreamReader but am confused as to what the best way to go about it would be. I'm sure it's simple but I can't figure it out.
Thank you.

Adapting the answer from here:
You could try something like so:
string longstr = "thisisaverylongstringveryveryveryveryverythisisaverylongstringveryveryveryveryvery";
IEnumerable<string> splitString = Regex.Split(longstr, "(.{16})").Where(s => s != String.Empty);
foreach (string str in splitString)
{
System.Console.WriteLine(str);
}
Yields:
thisisaverylongs
tringveryveryver
yveryverythisisa
verylongstringve
ryveryveryveryve
ry

One possible solution could look like this (extracted as extension method and made dynamic, in case different token size is needed and no hard-coded dependencies):
public static class ProjectExtensions
{
public static String[] Chunkify(this String input, int chunkSize = 16)
{
// result
var output = new List<String>();
// temp helper
var chunk = String.Empty;
long counter = 0;
// tokenize to 16 chars
input.ToCharArray().ToList().ForEach(ch =>
{
counter++;
chunk += ch;
if ((counter % chunkSize) == 0)
{
output.Add(chunk);
chunk = String.Empty;
}
});
// add the rest
output.Add(chunk);
return output.ToArray();
}
}
The standard usage (16 chars) looks like this:
// 3 inputs x 16 characters and 1 x 10 characters
var input = #"1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF1234567890";
foreach (var chunk in input.Chunkify())
{
Console.WriteLine(chunk);
}
The output is:
1234567890ABCDEF
1234567890ABCDEF
1234567890ABCDEF
1234567890
Usage with different token size:
foreach (var chunk in input.Chunkify(13))
{
Console.WriteLine(chunk);
}
and the corresponding output:
1234567890ABC
DEF1234567890
ABCDEF1234567
890ABCDEF1234
567890
It is not a fancy solution (and could propably be optimised for speed), but it works and it is easy to understand and implement.

Create a list to hold your tokens. Then get subsequent substrings of length 16 and add them to the list.
List<string> tokens = new List<string>();
for(int i=0; i+16<=FileText.Length; i+=16) {
tokens.Add(FileText.Substring(i,16));
}
As mentioned in the comments, this ignores the last token if it has less than 16 characters. If you want it anyway you can write:
List<string> tokens = new List<string>();
for(int i=0; i<FileText.Length; i+=16) {
int len = Math.Min(16, FileText.Length-i));
tokens.Add(FileText.Substring(i,len);
}

Please try this method. I haven't tried it , but used it once.
int CharCount = FileText.Length;
int arrayhold = (CharCount/16)+2;
int count=0;
string[] array = new string[arrayhold];
for(int i=0; i<FileText.Length; i+=16)
{
int currentleft = FileText.Length-(16*count);
if(currentleft>16)
{
array[count]=FileText.Substring(i,16);
}
if(currentleft<16)
{
array[count]=FileText.Substring(i,currentleft);
}
count++;
}
This is the new code and provide a TRUE leftovers handling. Tested in ideone
Hope it works

Try this one:
var testArray = "sdfsdfjshdfalkjsdfhalsdkfjhalsdkfjhasldkjfhasldkfjhasdflkjhasdlfkjhasdlfkjhasdlfkjhasldfkjhalsjfdkhklahjsf";
var i = 0;
var query = from s in testArray
let num = i++
group s by num / 16 into g
select new {Value = new string(g.ToArray())};
var result = query.Select(x => x.Value).ToList();
result is List containing all the 16 char strings.

Related

Detecting and modifying ListBox entries that contain digits

My program has about 25 entries, most of them string only. However, some of them are supposed to have digits in them, and I don't need those digits in the output (output should be string only). So, how can I "filter out" integers from strings?
Also, if I have integers, strings AND chars, how could I do it (for example, one ListBox entry is E#2, and should be renamed to E# and then printed as output)?
Assuming that your entries are in a List<string>, you can loop through the list and then through each character of each entry, then check if it is a number and remove it. Something like this:
List<string> list = new List<string>{ "abc123", "xxx111", "yyy222" };
for (int i = 0; i < list.Count; i++) {
var no_numbers = "";
foreach (char c in list[i]) {
if (!Char.IsDigit(c))
no_numbers += c;
}
list[i] = no_numbers;
}
This only removes digits as it seems you wanted from your question. If you want to remove all other characters except letters, you can change the logic a bit and use Char.IsLetter() instead of Char.IsDigit().
You can remove all numbers from a strings with this LINQ solution:
string numbers = "Ho5w ar7e y9ou3?";
string noNumbers = new string(numbers.Where(c => !char.IsDigit(c)).ToArray());
noNumbers = "How are you?"
But you can also remove all numbers from a string by using a foreach loop :
string numbers = "Ho5w ar7e y9ou3?";
List<char> noNumList = new List<char>();
foreach (var c in numbers)
{
if (!char.IsDigit(c))
noNumList.Add(c);
}
string noNumbers = string.Join("", noNumList);
If you want to remove all numbers from strings inside a collection :
List<string> myList = new List<string>() {
"Ho5w ar7e y9ou3?",
"W9he7re a3re y4ou go6ing?",
"He2ll4o!"
};
List<char> noNumList = new List<char>();
for (int i = 0; i < myList.Count; i++)
{
foreach (var c in myList[i])
{
if(!char.IsDigit(c))
noNumList.Add(c);
}
myList[i] = string.Join("", noNumList);
noNumList.Clear();
}
myList Output :
"How are you?"
"Where are you going?"
"Hello!"
I don't know exactly what is your scenario, but given a string, you can loop through its characters, and if it's a number, discard it from output.
Maybe this is what you're looking for:
string entry = "E#2";
char[] output = new char[entry.Length];
for(int i = 0, j =0; i < entry.Length ; i++)
{
if(!Char.IsDigit(entry[i]))
{
output[j] = entry[i];
j++;
}
}
Console.WriteLine(output);
I've tried to give you a simple solution with one loop and two index variables, avoiding string concatenations that can make performance lacks.
See this example working at C# Online Compiler
If i am not wrong,maybe this is how your list looks ?
ABCD123
EFGH456
And your expected output is :
ABCD
EFGH
Is that correct?If so,assuming that it's a List<string>,then you can use the below code :
list<string> mylist = new list<string>;
foreach(string item in mylist)
{
///To get letters/alphabets
var letters = new String(item.Where(Char.IsLetter).ToArray());
///to get special characters
var letters = new String(item.Where(Char.IsSymbol).ToArray())
}
Now you can easily combine the codes :)

Get a range from two integers

string num = db.SelectNums(id);
string[] numArr = num.Split('-').ToArray();
string num contain for an example "48030-48039";
string[] numArr will therefor contain (48030, 48039).
Now I have two elements, a high and low. I now need to get ALL the numbers from 48030 to 48039. The issue is that it has to be string since there will be telephone numbers with leading zeroes now and then.
Thats why I cannot use Enumerable.Range().ToArray() for this.
Any suggestions? The expected result should be 48030, 48031, 48032, ... , 48039
This should work with your leading zero requirement:
string num = db.SelectNums(id);
string[] split = num.Split('-');
long start = long.Parse(split[0]);
long end = long.Parse(split[1]);
bool includeLeadingZero = split[0].StartsWith("0");
List<string> results = new List<string>();
for(int i = start; i <= end; i++)
{
string result = includeLeadingZero ? "0" : "";
result += i.ToString();
results.Add(result);
}
string[] arrayResults = results.ToArray();
A few things to note:
It assumes your input will be 2 valid integers split by a single hyphen
I am giving you a string array result, personally I would prefer to work with a List<int> in the end
If the first number contains a single leading zero, then all results will contain a single leading zero
It uses long to cater for longer numbers, beware that the max number that will parse is 9,223,372,036,854,775,807, you could double this value (not length) by using ulong
Are you saying this?
int[] nums = new int[numArr.Length];
for (int i = 0; i < numArr.Length; i++)
{
nums[i] = Convert.ToInt32(numArr[i]);
}
Array.Sort(nums);
for (int n = nums[0]; n <= nums[nums.Length - 1]; n++)
{
Console.WriteLine(n);
}
here link
I am expecting your string always have valid two integers, so using Parse instead TryParse
string[] strList = "48030-48039".Split('-').ToArray();
var lst = strList.Select(int.Parse).ToList();
var min = lst.OrderBy(l => l).FirstOrDefault();
var max = lst.OrderByDescending(l => l).FirstOrDefault();
var diff = max - min;
//adding 1 here otherwise 48039 will not be there
var rng = Enumerable.Range(min,diff+1);
If you expecting invalid string
var num = 0;
var lst = (from s in strList where int.TryParse(s, out num) select num).ToList();
This is one way to do it:
public static string[] RangeTest()
{
Boolean leadingZero = false;
string num = "048030-48039"; //db.SelectNums(id);
if (num.StartsWith("0"))
leadingZero = true;
int min = int.Parse(num.Split('-').Min());
int count = int.Parse(num.Split('-').Max()) - min;
if (leadingZero)
return Enumerable.Range(min, count).Select(x => "0" + x.ToString()).ToArray();
else
return Enumerable.Range(min, count).Select(x => "" + x.ToString()).ToArray(); ;
}
You can use string.Format to ensure numbers are formatted with leading zeros. That will make the method work with arbitrary number of leading zeros.
private static string CreateRange(string num)
{
var tokens = num.Split('-').Select(s => s.Trim()).ToArray();
// use UInt64 to allow huge numbers
var start = UInt64.Parse(tokens[0]);
var end = UInt64.Parse(tokens[1]);
// if your start number is '000123', this will create
// a format string with 6 zeros ('000000')
var format = new string('0', tokens[0].Length);
// use StringBuilder to make GC happy.
// (if only there was a Enumerable.Range<ulong> overload...)
var sb = new StringBuilder();
for (var i = start; i <= end; i++)
{
// format ensures that your numbers are padded properly
sb.Append(i.ToString(format));
sb.Append(", ");
}
// trim trailing comma after the last element
if (sb.Length >= 2) sb.Length -= 2;
return sb.ToString();
}
Usage example:
// prints 0000012, 0000013, 0000014
Console.WriteLine( CreateRange("0000012-0000014") );
Three significant issues were brought up in comments:
The phone numbers have enough digits to exceed Int32.MaxValue so
converting to int isn't viable.
The phone numbers can have leading zeros (presumeably for some
international calling?)
The possible range of numbers can theoretically exceed the maximum size of an array (which may have memory issues, and I think may not be represented as a string)
As such, you may need to use long instead of int, and I would suggest using deferred execution if needed for very large ranges.
public static IEnumerable<string> EnumeratePhoneNumbers(string fromAndTo)
{
var split = fromAndTo.Split('-');
return EnumeratePhoneNumbers(split[0], split[1]);
}
public static IEnumerable<string> EnumeratePhoneNumbers(string fromString, string toString)
{
long from = long.Parse(fromString);
long to = long.Parse(toString);
int totalNumberLength = fromString.Length;
for (long phoneNumber = from; phoneNumber <= to; phoneNumber++)
{
yield return phoneNumber.ToString().PadLeft(totalNumberLength, '0');
}
}
This assumes that the padded zeros are already included in the lower bound fromString text. It will iterate and yield out numbers as you need them. This can be useful if you're churning out a lot of numbers and don't need to fill up memory with them, or if you just need the first 10 or 100. For example:
var first100Numbers =
EnumeratePhoneNumbers("0018155500-7018155510")
.Take(100)
.ToArray();
Normally that range would produce 7 billion results which cannot be stored in an array, and might run into memory issues (I'm not even sure if it can be stored in a string); by using deferred execution, you only create the 100 needed.
If you do have a small range, you can still join up your results into a string as you desired:
string numberRanges = String.Join(", ", EnumeratePhoneNumbers("0018155500-0018155510"));
And naturally you can put this array creation into your own helper method:
public static string GetPhoneNumbersListing(string fromAndTo)
{
return String.Join(", ", EnumeratePhoneNumbers("0018155500-0018155510"));
}
So your usage would be:
string numberRanges = GetPhoneNumbersListing("0018155500-0018155510");
A complete solution inspired by the answer by #Dan-o:
Inputs:
Start: 48030
End: 48039
Digits: 6
Expected String Output:
048030, 048031, 048032, 048033, 048034, 048035, 048036, 048037, 048038, 048039
Program:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
public class Program
{
public static void Main()
{
int first = 48030;
int last = 48039;
int digits = 6;
Console.WriteLine(CreateRange(first, last, digits));
}
public static string CreateRange(int first, int last, int numDigits)
{
string separator = ", ";
var sb = new StringBuilder();
sb.Append(first.ToString().PadLeft(numDigits, '0'));
foreach (int num in Enumerable.Range(first + 1, last - first))
{
sb.Append(separator + num.ToString().PadLeft(numDigits, '0'));
}
return sb.ToString();
}
}
For Each item In Enumerable.Range(min, count).ToArray()
something = item.PadLeft(5, "0")
Next

SubString editing

I've tried a few different methods and none of them work correctly so I'm just looking for someone to straight out show me how to do it . I want my application to read in a file based on an OpenFileDialog.
When the file is read in I want to go through it and and run this function which uses Linq to insert the data into my DB.
objSqlCommands.sqlCommandInsertorUpdate
However I want to go through the string , counting the number of ","'s found . when the number reaches four I want to only take the characters encountered until the next "," and do this until the end of the file .. can someone show me how to do this ?
Based on the answers given here my code now looks like this
string fileText = File.ReadAllText(ofd.FileName).Replace(Environment.NewLine, ",");
int counter = 0;
int idx = 0;
List<string> foo = new List<string>();
foreach (char c in fileText.ToArray())
{
idx++;
if (c == ',')
{
counter++;
}
if (counter == 4)
{
string x = fileText.Substring(idx);
foo.Add(fileText.Substring(idx, x.IndexOf(',')));
counter = 0;
}
}
foreach (string s in foo)
{
objSqlCommands.sqlCommandInsertorUpdate("INSERT", s);//laClient[0]);
}
However I am getting an "length cannot be less than 0" error on the foo.add function call , any ideas ?
A Somewhat hacky example. You would pass this the entire text from your file as a single string.
string str = "1,2,3,4,i am some text,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20";
int counter = 0;
int idx = 0;
List<string> foo = new List<string>();
foreach (char c in str.ToArray())
{
idx++;
if (c == ',')
{
counter++;
}
if (counter == 4)
{
string x = str.Substring(idx);
foo.Add(str.Substring(idx, x.IndexOf(',')));
counter = 0;
}
}
foreach(string s in foo)
{
Console.WriteLine(s);
}
Console.Read();
Prints:
i am some text
9
13
17
As Raidri indicates in his answer, String.Split is definitely your friend. To catch every fifth word, you could try something like this (not tested):
string fileText = File.ReadAllText(OpenDialog.FileName).Replace(Environment.NewLine, ",");
string words[] = fileText.Split(',');
List<string> everFifthWord = new List<string>();
for (int i = 4; i <= words.Length - 1, i + 5)
{
everyFifthWord.Add(words[i]);
}
The above code reads the selected file from the OpenFileDialog, then replaces every newline with a ",". Then it splits the string on ",", and starting with the fifth word takes every fifth word in the string and adds it to the list.
File.ReadAllText reads a text file to a string and Split turns that string into an array seperated at the commas:
File.ReadAllText(OpenDialog.FileName).Split(',')[4]
If you have more than one line use:
File.ReadAllLines(OpenDialog.FileName).Select(l => l.Split(',')[4])
This gives an IEnumerable<string> where each string contains the wanted part from one line of the file
It's not clear to me if you're after every fifth piece of text between the commas or if there are multiple lines and you want only the fifth piece of text on each line. So I've done both.
Every fifth piece of text:
var text = "1,2,3,4,i am some text,6,7,8,9"
+ ",10,11,12,13,14,15,16,17,18,19,20";
var everyFifth =
text
.Split(',')
.Where((x, n) => n % 5 == 4);
Only the fifth piece of text on each line:
var lines = new []
{
"1,2,3,4,i am some text,6,7",
"8,9,10,11,12,13,14,15",
"16,17,18,19,20",
};
var fifthOnEachLine =
lines
.Select(x => x.Split(',')[4]);

Reading a series of numbers from a file in C#

I currently have a method that reads ints from a file and upon reaching 16 ints a regex is run against the string (register). Although the way the ints (or numbers) are read in from the file is causing me problems. It currently reads all of the numbers from a file and disregards any other chars such as letters. Hence the following would be collected:
Example.txt:
h568fj34fj390fjkkkkf385837692g1234567891011121314fh
Incorrect output:
5683439038583769
Correct output:
1234567891011121
But I only want to collect a string of consecutive numbers (if there are other chars like letters inbetween the numbers they would not be counted) and the method would such for a series of numbers (16 in total) with no other chars like letters inbetween.
My current logic is as follows:
var register = new StringBuilder();
using(var stream = File.Open(readme.txt, FileMode.Open))
{
bool execEnd = false;
bool fileEnded = false;
int buffer;
while(execEnd = true)
{
while(register.Length < 16 && !fileEnded)
{
buffer = stream.ReadByte();
if(buffer == -1)
{
fileEnded = true;
break;
}
var myChar = (char)buffer;
if(Char.IsNumber(myChar))
register.Append(myChar);
}
This should be your required Regex: \d{16}
So you could use LINQ to find all matches:
var allMatches =
System.IO.File.ReadLines(path)
.Select(l => System.Text.RegularExpressions.Regex.Match(l, #"\d{16}"))
.Where(m => m.Success)
.ToList();
foreach (var match in allMatches)
{
String number = match.Value;
}
string input = "h568fj34fj390fjkkkkf385837692g1234567891011121314fh";
string num = System.Text.RegularExpressions.Regex.Match(input, #"\d{16}").Value;

Getting parts of a string and combine them in C#?

I have a string like this: C:\Projects\test\whatever\files\media\10\00\00\80\test.jpg
Now, what I want to do is to dynamically combine the last 4 numbers, in this case its 10000080 as result. My idea was ti split this and combine them in some way, is there an easier way? I cant rely on the array index, because the path can be longer or shorter as well.
Is there a nice way to do that?
Thanks :)
A compact way using string.Join and Regex.Split.
string text = #"C:\Projects\test\whatever\files\media\10\00\00\80\test.jpg";
string newString = string.Join(null, Regex.Split(text, #"[^\d]")); //10000080
Use String.Split
String toSplit = "C:\Projects\test\whatever\files\media\10\00\00\80\test.jpg";
String[] parts = toSplit.Split(new String[] { #"\" });
String result = String.Empty;
for (int i = 5, i > 1; i--)
{
result += parts[parts.Length - i];
}
// Gives the result 10000080
You can rely on array index if the last part always is the filename.
since the last part is always
array_name[array_name.length - 1]
the 4 parts before that can be found by
array_name[array_name.length - 2]
array_name[array_name.length - 3]
etc
If you always want to combine the last four numbers, split the string (use \ as the separator), start counting from the last part and take 4 numbers, or the 4 almost last parts.
If you want to take all the digits, just scan the string from start to finish and copy just the digits to a new string.
string input = "C:\Projects\test\whatever\files\media\10\00\00\80\test.jpg";
string[] parts = toSplit.Split(new char[] {'\\'});
IEnumerable<string> reversed = parts.Reverse();
IEnumerable<string> selected = reversed.Skip(1).Take(4).Reverse();
string result = string.Concat(selected);
The idea is to extract the parts, reverse them to keep only the last 4 (excluding the file name) and re reversing to rollback to the initial order, then concat.
Using LINQ:
string path = #"C:\Projects\test\whatever\files\media\10\00\00\80\test.jpg";
var parts = Path.GetDirectoryName(path).Split('\\');
string numbersPart = parts.Skip(parts.Count() - 4)
.Aggregate((acc, next) => acc + next);
Result: "10000080"
var r = new Regex(#"[^\d+]");
var match = r
.Split(#"C:\Projects\test\whatever\files\media\10\00\00\80\test.jpg")
.Aggregate((i, j) => i + j);
return match.ToString();
to find the number you can use regex:
(([0-9]{2})\\){4}
use concat all inner Group ([0-9]{2}) to get your searched number.
This will always find your searched number in any position in the given string.
Sample Code:
static class TestClass {
static void Main(string[] args) {
string[] tests = { #"C:\Projects\test\whatever\files\media\10\00\00\80\test.jpg",
#"C:\Projects\test\whatever\files\media\10\00\00\80\some\foldertest.jpg",
#"C:\10\00\00\80\test.jpg",
#"C:\10\00\00\80\test.jpg"};
foreach (string test in tests) {
int number = ExtractNumber(test);
Console.WriteLine(number);
}
Console.ReadLine();
}
static int ExtractNumber(string path) {
Match match = Regex.Match(path, #"(([0-9]{2})\\){4}");
if (!match.Success) {
throw new Exception("The string does not contain the defined Number");
}
//get second group that is where the number is
Group #group = match.Groups[2];
//now concat all captures
StringBuilder builder = new StringBuilder();
foreach (var capture in #group.Captures) {
builder.Append(capture);
}
//pares it as string and off we go!
return int.Parse(builder.ToString());
}
}

Categories

Resources