Determine if string is newline in C#


I am somehow unable to determine whether a string is a newline or not. The string I use is read from a file written by UltraEdit using DOS terminators (CR/LF). I assume this would equate to "\r\n" or Environment.NewLine in C#. However, when I perform a comparison like this, it always seems to return false:
if(str==Environment.NewLine)
Anyone with a clue on what's going on here?

How are the lines read? If you're using StreamReader.ReadLine (or something similar), the newline characters will not appear in the resulting string; an empty line comes back as String.Empty (i.e. "").
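To illustrate the point about ReadLine, here is a minimal sketch. The class name NewLineDemo, the helper FindEmptyLines, and the sample text are made up for illustration, and a StringReader stands in for the file. It shows that a blank CR/LF line arrives as an empty string, so testing the length is enough:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class NewLineDemo
{
    // Returns the zero-based indexes of lines that ReadLine reports as empty.
    public static List<int> FindEmptyLines(TextReader reader)
    {
        var emptyLines = new List<int>();
        string line;
        int index = 0;
        while ((line = reader.ReadLine()) != null)
        {
            // ReadLine has already stripped "\r\n", so a blank line is "".
            if (line.Length == 0)
            {
                emptyLines.Add(index);
            }
            index++;
        }
        return emptyLines;
    }

    public static void Main()
    {
        // Hypothetical file content with CR/LF terminators and one blank line.
        using (var reader = new StringReader("first\r\n\r\nthird\r\n"))
        {
            foreach (int i in FindEmptyLines(reader))
            {
                Console.WriteLine("Line " + i + " is empty"); // Line 1 is empty
            }
        }
    }
}
```

Comparing such a line with Environment.NewLine can never succeed, because the terminator is no longer part of the string.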

Are you sure that the whole string only contains a NewLine and nothing more or less? Have you already tried str.Contains(Environment.NewLine)?

The most obvious troubleshooting step would be to check what the value of str actually is. Just view it in the debugger or print it out.

Newline is "\r\n", not "/r/n". Maybe there's more than just the newline.... what is the string value in Debug Mode?
You could use the new .NET 4.0 Method:
String.IsNullOrWhiteSpace

This is a very valid question.
Here is the answer. I have invented a kludge that takes care of it.
static bool StringIsNewLine(string s)
{
// 8203 is the code point of U+200B (zero-width space).
return (!string.IsNullOrEmpty(s)) &&
(!string.IsNullOrWhiteSpace(s)) &&
(((s.Length == 1) && (s[0] == 8203)) ||
((s.Length == 2) && (s[0] == 8203) && (s[1] == 8203)));
}
Use it like so:
foreach (var line in linesOfMyFile)
{
if (StringIsNewLine(line))
{
// Ignore reading new lines
continue;
}
// Do the stuff only for non-empty lines
...
}

Related

Code returning false on equal compare?

Problem I currently have:
My server returns data back to the client, and this includes a name. Now I want the client to grab this name and compare it. However, I have been stuck on this problem for the past 3 hours and I don't want to apply a cheap fix around it.
My server returns a value and then a name, ex: random23454#NAMEHERE
I split the value using:
string[] values = returndata.Split('#');
And then I am doing:
if (textBox3.Text == values[1]) {
MessageBox.Show("equal");
}
However, the problem is that I can't get them to compare as equal. I tried other methods, but it just doesn't display "equal".
What I have done:
Print textBox3.Text to a textbox and print values[1] to a other textbox and compared with my eye and mouse (Using invoke due to threading).
Used the .Trim() function
Using the .ToString() on values[1] (Just for the hell of it)
Assigned them both to a complete new string, trimmed them and compared them
Dragged the comparing outside the thread using:
this.Invoke((MethodInvoker)delegate()
{
outside(name);
});
and perform the same check.
My code:
string returndata = System.Text.Encoding.ASCII.GetString(inStream);
readData = "" + returndata;
if (readData.Contains("#") && readData.Contains("random"))
{
string[] values = returndata.Split('#');
string name = values[1].Trim();
if (textBox3.Text == name)
{
MessageBox.Show("true");
}
else
{
MessageBox.Show("false");
this.Invoke((MethodInvoker)delegate()
{
outside(name);
});
}
What else can I do? I just don't understand why they are not equal.
Thanks in advance.

The data you're getting back from the server could be an array of bytes. Try converting the response to a string first before splitting. Also try printing the response (or the response's type) to console to see what you get before going any further.
Also make sure the length of each string is the same. Maybe give utf-8 a try instead of ASCII? Like so:
System.Text.Encoding.UTF8.GetString(inStream);

string name = values[1].Trim();
values[1] is the right index here. String.Split does not return the separator as an element, so for "random23454#NAMEHERE" the array is { "random23454", "NAMEHERE" }, and values[2] would throw an IndexOutOfRangeException.
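One possible cause that the question's code does not rule out: if inStream is a fixed-size receive buffer that is larger than the message, decoding the whole buffer leaves trailing '\0' characters in the string, and Trim() does not remove them because '\0' is not white space. The sketch below is only an illustration under that assumption. The class name CompareDemo, the helper ExtractName, the 32-byte buffer, and the bytesRead parameter are all hypothetical:

```csharp
using System;
using System.Text;

public static class CompareDemo
{
    // Decodes only the bytes actually received, then returns the part after '#'.
    public static string ExtractName(byte[] buffer, int bytesRead)
    {
        string data = Encoding.ASCII.GetString(buffer, 0, bytesRead);
        string[] values = data.Split('#');
        return values.Length > 1 ? values[1].Trim() : string.Empty;
    }

    public static void Main()
    {
        // Hypothetical receive buffer: larger than the payload, rest is zero bytes.
        byte[] buffer = new byte[32];
        byte[] payload = Encoding.ASCII.GetBytes("random23454#NAMEHERE");
        Array.Copy(payload, buffer, payload.Length);

        // Decoding the whole buffer keeps the unused zero bytes as '\0' characters.
        string whole = Encoding.ASCII.GetString(buffer).Split('#')[1].Trim();
        Console.WriteLine(whole == "NAMEHERE"); // False

        // Decoding only the received bytes gives the expected name.
        string exact = ExtractName(buffer, payload.Length);
        Console.WriteLine(exact == "NAMEHERE"); // True
    }
}
```

If this is the cause, the two strings look identical in a text box but differ in Length, which matches the suggestion above to compare the lengths.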

Validating Format for Numbers and letters in specific position of a string

I have a string consisting of two letters followed by four digits, followed by two letters, such as xy1234xy. I need to be able to determine whether the user's entry meets this criterion.
For other projects I used Take and Skip, as I would rather not use Regex if there is an alternative.
I have used something like this in the past, but I am having issues with the middle section:
public static void MemberNumberInput(string checkNumberLetter)
{
IsValidInput = (checkNumberLetter.Take(2).All(char.IsLetter) &
(checkNumberLetter.Skip(2).All(char.IsDigit) &
(checkNumberLetter.Take(2).All(char.IsLetter) &
(checkNumberLetter.Trim().Length == 8))));
}
}
Thanks everyone:
Guy

How about like this?
string myString = "xy1234xy";
if(Regex.IsMatch(myString, "^[A-Za-z]{2}[0-9]{4}[A-Za-z]{2}$"))
{
// Do something
}

If you absolutely cannot use a regex, I'd think the most straightforward way is to just check them one-by-one:
return checkNumberLetter.Length == 8 &&
char.IsLetter(checkNumberLetter[0]) &&
char.IsLetter(checkNumberLetter[1]) &&
char.IsDigit(checkNumberLetter[2]) &&
char.IsDigit(checkNumberLetter[3]) &&
char.IsDigit(checkNumberLetter[4]) &&
char.IsDigit(checkNumberLetter[5]) &&
char.IsLetter(checkNumberLetter[6]) &&
char.IsLetter(checkNumberLetter[7]);
You could do this in a for loop or some set of LINQ queries, but my opinion is this is the simplest and most straightforward option.
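For comparison, the LINQ variant mentioned above could look like the following sketch, which keeps the asker's Take/Skip style but bounds the middle section with Skip(2).Take(4). The class name MemberNumberValidator and method name IsValid are made up for illustration. Note that char.IsLetter and char.IsDigit also accept non-ASCII letters and digits, unlike the [A-Za-z] and [0-9] classes in the regex answer:

```csharp
using System;
using System.Linq;

public static class MemberNumberValidator
{
    // True for two letters, four digits, two letters (for example "xy1234xy").
    public static bool IsValid(string input)
    {
        if (input == null || input.Length != 8)
        {
            return false;
        }
        return input.Take(2).All(char.IsLetter)          // positions 0-1
            && input.Skip(2).Take(4).All(char.IsDigit)   // positions 2-5
            && input.Skip(6).All(char.IsLetter);         // positions 6-7
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("xy1234xy")); // True
        Console.WriteLine(IsValid("xy12345y")); // False
        Console.WriteLine(IsValid("xy1234"));   // False
    }
}
```

Checking the length first matters, because All returns true for an empty sequence.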

Directory Exists with Path Combine vs String Concatenation

I am building a folder/file existence check, and a co-worker says it is "better" to use Path.Combine:
string finalPath = Path.Combine(folder, "file.txt");
as opposed to the way I was doing it with
string finalPath = folder + "\\file.txt";
Is there any sound reasoning for why this is "better"?

It's an interesting question;
You could, of course, write something like:
string finalPath = String.Format("{0}\\file.txt", folder);
To achieve the result you want.
Using ILSpy, though, let's see why Path.Combine is better.
The overload you are calling is:
public static string Combine(string path1, string path2)
{
if (path1 == null || path2 == null)
{
throw new ArgumentNullException((path1 == null) ? "path1" : "path2");
}
Path.CheckInvalidPathChars(path1, false);
Path.CheckInvalidPathChars(path2, false);
return Path.CombineNoChecks(path1, path2);
}
The advantages are obvious; firstly, the function checks for null values and throws the appropriate exception. Then it checks for illegal characters in either of the arguments, and throws an appropriate exception. Once it is satisfied, it calls Path.CombineNoChecks:
private static string CombineNoChecks(string path1, string path2)
{
if (path2.Length == 0)
{
return path1;
}
if (path1.Length == 0)
{
return path2;
}
if (Path.IsPathRooted(path2))
{
return path2;
}
char c = path1[path1.Length - 1];
if (c != Path.DirectorySeparatorChar && c != Path.AltDirectorySeparatorChar && c != Path.VolumeSeparatorChar)
{
return path1 + Path.DirectorySeparatorChar + path2;
}
return path1 + path2;
}
The most interesting thing here is the set of separator characters it supports (the values shown are those on Windows):
Path.DirectorySeparatorChar = '\\'
Path.AltDirectorySeparatorChar = '/'
Path.VolumeSeparatorChar = ':'
So it will also support paths where the separator is the wrong way around (for example, from a URI such as file://C:/blah).
In short, it's better because it gives you validation, a degree of portability (the 3 constants shown above can be defined on a per framework-OS basis), and has support for more than one type of path that you commonly encounter.

Try these two to see the difference. It can handle URIs and standard paths, so always use Path.Combine.
Console.WriteLine(Path.Combine(@"file:///c:/temp/", "x.xml"));
Output: file:///c:/temp/x.xml
Console.WriteLine(Path.Combine(@"C:\test", "x.xml"));
Output: C:\test\x.xml

Yes, it's more portable in the case that the file-path separator is different to \

First, you can use the verbatim notation @"\file.txt" instead of "\\file.txt".
Second, let .Net care about fixing the path. There is a reason we have it.
You can be 100% sure that you've done it correctly but if you start combining paths by hand everywhere in your code, there is always a chance to create bugs.
A simple example.
User enters a path and you want to create a subfolder named temp there. What will you do?
If no backslash at the end, add one, else do this, otherwise do the other... etc.
With Path.Combine() you don't have to do checking. You can concentrate on the actual logic of your application.
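As a small sketch of the subfolder example, the snippet below compares Path.Combine with manual concatenation. The folder names "data" and "temp" and the class name PathCombineDemo are placeholders, and Path.DirectorySeparatorChar is used so the comparisons hold on any platform:

```csharp
using System;
using System.IO;

public static class PathCombineDemo
{
    public static void Main()
    {
        char sep = Path.DirectorySeparatorChar;
        string withoutSeparator = "data";     // user typed no trailing separator
        string withSeparator = "data" + sep;  // user typed a trailing separator

        // Path.Combine yields the same result for both inputs.
        string a = Path.Combine(withoutSeparator, "temp");
        string b = Path.Combine(withSeparator, "temp");
        Console.WriteLine(a == b); // True

        // Manual concatenation doubles the separator for the second input.
        string manual = withSeparator + sep + "temp";
        Console.WriteLine(manual == a); // False

        // Caveat: a rooted second argument replaces the first one entirely.
        string rooted = Path.Combine("data", sep + "temp");
        Console.WriteLine(rooted == sep + "temp"); // True
    }
}
```

The last case matches the IsPathRooted branch in the decompiled CombineNoChecks shown earlier, and it is worth remembering when the second argument comes from user input.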

One really cool thing, in addition to the other comments, is the ability to combine many parts of the path you want to create.
For example:
Path.Combine(root, nextFolder, childfolder, file);
It accepts any number of parts because it takes an array of strings (params string[]), so it can build the full path in a single line.
Regards,

Does FileInfo.Extension return the last . pattern, or something else?

I'm curious what exactly the behavior is on the following:
FileInfo info = new FileInfo("C:/testfile.txt.gz");
string ext = info.Extension;
Will this return ".txt.gz" or ".gz"?
What is the behavior with even more extensions, such as ".txt.gz.zip" or something like that?
EDIT:
To be clear, I've already tested this. I would like an explanation of the property.

It will return .gz, but the explanation from MSDN (FileSystemInfo.Extension Property) isn't clear why:
"The Extension property returns the FileSystemInfo extension, including the period (.). For example, for a file c:\NewFile.txt, this property returns ".txt"."
So I looked up the code of the Extension property with reflector:
public string Extension
{
get
{
int length = this.FullPath.Length;
int startIndex = length;
while (--startIndex >= 0)
{
char ch = this.FullPath[startIndex];
if (ch == '.')
{
return this.FullPath.Substring(startIndex, length - startIndex);
}
if (((ch == Path.DirectorySeparatorChar) || (ch == Path.AltDirectorySeparatorChar)) || (ch == Path.VolumeSeparatorChar))
{
break;
}
}
return string.Empty;
}
}
It checks every character from the end of the file path until it finds a dot, then returns the substring from that dot to the end of the file path.

[TestCase(@"C:/testfile.txt.gz", ".gz")]
[TestCase(@"C:/testfile.txt.gz.zip", ".zip")]
[TestCase(@"C:/testfile.txt.gz.SO.jpg", ".jpg")]
public void TestName(string fileName, string expected)
{
FileInfo info = new FileInfo(fileName);
string actual = info.Extension;
Assert.AreEqual(expected, actual);
}
All pass.

It returns the extension from the last dot, because it can't guess whether another part of the filename is part of the extension. In the case of testfile.txt.gz, you could argue that the extension is .txt.gz, but what about System.Data.dll? Should the extension be .Data.dll? Probably not... There's no way to guess, so the Extension property doesn't try to.

The file extension starts at the last dot. Unfortunately, the documentation for FileSystemInfo.Extension doesn't answer that, but it logically must return the same value as Path.GetExtension, for which the documentation states:
Remarks
The extension of path is obtained by searching path for a period (.), starting with the last character in path and continuing toward the start of path. If a period is found before a DirectorySeparatorChar or AltDirectorySeparatorChar character, the returned string contains the period and the characters after it; otherwise, Empty is returned.
For a list of common I/O tasks, see Common I/O Tasks.
It would be nice if there were an authoritative answer on file names in general, but I'm having trouble finding one.
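The behaviour described in the remarks can be checked with a short sketch using Path.GetExtension directly, so no file needs to exist on disk. The file names are just sample strings and the class name ExtensionDemo is made up:

```csharp
using System;
using System.IO;

public static class ExtensionDemo
{
    public static void Main()
    {
        // Only the part from the last period onward counts as the extension.
        Console.WriteLine(Path.GetExtension("testfile.txt.gz"));  // .gz
        Console.WriteLine(Path.GetExtension("System.Data.dll"));  // .dll

        // No period in the file name: the result is an empty string.
        Console.WriteLine(Path.GetExtension("README") == string.Empty); // True

        // The remaining dots stay in the file name.
        Console.WriteLine(Path.GetFileNameWithoutExtension("testfile.txt.gz")); // testfile.txt
    }
}
```

If a compound extension such as ".txt.gz" is needed, it has to be derived by the caller, for example by calling GetExtension again on the result of GetFileNameWithoutExtension.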

Parsing CSV File enclosed with quotes in C#

I've seen lots of samples of parsing CSV files, but this one is an annoying kind of file...
So how do you parse this kind of CSV?
"1",1/2/2010,"The sample ("adasdad") asdada","I was pooping in the door "Stinky", so I'll be damn","AK"

The best answer in most cases is probably @Jim Mischel's. TextFieldParser seems to be exactly what you want for most conventional cases -- though it strangely lives in the Microsoft.VisualBasic namespace! But this case isn't conventional.
The last time I ran into a variation on this issue where I needed something unconventional, I embarrassingly gave up on regexp'ing and bullheaded a char by char check. Sometimes, that's not-wrong enough to do. Splitting a string isn't as difficult a problem if you byte push.
So I rewrote for this case as a string extension. I think this is close.
Do note that, "I was pooping in the door "Stinky", so I'll be damn", is an especially nasty case. Without the *** STINKY CONDITION *** code, below, you'd get I was pooping in the door "Stinky as one value and so I'll be damn" as the other.
The only way to do better than that for any anonymous weird splitter/escape case would be to have some sort of algorithm to determine the "usual" number of columns in each row, and then check for, in this case, fixed length fields like your AK state entry or some other possible landmark as a sort of normalizing backstop for nonconformist columns. But that's serious crazy logic that likely isn't called for, as much fun as it'd be to code. As @Vash points out, you're better off following some standard and coding a little more OFfensively.
But the problem here is probably easier than that. The only lexically meaningful case is the one in your example -- ", -- double quote, comma, and then a space. So that's what the *** STINKY CONDITION *** code checks. Even so, this code is getting nastier than I'd like, which means you have ever stranger edge cases, like "This is also stinky," a f a b","Now what?" Heck, even "A,"B","C" doesn't work in this code right now, iirc, since I treat the begin and end chars as having been escape pre- and post-fixed. So we're largely back to @Vash's comment!
Apologies for all the brackets for one-line if statements, but I'm stuck in a StyleCop world right now. I'm not necessarily suggesting you use this -- that strictEscapeToSplitEvaluation plus the STINKY CONDITION makes this a little complex. But it's worth keeping in mind that a normal csv parser that's intelligent about quotes is significantly more straightforward to the point of being tedious, but otherwise trivial.
namespace YourFavoriteNamespace
{
using System;
using System.Collections.Generic;
using System.Text;
public static class Extensions
{
public static Queue<string> SplitSeeingQuotes(this string valToSplit, char splittingChar = ',', char escapeChar = '"',
bool strictEscapeToSplitEvaluation = true, bool captureEndingNull = false)
{
Queue<string> qReturn = new Queue<string>();
StringBuilder stringBuilder = new StringBuilder();
bool bInEscapeVal = false;
for (int i = 0; i < valToSplit.Length; i++)
{
if (!bInEscapeVal)
{
// Escape values must come immediately after a split.
// abc,"b,ca",cab has an escaped comma.
// abc,b"ca,c"ab does not.
if (escapeChar == valToSplit[i] && (!strictEscapeToSplitEvaluation || (i == 0 || (i != 0 && splittingChar == valToSplit[i - 1]))))
{
bInEscapeVal = true; // not capturing escapeChar as part of value; easy enough to change if need be.
}
else if (splittingChar == valToSplit[i])
{
qReturn.Enqueue(stringBuilder.ToString());
stringBuilder = new StringBuilder();
}
else
{
stringBuilder.Append(valToSplit[i]);
}
}
else
{
// Can't use switch b/c we're comparing to a variable, I believe.
if (escapeChar == valToSplit[i])
{
// Repeated escape always reduces to one escape char in this logic.
// So if you wanted "I'm ""double quote"" crazy!" to come out with
// the double double quotes, you're toast.
if (i + 1 < valToSplit.Length && escapeChar == valToSplit[i + 1])
{
i++;
stringBuilder.Append(escapeChar);
}
else if (!strictEscapeToSplitEvaluation)
{
bInEscapeVal = false;
}
// *** STINKY CONDITION ***
// Kinda defense, since only `", ` really makes sense.
else if ('"' == escapeChar && i + 2 < valToSplit.Length &&
valToSplit[i + 1] == ',' && valToSplit[i + 2] == ' ')
{
i = i+2;
stringBuilder.Append("\", ");
}
// *** EO STINKY CONDITION ***
else if (i+1 == valToSplit.Length || (i + 1 < valToSplit.Length && valToSplit[i + 1] == splittingChar))
{
bInEscapeVal = false;
}
else
{
stringBuilder.Append(escapeChar);
}
}
else
{
stringBuilder.Append(valToSplit[i]);
}
}
}
// NOTE: The `captureEndingNull` flag is not tested.
// Catch null final entry? "abc,cab,bca," could be four entries, with the last an empty string.
if ((captureEndingNull && splittingChar == valToSplit[valToSplit.Length-1]) || (stringBuilder.Length > 0))
{
qReturn.Enqueue(stringBuilder.ToString());
}
return qReturn;
}
}
}
Probably worth mentioning that the "answer" you gave yourself doesn't have the "Stinky" problem in its sample string. ;^)
[Understanding that we're three years after you asked,] I will say that your example isn't as insane as folks here make out. I can see wanting to treat escape characters (in this case, ") as escape characters only when they're the first value after the splitting character or, after finding an opening escape, stopping only if you find the escape character before a splitter; in this case, the splitter is obviously ,.
If the row of your csv is abc,bc"a,ca"b, I would expect that to mean we've got three values: abc, bc"a, and ca"b.
Same deal in your "The sample ("adasdad") asdada" column -- quotes that don't begin and end a cell value aren't escape characters and don't necessarily need doubling to maintain meaning. So I added a strictEscapeToSplitEvaluation flag here.
Enjoy. ;^)

I very strongly recommend using TextFieldParser. Hand-coded parsers that use String.Split or regular expressions almost invariably mishandle things like quoted fields that have embedded quotes or embedded separators.
I would be surprised, though, if it handled your particular example. As others have said, that line is, at best, ambiguous.
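For reference, a minimal TextFieldParser sketch might look like this. It assumes the Microsoft.VisualBasic assembly is referenced, and it parses a well-formed variant of the line (inner quotes doubled), not the ambiguous original. The class name CsvDemo and helper ParseLine are made up for illustration:

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvDemo
{
    // Splits one CSV line into fields, honouring quoted fields.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }

    public static void Main()
    {
        // Well-formed input: quotes inside a quoted field are doubled.
        string line = "\"1\",1/2/2010,\"The sample (\"\"adasdad\"\") asdada\",\"AK\"";
        foreach (string field in ParseLine(line))
        {
            Console.WriteLine(field);
        }
    }
}
```

TextFieldParser throws MalformedLineException when it cannot parse a line, which is one way to collect the rows that need manual repair.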

Split based on
",
I would use MyString.IndexOf("\",")
and then substring the parts. Other than that, I'm sure someone has written a CSV parser out there that can handle this :)

I found a way to parse this malformed CSV. I looked for a pattern and found it.... I first replace (",") with a character... like "¤" and then split it...
from this:
"Annoying","CSV File","poop@mypants.com",1999,01-20-2001,"oh,boy",01-20-2001,"yeah baby","yeah!"
to this:
"Annoying¤CSV File¤poop@mypants.com",1999,01-20-2001,"oh,boy",01-20-2001,"yeah baby¤yeah!"
then split it:
ArrayA[0]: "Annoying //this value will be trimmed by replace("\"","") same as the array[4]
ArrayA[1]: CSV File
ArrayA[2]: poop@mypants.com",1999,01-20-2001,"oh,boy",01-20-2001,"yeah baby
ArrayA[3]: yeah!"
after splitting it, I will replace strings from ArrayA[2] ", and ," with ¤ and then split it again
from this
ArrayA[2]: poop@mypants.com",1999,01-20-2001,"oh,boy",01-20-2001,"yeah baby
to this
ArrayA[2]: poop@mypants.com¤1999,01-20-2001¤oh,boy¤01-20-2001¤yeah baby
then split it again and would turn to this
ArrayB[0]: poop@mypants.com
ArrayB[1]: 1999,01-20-2001
ArrayB[2]: oh,boy
ArrayB[3]: 01-20-2001
ArrayB[4]: yeah baby
and lastly... I'll split the Year only and the date from ArrayB[1] with , to ArrayC
It's tedious but there's no other way to do it...

Another open source library, Cinchoo ETL, handles quoted strings fine. Here is some sample code.
string csv = @"""1"",1/2/2010,""The sample(""adasdad"") asdada"",""I was pooping in the door ""Stinky"", so I'll be damn"",""AK""";
using (var r = ChoCSVReader.LoadText(csv)
.QuoteAllFields()
)
{
foreach (var rec in r)
Console.WriteLine(rec.Dump());
}
Output:
[Count: 5]
Key: Column1 [Type: Int64]
Value: 1
Key: Column2 [Type: DateTime]
Value: 1/2/2010 12:00:00 AM
Key: Column3 [Type: String]
Value: The sample(adasdad) asdada
Key: Column4 [Type: String]
Value: I was pooping in the door Stinky, so I'll be damn
Key: Column5 [Type: String]
Value: AK

You could split the string by ",". It is recommended that each cell value in the CSV file be enclosed in quotes, like "1","2","3".....

I don't see how you could if each line is different. This line is malformed for CSV. Quotes contained within a value must be doubled as shown below. I can't even tell for sure where the values should be terminated.
"1",1/2/2010,"The sample (""adasdad"") asdada","I was pooping in the door ""Stinky"", so I'll be damn","AK"
Here's my code to parse a CSV file but I don't see how any code would know how to handle your line because it's malformed.

You might want to give CsvReader a try. It will handle quoted strings fine, so you will just have to remove the leading and trailing quotes.
It will fail if your strings contain a comma. To avoid this, the quotes need to be doubled, as said in other answers.

As no (decent) .csv parser can parse non-csv-data correctly, the task isn't to parse the data, but to fix the file(s) (and then to parse the correct data).
To fix the data you need a list of bad rows (to be sent to the person responsible for the garbage for manual editing). To get such a list, you can
use Access with a correct import specification to import the file. You'll get a list of import failures.
write a script/program that opens the file via the OLEDB text driver.
Sample file:
"Id","Remark","DateDue"
1,"This is good",20110413
2,"This is ""good""",20110414
3,"This is ""good"","bad",and "ugly",,20110415
4,"This is ""good""" again,20110415
Sample SQL/Result:
SELECT * FROM [badcsv01.csv]
Id  Remark                DateDue
1   This is good          4/13/2011
2   This is "good"        4/14/2011
3   This is "good",       NULL
4   This is "good" again  4/15/2011
SELECT * FROM [badcsv01.csv] WHERE DateDue Is Null
Id  Remark                DateDue
3   This is "good",       NULL

First, do it for the column names:
DataTable pbResults = new DataTable();
OracleDataAdapter oda = new OracleDataAdapter(cmd);
oda.Fill(pbResults);
StringBuilder sb1 = new StringBuilder();
StringBuilder sb2 = new StringBuilder();
IEnumerable<string> columnNames = pbResults.Columns.Cast<DataColumn>().Select(column => column.ColumnName);
sb1.Append(string.Join("\"" + "," + "\"", columnNames));
sb2.Append("\"");
sb2.Append(sb1);
sb2.AppendLine("\"");
Second, do it for each row:
foreach (DataRow row in pbResults.Rows)
{
IEnumerable<string> fields = row.ItemArray.Select(field => field.ToString());
sb2.Append("\"");
sb2.Append(string.Join("\"" + "," + "\"", fields));
sb2.AppendLine("\"");
}
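The snippets above wrap every field in quotes but do not double quotes that occur inside a value, which is the rule an earlier answer points out. A minimal sketch of that escaping step follows. The class name CsvWriterDemo and the helpers QuoteField and ToCsvLine are hypothetical, not part of the answer's code:

```csharp
using System;
using System.Linq;

public static class CsvWriterDemo
{
    // Wraps a field in quotes and doubles any quotes inside it.
    public static string QuoteField(string field)
    {
        return "\"" + (field ?? string.Empty).Replace("\"", "\"\"") + "\"";
    }

    // Joins already-escaped fields with commas to form one CSV line.
    public static string ToCsvLine(params string[] fields)
    {
        return string.Join(",", fields.Select(f => QuoteField(f)));
    }

    public static void Main()
    {
        Console.WriteLine(ToCsvLine("1", "The sample (\"adasdad\") asdada", "AK"));
        // "1","The sample (""adasdad"") asdada","AK"
    }
}
```

Writing files this way avoids producing the kind of ambiguous line the original question is about.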

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.
