I'm trying to read from a text file and store each field in an array, but when I try to convert accountNumber to an int, I get an error.
public bool matchCustomer(int accountID){
string[] data = null;
string line = Global.currentFile.reader.ReadLine();
while (line != null)
{
data = line.Split('*');
this.accountNumber = Convert.ToInt32(data[0]);
line = Global.currentFile.reader.ReadLine();
if (accountID == this.accountNumber)
{
return true;
}
}
return false;
}
That's because data[0] isn't convertible into an int. What is data[0] at runtime?
You could use:
int value;
if(Int32.TryParse(data[0], out value))
{
accountNumber = value;
}
else
{
//Something when data[0] can't be turned into an int.
//You'll have to decide this logic.
}
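Check what Split actually leaves you with. Splitting on the * delimiter in a string like this:
12345 * Shrek * 1209 * 100,000 * 50,000
gives you padded fields such as "12345 " rather than "12345". Convert.ToInt32 normally tolerates leading and trailing white space, so the padding alone should not throw, but trimming is a cheap safeguard: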
this.accountNumber = Convert.ToInt32(data[0].Trim());
Also beware of values with a thousands separator (50,000 and 100,000). Convert.ToInt32 rejects the comma, so strip it before converting:
data[4].Replace(",","").Trim();
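As an alternative to stripping the comma by hand, here is a minimal sketch that lets the parser accept both the padding and the thousands separator. The class and method names are made up for illustration, and it assumes the file always uses "," as the group separator (hence the invariant culture).

```csharp
using System;
using System.Globalization;

public static class FieldParsing
{
    // Hypothetical helper: parses one '*'-delimited field such as " 100,000 ".
    public static bool TryParseField(string field, out int value)
    {
        // AllowThousands accepts the group separator; the white-space flags
        // accept the padding left over from Split('*').
        const NumberStyles styles = NumberStyles.AllowThousands
                                  | NumberStyles.AllowLeadingWhite
                                  | NumberStyles.AllowTrailingWhite
                                  | NumberStyles.AllowLeadingSign;

        // Returns false (and sets value to 0) for text such as " Shrek ".
        return int.TryParse(field, styles, CultureInfo.InvariantCulture, out value);
    }
}
```

You would call it as `FieldParsing.TryParseField(data[3], out amount)` and decide for yourself what to do when it returns false.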
The other two answers address the issue and the fix; here is an alternative that uses LINQ.
You can replace the entire content of the while block with this:
return line.Split('*').Select(s => s.Trim().Replace(",", ""))
    .Where(c => Regex.IsMatch(c, @"^\d+$"))
    .Select(s => int.Parse(s))
    .Any(e => e == accountID);
I am attempting to create a DataTable from an Excel spreadsheet using OpenXML. When I read a row's cell value with Cell.CellValue.InnerXml, the value returned for a monetary amount is not the value the user entered and sees on the spreadsheet.
The spreadsheet cell is formatted as Text and the cell value is 570.81. When obtaining the data in OpenXML the value is interpreted as 570.80999999999995.
This method is used for many different excel imports where the data type for a cell by header or column index is not known when building the table.
I've seen a few posts about the Ecma Office Open XML File Formats Standard and mention of numFmtId. Could this be of value?
I assume that since the data type is text and the number has two decimal places that there must be some assumption that the cell has been rounded (even though no formula exists).
I am hopeful someone can offer a solution for properly interpreting the data.
Below is the GetCellValue method:
private static string GetCellValue(SharedStringTablePart stringTablePart, DocumentFormat.OpenXml.Spreadsheet.Cell cell,DocumentFormat.OpenXml.Spreadsheet.Stylesheet styleSheet)
{
string value = cell.CellValue.InnerXml;
if (cell.DataType != null && cell.DataType.Value == DocumentFormat.OpenXml.Spreadsheet.CellValues.SharedString)
{
return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
}
else
{
if (cell.StyleIndex != null)
{
DocumentFormat.OpenXml.Spreadsheet.CellFormat cellFormat = (DocumentFormat.OpenXml.Spreadsheet.CellFormat)styleSheet.CellFormats.ChildElements[(int)cell.StyleIndex.Value];
int formatId = (int)cellFormat.NumberFormatId.Value;
if (formatId == 14) //built-in short date format (mm-dd-yy)
{
DateTime newDate = DateTime.FromOADate(double.Parse(value));
value = newDate.Date.ToString(CultureInfo.InvariantCulture);
}
}
return value;
}
}
As you point out in your question, the format is stored separately from the cell value using number formats in the stylesheet.
You should be able to extend the code you have for formatting dates to include formatting for numbers. Essentially you need to grab the NumberingFormat that corresponds to the cellFormat.NumberFormatId.Value you are already reading. The NumberingFormat can be found in the styleSheet.NumberingFormats elements.
Once you have this you can access the FormatCode property of the NumberingFormat which you can then use to format your data as you see fit.
Unfortunately the format is not quite that straightforward to use. Firstly, according to MSDN here not all formats are written to the file so I guess you will have to have those somewhere accessible and load them depending on the NumberFormatId you have.
Secondly, the Excel format string is not compatible with C# format strings, so you'll need to do some manipulation. Details of the format layout can be found on MSDN here.
I have knocked together some sample code that handles the currency situation you have in your question but you may need to give some more thought to the parsing of the excel format string into a C# one.
private static string GetCellValue(SharedStringTablePart stringTablePart, DocumentFormat.OpenXml.Spreadsheet.Cell cell, DocumentFormat.OpenXml.Spreadsheet.Stylesheet styleSheet)
{
string value = cell.CellValue.InnerXml;
if (cell.DataType != null && cell.DataType.Value == DocumentFormat.OpenXml.Spreadsheet.CellValues.SharedString)
{
return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
}
else
{
if (cell.StyleIndex != null)
{
DocumentFormat.OpenXml.Spreadsheet.CellFormat cellFormat = (DocumentFormat.OpenXml.Spreadsheet.CellFormat)styleSheet.CellFormats.ChildElements[(int)cell.StyleIndex.Value];
int formatId = (int)cellFormat.NumberFormatId.Value;
if (formatId == 14) //built-in short date format (mm-dd-yy)
{
DateTime newDate = DateTime.FromOADate(double.Parse(value));
value = newDate.Date.ToString(CultureInfo.InvariantCulture);
}
else
{
//find the number format
NumberingFormat format = styleSheet.NumberingFormats.Elements<NumberingFormat>()
.FirstOrDefault(n => n.NumberFormatId == formatId);
double temp;
if (format != null
&& format.FormatCode.HasValue
&& double.TryParse(value, out temp))
{
//we have a format and a value that can be represented as a double
string actualFormat = GetActualFormat(format.FormatCode, temp);
value = temp.ToString(actualFormat);
}
}
}
return value;
}
}
private static string GetActualFormat(StringValue formatCode, double value)
{
//the format is up to 4 formats split by a semi-colon
//0 for positive, 1 for negative, 2 for zero (I'm ignoring the 4th format which is for text)
string[] formatComponents = formatCode.Value.Split(';');
int elementToUse = value > 0 ? 0 : (value < 0 ? 1 : 2);
//not every format defines all of the sections, so fall back to the first one
if (elementToUse >= formatComponents.Length)
{
elementToUse = 0;
}
string actualFormat = formatComponents[elementToUse];
actualFormat = RemoveUnwantedCharacters(actualFormat, '_');
actualFormat = RemoveUnwantedCharacters(actualFormat, '*');
//backslashes are an escape character it seems - I'm ignoring them
return actualFormat.Replace("\"", "");
}
private static string RemoveUnwantedCharacters(string excelFormat, char character)
{
/* The _ and * characters are used to control lining up of characters
they are followed by the character being manipulated so I'm ignoring
both the _ and * and the character immediately following them.
Note that this is buggy as I don't check for the preceding
backslash escape character which I probably should
*/
int index = excelFormat.IndexOf(character);
int occurance = 0;
while (index != -1)
{
//replace the occurance at index using substring
excelFormat = excelFormat.Substring(0, index) + excelFormat.Substring(index + 2);
occurance++;
index = excelFormat.IndexOf(character, index);
}
return excelFormat;
}
Given a sheet with the value 570.80999999999995 formatted using currency (in the UK) the output I get is £570.81.
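Two of the points above can be sketched in a few lines. This is only an illustration under stated assumptions: the class and method names are hypothetical, the table holds just a handful of the built-in numFmtId values listed in ECMA-376 (the ones Excel does not write to the file), and NormalizeNumber assumes the cell really holds a number stored as a double.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class ExcelNumberHelpers
{
    // A few built-in number formats from ECMA-376; deliberately incomplete.
    private static readonly Dictionary<uint, string> BuiltInFormats =
        new Dictionary<uint, string>
        {
            { 1, "0" }, { 2, "0.00" }, { 3, "#,##0" }, { 4, "#,##0.00" },
            { 9, "0%" }, { 10, "0.00%" }, { 14, "mm-dd-yy" }, { 49, "@" }
        };

    // Returns the built-in format code, or null when the id is not in the table
    // (for example a custom format that lives in styleSheet.NumberingFormats).
    public static string GetBuiltInFormatCode(uint numberFormatId)
    {
        string code;
        return BuiltInFormats.TryGetValue(numberFormatId, out code) ? code : null;
    }

    // Re-parses the stored text as a double and writes it back out, which drops
    // the 17-digit noise Excel stores for values such as 570.81.
    // Anything that is not a number is returned unchanged.
    public static string NormalizeNumber(string rawValue)
    {
        double parsed;
        if (double.TryParse(rawValue, NumberStyles.Float, CultureInfo.InvariantCulture, out parsed))
        {
            return parsed.ToString(CultureInfo.InvariantCulture);
        }
        return rawValue;
    }
}
```

When no format information is available at all, passing the raw cell text through NormalizeNumber should turn 570.80999999999995 back into 570.81. Apply it only to cells you know are numeric, because a text value such as "007" would come back as "7".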
How can I format a string with a custom format, like this:
int value = 5000;
String.Format("{0:## ###}", value);
value.ToString("##");
but with value as string, without using conversion to number?
something like this:
String.Format("{0:## ###}", "5000");
UPDATE:
I'm trying to create a generic function:
public string FormatString(string value, string format = "") {
if (value == null){
return "";
}
return String.Format("{0:" + format + "}", value);
}
public bool OtherFunction(id){
var data = dc.GetData(id);
ViewBag.DescriptionText = FormatString(data.Description).Replace("\n", "<br />");
ViewBag.Phone = FormatString(data.Phone, "(##) ####-#####");
ViewBag.City= FormatString(data.City);
[...]
}
I don't think anything like this exists. As Jon said, this was designed for numbers.
If you just want to fill in a # mask, you could write a simple function, something like this:
public string FormatString(string value, string format = "")
{
if (String.IsNullOrEmpty(value) || String.IsNullOrEmpty(format))
return value;
var newValue = new StringBuilder(format);
for (int i = 0; i < newValue.Length; i++)
{
if (newValue[i] == '#')
if (value.Length > 0)
{
newValue[i] = value[0];
value = value.Substring(1);
}
else
{
newValue[i] = '0';
}
}
return newValue.ToString();
}
Of course this is a very simple one. You will have to decide what to do when the format is too long (here: fill with '0') and when the format is too short (here: just truncate the rest of the value).
But I think you get the idea.
Somewhere on my disk I have code for something like this: formatting numbers with special patterns for invoice numbers. If I find it, I'll write a blog post and paste the link.
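For comparison, here is a sketch of the other choice for both edge cases: stop at the first # that has no character left instead of padding with '0', and append leftover characters instead of truncating them. The names MaskFormatter and ApplyMask are made up for this example; it is one possible policy, not the only sensible one.

```csharp
using System.Text;

public static class MaskFormatter
{
    // Hypothetical variant of FormatString: each '#' in the mask consumes
    // one character of value, every other mask character is copied as-is.
    public static string ApplyMask(string value, string mask)
    {
        if (string.IsNullOrEmpty(value) || string.IsNullOrEmpty(mask))
        {
            return value;
        }

        var result = new StringBuilder(mask.Length);
        int next = 0; // index of the next unused character in value

        foreach (char m in mask)
        {
            if (m != '#')
            {
                result.Append(m); // literal character from the mask
            }
            else if (next < value.Length)
            {
                result.Append(value[next++]);
            }
            else
            {
                break; // value exhausted: drop the rest of the mask
            }
        }

        // Mask too short: keep the leftover characters instead of truncating.
        if (next < value.Length)
        {
            result.Append(value, next, value.Length - next);
        }
        return result.ToString();
    }
}
```

With this policy, `MaskFormatter.ApplyMask("1155512345", "(##) ####-####")` gives "(11) 5551-2345", and a value that is longer than the mask keeps its extra characters at the end.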
"5000" is a string. The only overload available for string.ToString() is the one with an IFormatProvider [1]. While you could actually implement that, you'll probably end up in something similar to int.Parse() which you don't like.
[1] http://msdn.microsoft.com/de-de/library/29dxe1x2(v=vs.110).aspx
I need some sort of conversion/mapping that, for example, is done by CLCL clipboard manager.
What it does is like that:
I copy the following Unicode text: ūī
And CLCL converts it to: ui
Is there any technique to do such a conversion? Or maybe there are mapping tables that can be used to convert, let's say, symbol ū is mapped to u.
UPDATE
Thanks to all for the help. Here is what I came up with (a hybrid of two solutions), one posted by Erik Schierboom and one taken from http://blogs.infosupport.com/normalizing-unicode-strings-in-c/#comment-8984
public static string ConvertUnicodeToAscii(string unicodeStr, bool skipNonConvertibleChars = false)
{
if (string.IsNullOrWhiteSpace(unicodeStr))
{
return unicodeStr;
}
var normalizedStr = unicodeStr.Normalize(NormalizationForm.FormD);
if (skipNonConvertibleChars)
{
return new string(normalizedStr.ToCharArray().Where(c => (int) c <= 127).ToArray());
}
return new string(
normalizedStr.Where(
c =>
{
UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(c);
return category != UnicodeCategory.NonSpacingMark;
}).ToArray());
}
I have used the following code for some time:
private static string NormalizeDiacriticalCharacters(string value)
{
if (value == null)
{
throw new ArgumentNullException("value");
}
var normalised = value.Normalize(NormalizationForm.FormD).ToCharArray();
return new string(normalised.Where(c => (int)c <= 127).ToArray());
}
In general, it is not possible to convert Unicode to ASCII, because ASCII is only a subset of Unicode.
That being said, it is possible to convert characters within the ASCII subset of Unicode to ASCII.
In C#, generally there's no need to do the conversion, since all strings are Unicode by default anyway, and all components are Unicode-aware, but if you must do the conversion, use the following:
string myString = "SomeString";
byte[] asciiString = System.Text.Encoding.ASCII.GetBytes(myString);
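To make the difference between the approaches in this thread concrete, here is a small sketch that puts them side by side. The class and method names are invented for the example; the calls themselves (Normalize, CharUnicodeInfo.GetUnicodeCategory, Encoding.ASCII) are the same ones used in the answers above.

```csharp
using System.Globalization;
using System.Linq;
using System.Text;

public static class AsciiFolding
{
    // Decomposes each character and drops the combining marks,
    // so a letter with a macron or accent becomes its base letter.
    public static string RemoveDiacritics(string text)
    {
        // FormD splits a letter such as U+016B into 'u' plus a combining macron.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var kept = decomposed.Where(c =>
            CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);
        return new string(kept.ToArray());
    }

    // Plain ASCII encoding: every character outside ASCII is replaced by '?'.
    public static string ToAsciiLossy(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(bytes);
    }
}
```

Encoding the question's text directly loses the letters (each one becomes '?'), while removing the diacritics first keeps "ui". Characters that have no decomposition, such as ø or ß, still fall outside ASCII after RemoveDiacritics.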
Edit: the order might change. As you can see in the example below, both strings contain the same names but in a different order.
How would you go about checking whether both string arrays match?
The code below returns true, but in reality it should return false, since _check contains an extra string.
What I am trying to achieve is to check whether both string arrays have the same number of strings.
string _exists = "Adults,Men,Women,Boys";
string _check = "Men,Women,Boys,Adults,fail";
if (_exists.All(s => _check.Contains(s))) //tried Equal
{
return true;
}
else
{
return false;
}
string _exists = "Adults,Men,Women,Boys";
string _check = "Men,Women,Boys,Adults,fail";
bool b = _exists.Split(',').OrderBy(s=>s)
.SequenceEqual(_check.Split(',').OrderBy(s=>s));
Those are not array of strings, but two strings.
So, you actually need to split them into substrings before checking for the content equality.
You can do it this way:
string _exists = "Adults,Men,Women,Boys";
string _check = "Men,Women,Boys,Adults,fail";
var checks = _check.Split(',');
var exists = _exists.Split(',');
bool stringsEqual = checks.OrderBy(x => x).SequenceEqual(exists.OrderBy(x => x));
To speed up a bit some special cases, you could check for length before calling the LINQ code (avoiding the two OrderBy's in case of different lengths). Furthermore, to save memory, you could use in-place sort on the splits arrays, i.e. :
string _exists = "Adults,Men,Women,Boys";
string _check = "Men,Women,Boys,Adults,fail";
var checks = _check.Split(',');
var exists = _exists.Split(',');
if(checks.Length != exists.Length)
return false;
Array.Sort(checks);
Array.Sort(exists);
if (checks.SequenceEqual(exists))
return true;
return false;
Obviously these optimizations are useful only if your strings are really long, otherwise you can simply go with the LINQ one-liner.
try
return (_exists.Length == _check.Length);
That will check whether the arrays are the same length (assuming _exists and _check have already been split into arrays), but not whether they contain the same values.
If you want to know whether the arrays are exactly the same, do the above first, then sort both arrays into A-Z order and compare each element.
NOTE: This is unnecessary...
if (_exists.All(s => _check.Contains(s))) //tried Equal
{
return true;
}
else
{
return false;
}
...you can do this, and it's more elegant...
return (_exists.All(s => _check.Contains(s)));
If you want to see if the number of substrings separated by a comma is the same, then use this.
public bool StringsHaveSameNumberOfSubstring(string _exists, string _check)
{
return (_exists.Split(',').Length == _check.Split(',').Length);
}
This is what I understood from your question.
Split the strings to make two list, and later compare them using Linq to Objects
string _exists = "Adults,Men,Women,Boys";
string _check = "Men,Women,Boys,Adults,fail";
List<string> exists = new List<string>(_exists.Split(new char[] { ',' }));
List<string> check = new List<string>(_check.Split(new char[] { ',' }));
foreach(string toCheck in check){
if(exists.Contains(toCheck)){
//do things
}
}
If you just want to count strings try:
bool sameAmountOfStrings = _exists.Count(c => c.Equals(',')) == _check.Count(c => c.Equals(','));
First of all, you need to split the strings to get arrays, trim the items and sort them:
var ary1 = _exists.Split(',').Select(x => x.Trim()).OrderBy(x => x);
var ary2 = _check.Split(',').Select(x => x.Trim()).OrderBy(x => x);
Now you can use SequenceEqual to compare the enumerables:
var result = ary1.SequenceEqual(ary2);
SequenceEqual compares both position and value, so if you want to detect positional changes as well, remove the OrderBy.
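If the order never matters and duplicates can be ignored, a set comparison is another option. This is a sketch under those two assumptions, with hypothetical names; if "Men,Men" must differ from "Men", stick with the sorted SequenceEqual approach.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListComparison
{
    // True when both comma-separated lists contain the same items,
    // regardless of order. Duplicates are ignored because sets are used.
    public static bool SameItems(string first, string second)
    {
        var items = new HashSet<string>(SplitAndTrim(first));
        return items.SetEquals(SplitAndTrim(second));
    }

    private static IEnumerable<string> SplitAndTrim(string list)
    {
        return list.Split(',').Select(item => item.Trim());
    }
}
```

For the strings in the question, `ListComparison.SameItems(_exists, _check)` returns false because "fail" only appears in _check.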
Is there an easy way to convert a string from csv format into a string[] or list?
I can guarantee that there are no commas in the data.
String.Split is just not going to cut it, but a Regex.Split may - Try this one:
using System.Text.RegularExpressions;
string[] line;
line = Regex.Split( input, ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
Where 'input' is the csv line. This will handle quoted delimiters, and should give you back an array of strings representing each field in the line.
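Here is a short usage sketch for that pattern. Note that the split leaves the surrounding quotes on quoted fields, so this example strips them afterwards; the class name is made up, and escaped quotes inside a field (written as "") are not handled.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class CsvRegexSplit
{
    // Splits on commas that are followed by an even number of double quotes,
    // i.e. commas that are not inside a quoted field.
    private static readonly Regex Separator =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

    public static string[] SplitLine(string line)
    {
        return Separator.Split(line)
                        // remove padding and the quotes around quoted fields
                        .Select(field => field.Trim().Trim('"'))
                        .ToArray();
    }
}
```

A line such as `a,"b,c",d` comes back as three fields, with the comma inside the quotes preserved in the second one.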
If you want robust CSV handling, check out FileHelpers
string[] splitString = origString.Split(',');
(Following comment not added by original answerer)
Please keep in mind that this answer addresses the SPECIFIC case where there are guaranteed to be NO commas in the data.
Try:
Regex rex = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string[] values = rex.Split( csvLine );
Source: http://weblogs.asp.net/prieck/archive/2004/01/16/59457.aspx
You can take a look at using the Microsoft.VisualBasic assembly with the
Microsoft.VisualBasic.FileIO.TextFieldParser
It handles CSV (or any delimiter) with quotes. I've found it quite handy recently.
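A minimal sketch of how that parser can be used, assuming the project can reference the Microsoft.VisualBasic assembly. The wrapper class and method are hypothetical; the TextFieldParser members are the ones the type exposes.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvWithTextFieldParser
{
    // Parses CSV text into rows of fields, honouring quoted fields.
    public static List<string[]> Parse(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                // ReadFields returns the fields of the next line with quotes removed.
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

To read a file instead of a string, pass the path to the TextFieldParser constructor rather than a StringReader.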
There isn't a simple way to do this well, if you want to account for quoted elements with embedded commas, especially if they are mixed with non-quoted fields.
You will also probably want to convert the lines to a dictionary, keyed by the column name.
My code to do this is several hundred lines long.
I think there are some examples on the web, open source projects, etc.
Try this;
static IEnumerable<string> CsvParse(string input)
{
// null strings return a one-element enumeration containing null.
if (input == null)
{
yield return null;
yield break;
}
// we will 'eat' bits of the string until it's gone.
String remaining = input;
while (remaining.Length > 0)
{
if (remaining.StartsWith("\"")) // deal with quotes
{
remaining = remaining.Substring(1); // pass over the initial quote.
// find the end quote.
int endQuotePosition = remaining.IndexOf("\"");
switch (endQuotePosition)
{
case -1:
// unclosed quote.
throw new ArgumentOutOfRangeException("Unclosed quote");
case 0:
// the empty quote
yield return "";
break;
default:
string quote = remaining.Substring(0, endQuotePosition).Trim();
yield return quote;
break;
}
// skip the closing quote and the comma that follows it, if any
remaining = remaining.Substring(endQuotePosition + 1);
if (remaining.StartsWith(","))
{
remaining = remaining.Substring(1);
}
}
else // deal with commas
{
int nextComma = remaining.IndexOf(",");
switch (nextComma)
{
case -1:
// no more commas -- read to end
yield return remaining.Trim();
yield break;
case 0:
// the empty cell
yield return "";
remaining = remaining.Substring(1);
break;
default:
// get everything until next comma
string cell = remaining.Substring(0, nextComma).Trim();
remaining = remaining.Substring(nextComma + 1);
yield return cell;
break;
}
}
}
}
CsvString.Split(',');
Get a string[] of all the lines:
string[] lines = System.IO.File.ReadAllLines("yourfile.csv");
Then loop through and split those lines (this is error-prone because it doesn't check for commas in quote-delimited fields):
foreach (string line in lines)
{
string[] items = line.Split(new char[] { ',' });
}
string s = "1,2,3,4,5";
string[] myStrings = s.Split(new char[] { ',' });
Note that Split() takes an array of characters to split on.
Some CSV files have double quotes around the values along with a comma. Therefore sometimes you can split on this string literal: ","
A CSV file with quoted fields is not a CSV file. Far more programs (Excel, for one) write output without quotes than with quotes when you select "CSV" in Save As.
If you want one you can use, free, or commit to, here's mine that also does IDataReader/Record. It also uses DataTable to define/convert/enforce columns and DbNull.
http://github.com/claco/csvdatareader/
It doesn't do quotes.. yet. I just tossed it together a few days ago to scratch an itch.
Forgotten Semicolon: Nice link. Thanks.
cfeduke: Thanks for the tip to Microsoft.VisualBasic.FileIO.TextFieldParser. Going into CsvDataReader tonight.
http://github.com/claco/csvdatareader/ updated using TextFieldParser suggested by cfeduke.
Just a few props away from exposing separators/trimspaces/type if you just need code to steal.
I was already splitting on tabs so this did the trick for me:
public static string CsvToTabDelimited(string line) {
var ret = new StringBuilder(line.Length);
bool inQuotes = false;
for (int idx = 0; idx < line.Length; idx++) {
if (line[idx] == '"') {
inQuotes = !inQuotes;
} else {
if (line[idx] == ',') {
ret.Append(inQuotes ? ',' : '\t');
} else {
ret.Append(line[idx]);
}
}
}
return ret.ToString();
}
string test = "one,two,three";
string[] okNow = test.Split(',');
char[] separationChar = { ';' }; // or '\t' ',' etc.
var strArray = strCSV.Split(separationChar);
string[] splitStrings = myCsv.Split(",".ToCharArray());