Issue with data table Select statement

Issue with data table Select statement - c#

The following VB line, where _DSversionInfo is a DataSet, returns no rows:
_DSversionInfo.Tables("VersionInfo").Select("FileID=88")
but inspection shows that the table contains rows with FileID's of 92, 93, 94, 90, 88, 89, 215, 216. The table columns are all of type string.
Further investigation showed that using the ID of 88, 215 and 216 will only return rows if the number is quoted.
ie _DSversionInfo.Tables("VersionInfo").Select("FileID='88'")
All other rows work regardless of whether the number is quoted or not.
Anyone got an explanation of why this would happen for some numbers but not others? I understand that the numbers should be quoted just not why some work and others don't?
I discovered this in some VB.NET code but (despite my initial finger pointing) don't think it is VB.NET specific.

According to the MSDN documentation on building expressions, strings should always be quoted. Failing to do so produces some bizarro unpredictable behavior... You should quote your number strings to get predictable and proper behavior like the documentation says.
I've encounted what you're describing in the past, and kinda tried to figure it out - here, pop open your favorite .NET editor and try the following:
Create a DataTable, and into a string column 'Stuff' of that DataSet, insert rows in the following order: "6", "74", "710", and Select with the filter expression "Stuff = 710". You will get 1 row back. Now, change the first row into any number greater than 7 - suddenly, you get 0 rows back.
As long as the numbers are ordered in proper descending order using string ordering logic (i.e., 7 comes after 599) the unquoted query appears to work.
My guess is that this is a limitation of how DataSet filter expressions are parsed, and it wasn't meant to work this way...
The Code:
// Unquoted filter string bizzareness.
var table = new DataTable();
table.Columns.Add(new DataColumn("NumbersAsString", typeof(String)));
var row1 = table.NewRow(); row1["NumbersAsString"] = "9"; table.Rows.Add(row1); // Change to '66
var row2 = table.NewRow(); row2["NumbersAsString"] = "74"; table.Rows.Add(row2);
var row4 = table.NewRow(); row4["NumbersAsString"] = "90"; table.Rows.Add(row4);
var row3 = table.NewRow(); row3["NumbersAsString"] = "710"; table.Rows.Add(row3);
var results = table.Select("NumbersAsString = 710"); // Returns 0 rows.
var results2 = table.Select("NumbersAsString = 74"); // Throws exception "Min (1) must be less than or equal to max (-1) in a Range object." at System.Data.Select.GetBinaryFilteredRecords()
Conclusion: Based on the exception text in that last case, there appears to be some wierd casting going on inside filter expressions that is not guaranteed to be safe. Explicitely putting single quotes around the value for which you're querying avoids this problem by letting .NET know that this is a literal.

DataTable builds an index on the columns to make Select() queries fast. That index is sorted by value, then it uses a binary search to select the range of records that matches the query expression.
So the records will be sorted like this 215,216,88,89,90,92,93,94. A binary search is done treating them as integer (as per our filter expression) cannot locate certain records because, it is designed to only search properly sorted collections.
It indexes the data as string and Binary search searches as number. See the below explanation.
string[] strArr = new string[] { "115", "118", "66", "77", "80", "81", "82" };
int[] intArr = new int[] { 215, 216, 88, 89, 90, 92, 93, 94 };
int i88 = Array.BinarySearch(intArr, 88); //returns -ve index
int i89 = Array.BinarySearch(intArr, 89); //returns +ve index
This should be a bug in the framework.

this error usually comes due to invalid data table column type in which you are going to search
i got this error when i was using colConsultDate instead of Convert(colConsultDate, 'System.DateTime')
because colConsultDate was a data table column of type string which i must have to convert into System.DateTime therefor your search query should be like
string query = "Convert(colConsultDate, 'System.DateTime') >= #" + sdateDevFrom.ToString("MM/dd/yy") + "# AND Convert(colConsultDate, 'System.DateTime') <= #" + sdateDevTo.ToString("MM/dd/yy") + "#";
DataRow[] dr = yourDataTable.Select(query);
if (dr.Length > 0)
{
nextDataTabel = dr.CopyToDataTable();
}

#Val Akkapeddi just wanna add things to your answer.
if you do something like this it would be benefited specially when you have to use comparison operators. because you put quotes around 74 it will be treated as string. please see yourself by actually writing code. Comparison operators
(decimal is just for reference you can add your desired datatype instead.)
var results2 = table.Select("Convert(NumbersAsString , 'System.Decimal') = 74.0")

Related

Reading TCURR table with RFC_READ_TABLE truncates the rate value

I'm trying to read data from SAP ECC using Microsoft .NET. For this, I am using the SAP Connector for Microsoft .NET 3.0 Following is the code to retrieve the data, I'm getting the results too. However, I found that the exchange rate value is having a * if it exceeds 7 characters.
ECCDestinationConfig cfg = new ECCDestinationConfig();
RfcDestinationManager.RegisterDestinationConfiguration(cfg);
RfcDestination dest = RfcDestinationManager.GetDestination("mySAPdestination");
RfcRepository repo = dest.Repository;
IRfcFunction testfn = repo.CreateFunction("RFC_READ_TABLE");
testfn.SetValue("QUERY_TABLE", "TCURR");
// fields will be separated by semicolon
testfn.SetValue("DELIMITER", ";");
// Parameter table FIELDS contains the columns you want to receive
// here we query 3 fields, FCURR, TCURR and UKURS
IRfcTable fieldsTable = testfn.GetTable("FIELDS");
fieldsTable.Append();
fieldsTable.SetValue("FIELDNAME", "FCURR");
fieldsTable.Append();
fieldsTable.SetValue("FIELDNAME", "TCURR");
fieldsTable.Append();
fieldsTable.SetValue("FIELDNAME", "UKURS");
fieldsTable.Append();
fieldsTable.SetValue("FIELDNAME", "GDATU");
// the table OPTIONS contains the WHERE condition(s) of your query
// several conditions have to be concatenated in ABAP syntax, for instance with AND or OR
IRfcTable optsTable = testfn.GetTable("OPTIONS");
var dateVal = 99999999 - 20190701;
optsTable.Append();
optsTable.SetValue("TEXT", "gdatu = '" + dateVal + "' and KURST = 'EURX'");
testfn.Invoke(dest);
Values are as follows:
How to get the full value without any truncation?

You just ran into the worst limitation of RFC_READ_TABLE.
Its error is to return field values based on internal length and truncating the rest, rather than using the output length. TCURR-UKURS is a BCD decimal packed field of length 9,5 (9 bytes = 17 digits, including 5 digits after the decimal point) and an output length of 12. Unfortunately, RFC_READ_TABLE outputs the result on 9 characters, so a value of 105.48000- takes 10 characters is too long, so ABAP default logic is to set the * overflow character on the leftmost character (*5.48000-).
Either you create another RFC-enabled function module at SAP/ABAP side, or you access directly the SAP database (classic RDBMS connected to SAP server).

Just an addition to Sandra perfect explanation about this issue. Yes, the only solution here would be writing a custom module for fetching remote records.
If you don't want to rewrite it from scratch the simplest solution would be to copy RFC_READ_TABLE into Z module and change line 137
FIELDS_INT-LENGTH_DST = TABLE_STRUCTURE-LENG.
to
FIELDS_INT-LENGTH_DST = TABLE_STRUCTURE-OUTPUTLEN.
This solves the problem.
UPDATE: try BAPI_EXCHANGERATE_GETDETAIL BAPI, it is RFC-enabled and reads rates correctly. The interface is quite self-explanatory, the only difference is that date should be in native format, not in inverted:
CALL FUNCTION 'BAPI_EXCHANGERATE_GETDETAIL'
EXPORTING
rate_type = 'EURO'
from_curr = 'USD'
to_currncy = 'EUR'
date = '20190101'
IMPORTING
exch_rate = rates
return = return.

Use BBP_RFC_READ_TABLE. It is still not the best but it does one thing right which RFC_READ_TABLE did not: one additional byte for the decimal sign.
No need to go through all the ordeal if you only look for patching the decimal issue.

This is the sample code used with SAP connector for .NET, let it be helpful for someone who looks for the same. Thanks for all those who helped.
var RateForDate = 20190701;
ECCDestinationConfig cfg = new ECCDestinationConfig();
RfcDestinationManager.RegisterDestinationConfiguration(cfg);
RfcDestination dest = RfcDestinationManager.GetDestination("mySAPdestination");
RfcRepository repo = dest.Repository;
IRfcFunction sapFunction = repo.CreateFunction("RFC_READ_TABLE");
sapFunction.SetValue("QUERY_TABLE", "TCURR");
// fields will be separated by semicolon
sapFunction.SetValue("DELIMITER", ";");
// Parameter table FIELDS contains the columns you want to receive
// here we query 3 fields, FCURR, TCURR and UKURS
IRfcTable fieldsTable = sapFunction.GetTable("FIELDS");
fieldsTable.Append();
fieldsTable.SetValue("FIELDNAME", "FCURR");
//fieldsTable.Append();
//fieldsTable.SetValue("FIELDNAME", "TCURR");
//fieldsTable.Append();
//fieldsTable.SetValue("FIELDNAME", "UKURS");
// the table OPTIONS contains the WHERE condition(s) of your query
// here a single condition, KUNNR is to be 0012345600
// several conditions have to be concatenated in ABAP syntax, for instance with AND or OR
IRfcTable optsTable = sapFunction.GetTable("OPTIONS");
var dateVal = 99999999 - RateForDate;
optsTable.Append();
optsTable.SetValue("TEXT", "gdatu = '" + dateVal + "' and KURST = 'EURX'");
sapFunction.Invoke(dest);
var companyCodeList = sapFunction.GetTable("DATA");
DataTable Currencies = companyCodeList.ToDataTable("DATA");
//Add additional column for rates
Currencies.Columns.Add("Rate", typeof(double));
//------------------
sapFunction = repo.CreateFunction("BAPI_EXCHANGERATE_GETDETAIL");
//rate type of your system
sapFunction.SetValue("rate_type", "EURX");
sapFunction.SetValue("date", RateForDate.ToString());
//Main currency of your system
sapFunction.SetValue("to_currncy", "EUR");
foreach (DataRow item in Currencies.Rows)
{
sapFunction.SetValue("from_curr", item[0].ToString());
sapFunction.Invoke(dest);
IRfcStructure impStruct = sapFunction.GetStructure("EXCH_RATE");
item["Rate"] = impStruct.GetDouble("EXCH_RATE_V");
}
dtCompanies.DataContext = Currencies;
RfcDestinationManager.UnregisterDestinationConfiguration(cfg);

Date table column Sum and differencer

i need to make a data table like that:
Subjects old new diff
Sub_1 10 50 40
Sub_2 30 10 -20
total 40 60 20
this is a part of code
DataTable subjects = new DataTable();
subjects.Columns.Add("Subjects");
subjects.Columns.Add("old");
subjects.Columns.Add("new");
subjects.Columns.Add("diff");
subjects.Rows.Add("Sub_1", sub1.Old, sub1.New, (sub1.New - sub1.Old));
subjects.Rows.Add("Sub_2", sub2.Old, sub2.New, (sub2.New - sub2.Old));
subjects.Rows.Add("Total", .. total of above .. , .. total of above .., .. total of above ..);
so i need to ask how to calculate the total value of last column ( Total) , and is there is any other way to calculate the 4th coulmn ( 3rdcol - 2 2nd col )

First of all, you have declared your columns without DataType(thanks to #Steve for comment). So, please change Add() methods as:
subjects.Columns.Add("old", typeof(Int32));
subjects.Columns.Add("new", typeof(Int32));
Also, you can set the value for diff column like this:
subjects.Columns.Add("diff", typeof(Int32), "new - old");
And, then remove any other calculations in Rows.Add method:
subjects.Rows.Add("Sub_1", sub1.Old, sub1.New);
subjects.Rows.Add("Sub_2", sub2.Old, sub2.New);
And then you can use DataTable.Compute(string expression, string filter) method. It computes the given expression on the current rows that pass the filter criteria.
In your case expression will be Sum(columnName) and the filter will be empty string, because you don't need any filter.
subjects.Compute("Sum(old)", "")
So, change your code as:
subjects.Rows.Add("Total",
subjects.Compute("Sum(old)", ""),
subjects.Compute("Sum(new)", ""),
subjects.Compute("Sum(diff)", ""));

Stringbuilder and DataAdapter.Update duplicating lines

I have a DataTable similar to
Col1 Col2 Col3 Col4 Col5 Date
21 22 23 24 25 7/25/2014 12:00:00 AM
31 32 33 34 35 7/25/2014 12:00:00 AM
11 12 13 14 15 7/25/2014 12:00:00 AM
and I loop through it as
StringBuilder output = new StringBuilder("Col1\tCol2\tCol3\tCol4\tCol5\tDate\n");
foreach(DataRow row in partTable.Select()) {
output.AppendLine(String.Join("\t", row.ItemArray.ToArray()));
}
and do some formatting using StringBuilder.Replace on output. I print the result to a MessageBox and it duplicates my rows. The first time I call this it prints 2 copies, the next it prints 3, etc. (After one call.) I have checked repeatedly that the table is correct and doesn't contain duplicates. Below is the full code for this function.
private void printTable() {
updateDataSet();
if (partTable.Rows.Count == 0) {
MessageBox.Show("Table is empty.", "Table");
return;
}
StringBuilder output = new StringBuilder("Col1\tCol2\tCol3\tCol4\tCol5\tDate\n");
foreach(DataRow row in partTable.Select()) {
output.AppendLine(String.Join("\t", row.ItemArray.ToArray()));
}
// Get rid of time and type
output.Replace("12:00:00 AM", "");
output.Replace("W\t", "");
MessageBox.Show(output.ToString(), "Table");
output.Clear();
}
Solution Implemented: Commenting out updateDataSet() removes the duplication. I guess I just need to try to read MSDN more carefully... Replaced Fill with Update, but it would not remove any rows I deleted. Used a combination of Clear and Fill to get an updated table without recreating the connection.

If you call two times the DataAdapter.Fill(DataTable) method you double the records present in your datatable. To avoid this behaviour you need to write (inside the updateDataSet() method)
OleDbDataAdapter da = new OleDbDataAdapter(.....)
DataTable dt = new DataTable();
da.MissingSchemaAction = MissingSchemaAction.AddWithKey;
da.Fill(dt);
// Just for testing
// Check these results with and without the MissingSchemaAction flag
Console.WriteLine(dt.Rows.Count);
da.Fill(dt);
Console.WriteLine(dt.Rows.Count);
Of course, the presence of MissingSchemaAction.AddWithKey results in poorer performances if you don't remove the cause of the second (or third call) to updateDataSet() Infact, in this scenario the loading method should check every row present to find duplicates.

Please put all your code for analysis.
Also, there is logic/design issue in the code which has been provided.
PrintTable method is expected to read and print data only. UpdateDataset is breaking intention of code.

How to identify proper substring length

I'm trying to read column values from this file starting at the arrow position:
Here's my error:
I'm guessing it's because the length values are wrong.
Say I have column with value :"Dog "
with the word dog and a few spaces after it. Do I have to set the length parameter as 3 (for dog) or can I set it as 6 to accommodate the spaces after Dog. This because each column length is fixed. As you can see some words are smaller than others and in order to be consistent I just want to set length as max column length (ex: 28 is length of 3rd column of my file but not all 28 spots are taken up everytime - ex: the word client is only 6 characters long

Robert Levy's answer is correct for the issue you're seeing - you've attempted to pull a substring from a string with a starting position that is greater than the length of the string.
You're parsing a fixed-length field file, where each field has a certain amount of characters, whether or not it uses all of them, and the pos and len arrays are intended to define those field lengths for use with Substring. As long as the line you're reading matches the expected field starts and lengths, you will be ok. As soon as you come to a line that doesn't match (for example, what appears to be the totals line - 0TotalRecords: 3,390,315) the field length definitions you've been using won't work, as the format has changed (and the line length may not even be the same).
There are a couple of things I would change to make this work. First, I would change your pos and len arrays so that they take the entirety of the field, not part of it. You can use Trim() to get rid of any leading or trailing blanks. As defined, your first field will only take the last number of the Seq# (pos 4, len 1), and your second field will only take the first 5 characters of the field, even though it appears to have space for ~12 characters.
Take a look at this (it's hard to be exact working from the picture, but for purposes of demonstration it will work):
1 2 3 4
01234567890123456789012345678901234567890
Seq# Field Description
3 BELNR ACCOUNTING DOCUMENT NBR
The numbers are the position of each charcter in the line. I would define the pos array to be the start of the field (0 for the first field, and then the position of the first letter of the field heading for each field after that), so you would have:
Seq# = 0
Field = 6
Description = 18
The len array would hold the length of the field, which I would define as the amount of characters up to the beginning of the next field, like this:
Seq# = 6
Field = 12
Description = 28 (using what you have as it is hard to tell
This would make your array initialization the following:
int[] pos = new int[3] { 0, 6, 18 };
int[] len = new int[3] { 6, 12, 28 };
If you wanted the fourth field, it would start at position 36 (pos 18 + len 28 = 36).
The second thing is I would check in the loop to see if the Total Records line is there, and skip that line (most likely it's the last line):
foreach (string line in textBox1.Lines)
{
if (!line.Contains("Total Records"))
{
val[j] = line.Substring(pos[j], len[j]).Trim();
}
}
Another way to do this would be to modify the original query and add a TakeWhile clause to it to only take lines until you hit the Total Records one:
string[] lines = File.ReadAllLines(ofd.FileName).Skip(8)
.TakeWhile(l => !l.Contains("Total Records")).ToArray();
The above would skip the first 8 lines and take all the remaining lines up to, but not including, the first line to contain "Total Records" in the string.
Then you could do something like this:
string[] lines = File.ReadAllLines(ofd.FileName).Skip(8)
.TakeWhile(l => !l.Contains("Total Records")).ToArray();
textBox1.Lines = lines;
int[] vale = new int[3];
int[] pos = new int[3] { 0, 6, 18 };
int[] len = new int[3] { 6, 12, 28 };
foreach (string line in textBox1.Lines)
{
val[j] = line.Substring(pos[j], len[j]).Trim();
}
Now you don't have to check for the "Total Records" line.
Of course, if there are other lines in your file, or there are records after the "Total Records" line (which I rather doubt) you'll have to handle those cases as well.
In short, the code for pulling out the substrings will only work for lines that match that particular format (or more specifically, have fields that match those positions/lengths) - anything outside out of that will either give you incorrect values or throw an error (if the start position is greater than the length of the string).

that exception is complaining about the first parameter which suggests that your file contains a row that is < 18 characters

Convert a file full of "INSERT INTO xxx VALUES" in to something Bulk Insert can parse

This is a followup to my first question "Porting “SQL” export to T-SQL".
I am working with a 3rd party program that I have no control over and I can not change. This program will export it's internal database in to a set of .sql each one with a format of:
INSERT INTO [ExampleDB] ( [IntField] , [VarcharField], [BinaryField])
VALUES
(1 , 'Some Text' , 0x123456),
(2 , 'B' , NULL),
--(SNIP, it does this for 1000 records)
(999, 'E' , null);
(1000 , 'F' , null);
INSERT INTO [ExampleDB] ( [IntField] , [VarcharField] , BinaryField)
VALUES
(1001 , 'asdg', null),
(1002 , 'asdf' , 0xdeadbeef),
(1003 , 'dfghdfhg' , null),
(1004 , 'sfdhsdhdshd' , null),
--(SNIP 1000 more lines)
This pattern continues till the .sql file has reached a file size set during the export, the export files are grouped by EXPORT_PATH\%Table_Name%\Export#.sql Where the # is a counter starting at 1.
Currently I have about 1.3GB data and I have it exporting in 1MB chunks (1407 files across 26 tables, All but 5 tables only have one file, the largest table has 207 files).
Right now I just have a simple C# program that reads each file in to ram then calls ExecuteNonQuery. The issue is I am averaging 60 sec/file which means it will take about 23 hrs for it to do the entire export.
I assume if I some how could format the files to be loaded with a BULK INSERT instead of a INSERT INTO it could go much faster. Is there any easy way to do this or do I have to write some kind of Find & Replace and keep my fingers crossed that it does not fail on some corner case and blow up my data.
Any other suggestions on how to speed up the insert into would also be appreciated.
UPDATE:
I ended up going with the parse and do a SqlBulkCopy method. It went from 1 file/min. to 1 file/sec.

Well, here is my "solution" for helping convert the data into a DataTable or otherwise (run it in LINQPad):
var i = "(null, 1 , 'Some''\n Text' , 0x123.456)";
var pat = #",?\s*(?:(?<n>null)|(?<w>[\w.]+)|'(?<s>.*)'(?!'))";
Regex.Matches(i, pat,
RegexOptions.IgnoreCase | RegexOptions.Singleline).Dump();
The match should be run once per value group (e.g. (a,b,etc)). Parsing of the results (e.g. conversion) is left to the caller and I have not tested it [much]. I would recommend creating the correctly-typed DataTable first -- although it may be possible to pass everything "as a string" to the database? -- and then use the information in the columns to help with the extraction process (possibly using type converters). For the captures: n is null, w is word (e.g. number), s is string.
Happy coding.

Apparently your data is always wrapped in parentheses and starts with a left parenthesis. You might want to use this rule to split(RemoveEmptyEntries) each of those lines and load it into a DataTable. Then you can use SqlBulkCopy to copy all at once into the database.
This approach would not necessarily be fail-safe, but it would be certainly faster.
Edit: Here's the way how you could get the schema for every table:
private static DataTable extractSchemaTable(IEnumerable<String> lines)
{
DataTable schema = null;
var insertLine = lines.SkipWhile(l => !l.StartsWith("INSERT INTO [")).Take(1).First();
var startIndex = insertLine.IndexOf("INSERT INTO [") + "INSERT INTO [".Length;
var endIndex = insertLine.IndexOf("]", startIndex);
var tableName = insertLine.Substring(startIndex, endIndex - startIndex);
using (var con = new SqlConnection("CONNECTION"))
{
using (var schemaCommand = new SqlCommand("SELECT * FROM " tableName, con))
{
con.Open();
using (var reader = schemaCommand.ExecuteReader(CommandBehavior.SchemaOnly))
{
schema = reader.GetSchemaTable();
}
}
}
return schema;
}
Then you simply need to iterate each line in the file, check if it starts with ( and split that line by Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries). Then you could add the resulting array into the created schema-table.
Something like this:
var allLines = System.IO.File.ReadAllLines(path);
DataTable result = extractSchemaTable(allLines);
for (int i = 0; i < allLines.Length; i++)
{
String line = allLines[i];
if (line.StartsWith("("))
{
String data = line.Substring(1, line.Length - (line.Length - line.LastIndexOf(")")) - 1);
var fields = data.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
// you might need to parse it to correct DataColumn.DataType
result.Rows.Add(fields);
}
}

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Issue with data table Select statement - c#

Related

Reading TCURR table with RFC_READ_TABLE truncates the rate value

Date table column Sum and differencer

Stringbuilder and DataAdapter.Update duplicating lines

How to identify proper substring length

Convert a file full of "INSERT INTO xxx VALUES" in to something Bulk Insert can parse

Categories

Resources