C# - LINQ to combine (or join) two DataTables into one

I'm having a problem getting the correct data from two DataTables into one using LINQ in C#.
The data in my DataTables comes from reading an Excel file (not from a DB).
I tried the LINQ query below, but the returned row count is not what I want (my goal is to retrieve all the data; I'm checking the row count only as a quick way to verify whether the result is correct).
In dt1, I have 2645 records.
In dt2, I have 2600 records.
The returned row count is 2600 (it looks like it is applying right-join logic).
var v1 = from d1 in dt1.AsEnumerable()
         from d2 in dt2.AsEnumerable()
             .Where(x => x.Field<string>(X_ITEM_CODE) == d1.Field<string>(X_NO)
                      || x.Field<string>(X_ITEM_KEY) == d1.Field<string>(X_NO))
         select dt1.LoadDataRow(new object[]
         {
             // I use the indexer shortcut instead of Field<string> for testing purposes.
             d1[X_NO],
             d2[X_ITEM_CODE] == null ? "" : d2[X_ITEM_CODE],
             d2[X_ITEM_KEY] == null ? "" : d2[X_ITEM_KEY],
             d2[X_COST],
             d2[X_DESC],
             d2[X_QTY] == null ? 0 : d2[X_QTY]
         }, false);
dt1 = v1.CopyToDataTable();
Console.WriteLine(dt1.Rows.Count);
I tried to use 'join', but my problem is that the X_NO value can be in either X_ITEM_CODE or X_ITEM_KEY, and I can only put one condition in ON xxx equals yyy.
I would still like to use 'join' if it can handle this condition. Please give me some guidance. Thanks.
[Additional Info]
I already tried a foreach loop + dt1.Select(xxxx) + dt1.Rows.Add(xxx). It works well but takes around 2 minutes to complete the job.
I'm looking for a faster way, and the LINQ code I tried above seems faster than my foreach loop, so I want to give LINQ a chance.
For demo purposes I only show a few columns in the example above; my actual tables have 12 columns.
I was afraid my post would become very long if I included my foreach loop, so I left it out when I first posted this question.
Anyway, below are the code and sample data. If you can edit and think it is too long, kindly take out any unnecessary or unrelated code or lines.
DataRow[] drs = null;
DataRow drO = null;
foreach (DataRow drY in dt2.Rows)
{
    drs = dt1.Select(X_NO + "='" + drY[X_ITEM_KEY] + "' OR " + X_NO + "='" + drY[X_ITEM_CODE] + "'");
    if (drs.Length > 0)
    {
        // drs.Length will always be 1 here because there are no duplicates.
        drs[0][X_ITEM_CODE] = drY[X_ITEM_CODE];
        drs[0][X_ITEM_KEY] = drY[X_ITEM_KEY];
        drs[0][X_COST] = clsD.GetInt(drY[X_COST]); // If null, return 0.
        drs[0][X_DESC] = clsD.GetStr(drY[X_DESC]); // If null, return "".
        drs[0][X_QTY] = clsD.GetInt(drY[X_QTY]);
    }
    else
    {
        // Not found in ITEM CODE or KEY, add it.
        drO = dtOutput.NewRow();
        drO[X_ITEM_CODE] = drY[X_ITEM_CODE];
        drO[X_ITEM_KEY] = drY[X_ITEM_KEY];
        drO[X_COST] = clsD.GetInt(drY[X_COST]);
        drO[X_DESC] = clsD.GetStr(drY[X_DESC]);
        drO[X_QTY] = clsD.GetInt(drY[X_QTY]);
        dt1.Rows.Add(drO);
    }
}
// Note: I haven't put the else branch above into my LINQ test yet.
// Without the else branch, dt1 keeps the same record count.
[dt1 data]
X_NO,X_ITEM_CODE,X_ITEM_KEY,COST,DESC,QTY,....
AA060210A,,,,,,....
AB060220A,,,,,....
AC060230A,,,,,....
AD060240A,,,,,....
[dt2 data]
X_ITEM_CODE,X_ITEM_KEY,COST,DESC,QTY
AA060210A,AA060211A,100.00,PART1,10000
AB060221A,AB060220A,120.00,PART2,500
AC060232A,AC060230A,150.00,PART3,100
AD060240A,AD060243A,4.50,PART4,15250
[Update 2]
I tried the 'join' below and it returns nothing. So, can I assume join will not help either?
var vTemp1 = from d1 in dt1.AsEnumerable()
             join d2 in dt2.AsEnumerable()
                 on 1 equals 1
             where (d1[X_NO] == d2[X_ITEM_CODE] || d1[X_NO] == d2[X_ITEM_KEY])
             select dt1.LoadDataRow(new object[]
             {
                 d1[X_NO],
                 d2[X_ITEM_CODE] == null ? "" : d2[X_ITEM_CODE],
                 d2[X_ITEM_KEY] == null ? "" : d2[X_ITEM_KEY],
                 d2[X_COST],
                 d2[X_DESC],
                 d2[X_QTY] == null ? 0 : d2[X_QTY]
             }, false);
Console.WriteLine(vTemp1.Count()); // returns zero.

LINQ supports only equijoins, so the join operator cannot express this OR condition. And a LINQ query with a Cartesian product and a where clause will not give you any performance improvement.
What you really need (LINQ or not) is a fast lookup by the dt1[X_NO] field. Since, as you said, it is unique, you can build and use a dictionary for that:
var dr1ByXNo = dt1.AsEnumerable().ToDictionary(dr => dr.Field<string>(X_NO));
and then modify your process like this:
DataRow dr0; // matched dt1 row (digit zero); drO (letter O) is the new row from your code
foreach (DataRow drY in dt2.Rows)
{
    if (dr1ByXNo.TryGetValue(drY.Field<string>(X_ITEM_KEY), out dr0) ||
        dr1ByXNo.TryGetValue(drY.Field<string>(X_ITEM_CODE), out dr0))
    {
        dr0[X_ITEM_CODE] = drY[X_ITEM_CODE];
        dr0[X_ITEM_KEY] = drY[X_ITEM_KEY];
        dr0[X_COST] = clsD.GetInt(drY[X_COST]); // If null, return 0.
        dr0[X_DESC] = clsD.GetStr(drY[X_DESC]); // If null, return "".
        dr0[X_QTY] = clsD.GetInt(drY[X_QTY]);
    }
    else
    {
        // Not found in ITEM CODE or KEY, add it.
        drO = dtOutput.NewRow();
        drO[X_ITEM_CODE] = drY[X_ITEM_CODE];
        drO[X_ITEM_KEY] = drY[X_ITEM_KEY];
        drO[X_COST] = clsD.GetInt(drY[X_COST]);
        drO[X_DESC] = clsD.GetStr(drY[X_DESC]);
        drO[X_QTY] = clsD.GetInt(drY[X_QTY]);
        dt1.Rows.Add(drO);
    }
}
Since you are adding new records to dt1 during the process, depending on your requirements you might need to add the following at the end of the else (after the dt1.Rows.Add(drO); line):
dr1ByXNo.Add(drO.Field<string>(X_NO), drO);
I didn't include it because I don't see your code setting the X_NO field of the new record, so the above would throw when adding the key (an unset X_NO gives a null key).
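As a rough, self-contained illustration of the dictionary lookup described above, here is a minimal sketch. It is not the asker's real code: the column names, the reduced column set, the MergeSketch class and the third dt2 row ("ZZ...") are assumptions made up for the demo, and the sketch creates new rows with dt1.NewRow() so that the row belongs to the table it is added to. On .NET Framework the DataTable extension methods need a reference to System.Data.DataSetExtensions; the sketch avoids them and uses plain indexers.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class MergeSketch
{
    // Assumed column names standing in for the question's X_* constants.
    public const string No = "X_NO", Code = "X_ITEM_CODE", Key = "X_ITEM_KEY", Desc = "DESC";

    // Updates matching dt1 rows in place, appends unmatched dt2 rows,
    // and returns how many dt2 rows found a match.
    public static int Merge(DataTable dt1, DataTable dt2)
    {
        // Build the lookup once; rows with an empty X_NO cannot be matched and are skipped.
        var byNo = new Dictionary<string, DataRow>();
        foreach (DataRow row in dt1.Rows)
        {
            var no = row[No] as string;
            if (!string.IsNullOrEmpty(no)) byNo[no] = row;
        }

        int matched = 0;
        foreach (DataRow src in dt2.Rows)
        {
            var key = src[Key] as string;   // null when the cell is DBNull
            var code = src[Code] as string;

            DataRow target = null;
            if (key != null) byNo.TryGetValue(key, out target);
            if (target == null && code != null) byNo.TryGetValue(code, out target);

            if (target == null)
            {
                target = dt1.NewRow();      // the new row is created by the table that will own it
                dt1.Rows.Add(target);
            }
            else
            {
                matched++;
            }

            target[Code] = src[Code];
            target[Key] = src[Key];
            target[Desc] = src[Desc];
        }
        return matched;
    }

    // Builds two tiny tables (two rows taken from the question's sample, one invented) and merges them.
    public static string Demo()
    {
        var dt1 = new DataTable();
        foreach (var c in new[] { No, Code, Key, Desc }) dt1.Columns.Add(c, typeof(string));
        dt1.Rows.Add("AA060210A");
        dt1.Rows.Add("AB060220A");

        var dt2 = new DataTable();
        foreach (var c in new[] { Code, Key, Desc }) dt2.Columns.Add(c, typeof(string));
        dt2.Rows.Add("AA060210A", "AA060211A", "PART1");   // matches on item code
        dt2.Rows.Add("AB060221A", "AB060220A", "PART2");   // matches on item key
        dt2.Rows.Add("ZZ000000A", "ZZ000001A", "PART9");   // invented row with no match

        int matched = Merge(dt1, dt2);
        return matched + " matched, " + dt1.Rows.Count + " rows";
    }
}
```

Each dt2 row costs at most two hash lookups instead of a scan of dt1 through Select, which is where the speed-up over the original loop would be expected to come from.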

Related

How to find exact matching number from comma separated string in LINQ

I have a column in the database that contains error codes like "2,3,7,5,6,17". I read it as a string and want to match an exact number in it using LINQ.
For example, I want to find 7 (this number is supplied dynamically) inside this string and return true if it is there. The problem is that the string also contains 17, so rows containing 17 are fetched even when 7 is not present. I used Contains() for that. Please suggest a solution that returns only the rows with an exact match inside this string.
Note: I am fetching this string from the database using LINQ.
Image to show Rows fetched from db
In this image you can see that the row with 17 is also fetched even though it doesn't contain 7. I want rows for 7 only (exact match).
The code I used:
List<TableModel> result = from DN in dbContext.Example where DN.ErrorType.Contains(ErrorTypetxt).Select({})
DN.ErrorType is the complete string fetched from the DB, i.e. 2,3,7,5,6,17.
ErrorTypetxt is the value 7 that I want to match exactly.
How can I resolve this?
var num = 7;
var start = $"{num},";
var end = $",{num}";
var mid = $",{num},";
var same = $"{num}";
var query =
    from dn in source
    where
        dn.ErrorType.StartsWith(start) ||
        dn.ErrorType.Contains(mid) ||
        dn.ErrorType.EndsWith(end) ||
        dn.ErrorType == same
    select dn;
Or
var num = 7;
var mid = $",{num},";
var query =
    from dn in source
    let modified = "," + dn.ErrorType + ","
    where modified.Contains(mid)
    select dn;
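To see why wrapping the string in commas avoids the 7-versus-17 problem, here is a small in-memory sketch of the same idea. The ErrorCodeMatch class and its method names are made up for illustration, and it assumes the stored list has no spaces around the commas.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ErrorCodeMatch
{
    // True when 'code' appears as a whole item in a comma-separated list such as "2,3,7,5,6,17".
    public static bool HasCode(string csv, int code)
    {
        if (string.IsNullOrEmpty(csv)) return false;
        // ",2,3,17," contains ",17," but not ",7,", so partial matches are excluded.
        return ("," + csv + ",").Contains("," + code + ",");
    }

    // Keeps only the lists that contain 'code' as a whole item.
    public static List<string> Filter(IEnumerable<string> rows, int code)
    {
        return rows.Where(r => HasCode(r, code)).ToList();
    }
}
```

For example, HasCode("2,3,7,5,6,17", 7) is true while HasCode("2,3,17", 7) is false, and a single-item list such as "7" also matches.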
var expected = "7";
var queryResult = "1,2,17,7";
var splitResult = queryResult.Split(',');
bool result = !string.IsNullOrEmpty(splitResult.FirstOrDefault(s => string.Equals(s, expected)));
//returns true
You can solve this by splitting the string first and then checking the parts for an exact match.
Please check this:
https://dotnetfiddle.net/80k13h
Update after the first comment:
Use this structure:
TestTables
    .Where(x =>
        ("," + x.Subjects + ",").Contains(",7,") ||
        (x.Subjects.EndsWith("7") && x.Subjects.Length == 1) ||
        (x.Subjects.StartsWith("7,") && x.Subjects.Length == 2) ||
        (x.Subjects.Contains(",7,") && x.Subjects.Length > 2 &&
         x.Subjects[0] != '7' && x.Subjects[x.Subjects.Length - 1] != '7')
    )
    .ToList()
This generates the following SQL in LINQPad and gives the expected result.

LINQ does not recognize joined table in select clause

I am left-joining expected records to a returned set and trying to determine whether the expected column was updated correctly. The column to be updated is determined by a string in the expected row.
Problem: I have a compile error I don't understand:
Cannot resolve symbol dbRow
(at the places marked with ** in the QtyUpdated field).
var x = from addRow in expected.AsEnumerable()
        join dbRow in dtDB.AsEnumerable()
            on new { key1 = addRow[0], key2 = addRow[1], key3 = addRow[3] }
            equals new { key1 = dbRow["TransactionID"],
                         key2 = dbRow["TransactionItemID"],
                         key3 = dbRow["DeliverDate"] }
            into result
        from r in result.DefaultIfEmpty()
        select new
        {
            TID = addRow[0], ItemID = addRow[1], DeliveryDate = addRow[3],
            QtyUpdated = (
                addRow[6].ToString() == "Estimated" ? **dbRow**["EstimatedQuantity"] == (decimal)addRow[5] :
                addRow[6].ToString() == "Scheduled" ? **dbRow**["ScheduledQuantity"] == (decimal)addRow[5] :
                addRow[6].ToString() == "Actual" ? **dbRow**["ActualQuantity"] == (decimal)addRow[5] : false)
        };
I know this seems wonky; it's a tool for QA to test that the Add function in our API actually worked.
Yes, dbRow is only in scope within the equals part of the join. However, you're not using your r range variable - which contains the matched rows for the current addRow... or null.
Just change each dbRow in the select to r. But then work out what you want to happen when there aren't any matched rows, so r is null.
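A minimal sketch of that change, using plain in-memory classes instead of DataRows so it stays self-contained. The Expected and Stored types and their members are hypothetical, and treating "no matching row" as "not updated" is just one possible choice for the null case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeftJoinSketch
{
    // Hypothetical stand-ins for the expected rows and the rows read back from the database.
    public class Expected { public int Id; public string Kind; public decimal Qty; }
    public class Stored { public int Id; public decimal EstimatedQuantity; public decimal ActualQuantity; }

    public static List<bool> QtyUpdated(IEnumerable<Expected> expected, IEnumerable<Stored> stored)
    {
        var query =
            from e in expected
            join s in stored on e.Id equals s.Id into matches
            from r in matches.DefaultIfEmpty()      // r is null when nothing matched
            select r != null && (                   // 's' is out of scope here; use 'r'
                e.Kind == "Estimated" ? r.EstimatedQuantity == e.Qty :
                e.Kind == "Actual" ? r.ActualQuantity == e.Qty :
                false);
        return query.ToList();
    }
}
```

With DataRows the same shape applies: read the value through r once r has been checked for null.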

How can I handle this LINQ query and compare it with an int in an if-statement?

I have the following code:
int selectedcourseId = Convert.ToInt32(c1.Text);
var cid = (from g in re.Sections where g.CourseID == selectedcourseId select g.CourseID);
int selectedinstructorid = Convert.ToInt32(c2.Text);
var iid = (from u in re.Sections where u.InstructorID == selectedinstructorid select u.InstructorID);
I want to compare selectedcourseId with cid and selectedinstructorid with iid in an if-statement, such as:
if (selectedcourseId = cid && selectedinstructorid = iid)
{
    MessageBox.Show("it already exists");
}
I have tried many things that didn't work out because I have limited knowledge.
Thank you very much in advance for any comment or answer.
You can change your code to the following (although this check is meaningless in your situation, and First() throws if the query returns no rows):
if (selectedcourseId == cid.First() && selectedinstructorid == iid.First())
First of all, to check equality in an if statement you must use ==, not =. Second, IQueryable<T> lets you run a query against a specific data source, but it uses deferred execution. To execute it in your case, you can use First().
But I suspect that you are just learning how to use LINQ and that is why you have written this code.
I don't know what you are trying to achieve. But if you want to check whether there is any result with those IDs, then you should use Any():
var result1 = from g in re.Sections where g.CourseID == selectedcourseId select g.CourseID;
var result2 = from u in re.Sections where u.InstructorID == selectedinstructorid select u.InstructorID;
if (result1.Any() && result2.Any()) { ... }
Or, if you want to find out whether there is any row that has both the specified CourseID and InstructorID, you can call a single Any():
if (re.Sections.Any(x => x.CourseID == selectedcourseId && x.InstructorID == selectedinstructorid))
{ ... }
Let me try to find the X from this XY-problem. I guess you want to check if there is already a combination of courseid + instructorid. Then use a single query:
var data = from section in re.Sections
           where section.InstructorID == selectedinstructorid
              && section.CourseID == selectedcourseId
           select section;
if (data.Any())
{
    MessageBox.Show("it already exists");
}
You should not do it in two queries, because the two results might come from two different rows. This would lead to "false positives" when an instructor handles some section, and a course has some instructors, but the two matches do not belong to the same row:
course  instructor
------  ----------
100     10
101     15
102     20
If you are looking for the combination (101, 10), it is not enough to see that 101 is present and 10 is present; you need to check that the two belong to the same row in order to consider it a duplicate.
You can fix this by making a "check presence" query, like this:
var existing = re.Sections
    .Any(s => s.InstructorID == selectedinstructorid && s.CourseID == selectedcourseId);
if (existing) {
    MessageBox.Show("it already exists");
}
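The false positive can be reproduced in memory with the three rows from the table above. This is only a sketch: the Section class here is a hypothetical stand-in for the real entity, and LINQ to Objects is used instead of a database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SectionCheck
{
    // Hypothetical stand-in for the Sections entity.
    public class Section { public int CourseID; public int InstructorID; }

    // Two independent checks: each value may come from a different row.
    public static bool ExistsSeparately(IEnumerable<Section> sections, int courseId, int instructorId)
    {
        return sections.Any(s => s.CourseID == courseId)
            && sections.Any(s => s.InstructorID == instructorId);
    }

    // One check: both values must be found on the same row.
    public static bool ExistsTogether(IEnumerable<Section> sections, int courseId, int instructorId)
    {
        return sections.Any(s => s.CourseID == courseId && s.InstructorID == instructorId);
    }

    // The three rows from the table above.
    public static List<Section> SampleRows()
    {
        return new List<Section>
        {
            new Section { CourseID = 100, InstructorID = 10 },
            new Section { CourseID = 101, InstructorID = 15 },
            new Section { CourseID = 102, InstructorID = 20 }
        };
    }
}
```

For the combination (101, 10), ExistsSeparately returns true on the sample rows even though no such row exists, while ExistsTogether returns false.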
if (selectedcourseId = cid && selectedinstructorid = iid)
This will not work, since a single '=' is an assignment, not a comparison (which is '==').
Also, you can try something like this:
var cid = (int)((from g in re.Sections where g.CourseID == selectedcourseId select g.CourseID).FirstOrDefault());
so that you select the first record from the query (or the default value if there is none) and cast it to int.

Group by and find Max value in DataTable rows

I have existing code that works very well and it finds the maximum value of a data column in the data table. Now I would like to refine this and find the maximum value per empid.
What change would be needed? I do not want to use LINQ.
I am right now using this: memberSelectedTiers.Select("Insert_Date = MAX(Insert_Date)")
and I need to group it by Empid.
My code is as below.
DataTable memberApprovedTiers = GetSupplierAssignedTiersAsTable(this.Customer_ID, this.Contract_ID);
// get row with maximum Insert_Date in memberSelectedTiers
DataRow msRow = null;
if (memberSelectedTiers != null && memberSelectedTiers.Rows != null && memberSelectedTiers.Rows.Count > 0)
{
    DataRow[] msRows = memberSelectedTiers.Select("Insert_Date = MAX(Insert_Date)");
    if (msRows != null && msRows.Length > 0)
    {
        msRow = msRows[0];
    }
}
You can use LINQ to achieve this. I think the following will work (don't have VS to test):
var grouped = memberSelectedTiers.AsEnumerable()
    .GroupBy(r => r.Field<int>("EmpId"))
    .Select(grp =>
        new {
            EmpId = grp.Key,
            MaxDate = grp.Max(e => e.Field<DateTime>("Insert_Date"))
        });
Daniel Kelley, your answer helped me and that's great, but did you notice the OP stated he didn't want to use LINQ?
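Since the question asks for a solution without LINQ, one possible approach is a single pass over the rows that keeps the latest row per employee in a dictionary. This is only a sketch: it assumes EmpId is stored as int and Insert_Date as DateTime (the same types the LINQ answer assumes), and the LatestPerEmployee class is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class LatestPerEmployee
{
    // Returns, for each EmpId, the row with the greatest Insert_Date.
    public static Dictionary<int, DataRow> Find(DataTable table)
    {
        var latest = new Dictionary<int, DataRow>();
        if (table == null) return latest;

        foreach (DataRow row in table.Rows)
        {
            // Rows without an employee id or date cannot take part in the comparison.
            if (row.IsNull("EmpId") || row.IsNull("Insert_Date")) continue;

            int empId = (int)row["EmpId"];
            DateTime inserted = (DateTime)row["Insert_Date"];

            DataRow best;
            if (!latest.TryGetValue(empId, out best) || inserted > (DateTime)best["Insert_Date"])
            {
                latest[empId] = row;   // first row seen for this employee, or a newer one
            }
        }
        return latest;
    }
}
```

The result can then be read per employee, for example Find(memberSelectedTiers)[empId], after checking that the key is present.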

Using Intersect I'm getting a Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator

I'm using a LINQ to SQL query to provide a list of search term matches against a database field. The search terms are an in-memory string array. Specifically, I'm using an "intersect" within the LINQ query, comparing the search terms with a database field "Description". In the code below, the description field is iss.Description. The description field is split into an array within the LINQ query, and the intersect compares the search terms with the description terms, so that all of the comparing and conditions stay within the LINQ query and the database is not taxed. In my research, trying to overcome the problem, I have found that the use of an in-memory, or "local", sequence is not supported. I have also tried a few suggestions found during my research, like using "AsEnumerable" or "AsQueryable", without success.
searchText = searchText.ToUpper();
var searchTerms = searchText.Split(' ');
var issuesList1 = (
    from iss in DatabaseConnection.CustomerIssues
    let desc = iss.Description.ToUpper().Split(' ')
    let count = desc.Intersect(searchTerms).Count()
    where desc.Intersect(searchTerms).Count() > 0
    join stoi in DatabaseConnection.SolutionToIssues on iss.IssueID equals stoi.IssueID into stoiToiss
    from stTois in stoiToiss.DefaultIfEmpty()
    join solJoin in DatabaseConnection.Solutions on stTois.SolutionID equals solJoin.SolutionID into solutionJoin
    from solution in solutionJoin.DefaultIfEmpty()
    select new IssuesAndSolutions
    {
        IssueID = iss.IssueID,
        IssueDesc = iss.Description,
        SearchHits = count,
        SolutionDesc = (solution.Description == null) ? "No Solutions" : solution.Description,
        SolutionID = (solution.SolutionID == null) ? 0 : solution.SolutionID,
        SolutionToIssueID = (stTois.SolutionToIssueID == null) ? 0 : stTois.SolutionToIssueID,
        Successful = (stTois.Successful == null) ? false : stTois.Successful
    }).ToList();
...
The only way I have been successful is to create two queries and call a method, as shown below. But this requires the first LINQ query to return all of the records (with the number of search-term hits in the description), including the non-matched ones, as an in-memory List<>, and then a second LINQ query to filter out the non-matched records.
public static int CountHits(string[] searchTerms, string Description)
{
    int hits = 0;
    foreach (string item in searchTerms)
    {
        if (Description.ToUpper().Contains(item.Trim().ToUpper())) hits++;
    }
    return hits;
}
public static List<IssuesAndSolutions> SearchIssuesAndSolutions(string searchText)
{
    using (BYCNCDatabaseDataContext DatabaseConnection = new BYCNCDatabaseDataContext())
    {
        searchText = searchText.ToUpper();
        var searchTerms = searchText.Split(' ');
        var issuesList1 = (
            from iss in DatabaseConnection.CustomerIssues
            join stoi in DatabaseConnection.SolutionToIssues on iss.IssueID equals stoi.IssueID into stoiToiss
            from stTois in stoiToiss.DefaultIfEmpty()
            join solJoin in DatabaseConnection.Solutions on stTois.SolutionID equals solJoin.SolutionID into solutionJoin
            from solution in solutionJoin.DefaultIfEmpty()
            select new IssuesAndSolutions
            {
                IssueID = iss.IssueID,
                IssueDesc = iss.Description,
                SearchHits = CountHits(searchTerms, iss.Description),
                SolutionDesc = (solution.Description == null) ? "No Solutions" : solution.Description,
                SolutionID = (solution.SolutionID == null) ? 0 : solution.SolutionID,
                SolutionToIssueID = (stTois.SolutionToIssueID == null) ? 0 : stTois.SolutionToIssueID,
                Successful = (stTois.Successful == null) ? false : stTois.Successful
            }).ToList();
        var issuesList = (
            from iss in issuesList1
            where iss.SearchHits > 0
            select iss).ToList();
        ...
I would be comfortable with two LINQ queries if the first one returned only the matched records and a second one (maybe a lambda expression) ordered them, but my attempts have not been successful.
Any help would be most appreciated.
OK, so after more searching, trying more techniques, and trying user1010609's technique, I managed to get it working after an almost complete rewrite. The following code first defines a flat record query with all of the information I am searching; then a new list is built from the records that match the search terms (counting the hits of each search term so the results can be ordered by relevance). I was careful not to materialize the flat query as a list, so there would be some efficiency in the final database retrieval (during the building of the filtered List<>). I am positive this is not even close to being an efficient method, but it works. I am eager to see more and different techniques for solving this type of problem. Thanks!
searchText = searchText.ToUpper();
List<string> searchTerms = searchText.Split(' ').ToList();
var allIssues =
    from iss in DatabaseConnection.CustomerIssues
    join stoi in DatabaseConnection.SolutionToIssues on iss.IssueID equals stoi.IssueID into stoiToiss
    from stTois in stoiToiss.DefaultIfEmpty()
    join solJoin in DatabaseConnection.Solutions on stTois.SolutionID equals solJoin.SolutionID into solutionJoin
    from solution in solutionJoin.DefaultIfEmpty()
    select new IssuesAndSolutions
    {
        IssueID = iss.IssueID,
        IssueDesc = iss.Description,
        SolutionDesc = (solution.Description == null) ? "No Solutions" : solution.Description,
        SolutionID = (solution.SolutionID == null) ? 0 : solution.SolutionID,
        SolutionToIssueID = (stTois.SolutionToIssueID == null) ? 0 : stTois.SolutionToIssueID,
        Successful = (stTois.Successful == null) ? false : stTois.Successful
    };
List<IssuesAndSolutions> filteredIssues = new List<IssuesAndSolutions>();
foreach (var issue in allIssues)
{
    int hits = 0;
    foreach (var term in searchTerms)
    {
        if (issue.IssueDesc.ToUpper().Contains(term.Trim())) hits++;
    }
    if (hits > 0)
    {
        IssuesAndSolutions matchedIssue = new IssuesAndSolutions();
        matchedIssue.IssueID = issue.IssueID;
        matchedIssue.IssueDesc = issue.IssueDesc;
        matchedIssue.SearchHits = hits;
        matchedIssue.CustomerID = issue.CustomerID;
        matchedIssue.AssemblyID = issue.AssemblyID;
        matchedIssue.DateOfIssue = issue.DateOfIssue;
        matchedIssue.DateOfResolution = issue.DateOfResolution;
        matchedIssue.CostOFIssue = issue.CostOFIssue;
        matchedIssue.ProductID = issue.ProductID;
        filteredIssues.Add(matchedIssue);
    }
}
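The in-memory part of this approach (count the hits per description, drop the non-matches, order by relevance) can also be written as one LINQ to Objects pipeline. The following is only a sketch of that step: the SearchSketch class and the Hit type are hypothetical, it works on plain strings rather than the IssuesAndSolutions records, and with LINQ to SQL the rows would still have to be brought into memory before it runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SearchSketch
{
    // Hypothetical result type: a description plus the number of search terms it contains.
    public class Hit { public string Description; public int Hits; }

    public static List<Hit> Rank(IEnumerable<string> descriptions, string searchText)
    {
        // Split once, ignoring repeated spaces so that no empty term is produced.
        var terms = searchText.ToUpperInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        return descriptions
            .Select(d => new Hit
            {
                Description = d,
                // Substring match, as in the loop above; a null description counts as no hits.
                Hits = terms.Count(t => (d ?? "").ToUpperInvariant().Contains(t))
            })
            .Where(h => h.Hits > 0)             // drop non-matching records
            .OrderByDescending(h => h.Hits)     // most relevant first
            .ToList();
    }
}
```

For example, ranking the descriptions "oil pump", "motor noise" and "oil leak at seal" against the search text "oil leak" keeps two of them, with "oil leak at seal" (two hits) ahead of "oil pump" (one hit).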
