Group by and find max value in DataTable rows - c#

I have existing code that works well: it finds the maximum value of a column in a DataTable. Now I would like to refine it to find the maximum value per EmpId.
What change would be needed? I do not want to use LINQ.
Right now I am using this: memberSelectedTiers.Select("Insert_Date = MAX(Insert_Date)")
and I need to group it by EmpId.
My code is below.
DataTable memberApprovedTiers = GetSupplierAssignedTiersAsTable(this.Customer_ID, this.Contract_ID);
// get the row with the maximum Insert_Date in memberSelectedTiers
DataRow msRow = null;
if (memberSelectedTiers != null && memberSelectedTiers.Rows != null && memberSelectedTiers.Rows.Count > 0)
{
    DataRow[] msRows = memberSelectedTiers.Select("Insert_Date = MAX(Insert_Date)");
    if (msRows != null && msRows.Length > 0)
    {
        msRow = msRows[0];
    }
}

You can use LINQ to achieve this. I think the following will work (don't have VS to test):
var grouped = memberSelectedTiers.AsEnumerable()
    .GroupBy(r => r.Field<int>("EmpId"))
    .Select(grp =>
        new {
            EmpId = grp.Key,
            MaxDate = grp.Max(e => e.Field<DateTime>("Insert_Date"))
        });

Daniel Kelley, your answer helped me and that's great, but did you notice the OP stated he didn't want to use LINQ?
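
For a version without LINQ, one option is a single pass over the rows with a Dictionary keyed by EmpId. This is only a sketch, not tested against the OP's data: it assumes EmpId is an int column and Insert_Date is a DateTime column, and the class and method names (LatestRowPerEmployee, Find) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class LatestRowPerEmployee
{
    // Returns, for each EmpId, the row with the greatest Insert_Date.
    // Assumption: EmpId is stored as int and Insert_Date as DateTime.
    public static Dictionary<int, DataRow> Find(DataTable table)
    {
        var latest = new Dictionary<int, DataRow>();
        if (table == null) return latest;

        foreach (DataRow row in table.Rows)
        {
            // Skip rows where either value is missing (DBNull).
            if (row.IsNull("EmpId") || row.IsNull("Insert_Date")) continue;

            int empId = (int)row["EmpId"];
            DateTime inserted = (DateTime)row["Insert_Date"];

            DataRow current;
            // Keep this row if it is the first one seen for the employee
            // or if it is newer than the one stored so far.
            if (!latest.TryGetValue(empId, out current) ||
                inserted > (DateTime)current["Insert_Date"])
            {
                latest[empId] = row;
            }
        }
        return latest;
    }
}
```

Calling LatestRowPerEmployee.Find(memberSelectedTiers) would then give one DataRow per employee, so the other columns of the latest row stay available, not just the date.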

Related

How to select row from a table based on two values c#?

I know how to select a row from a table based on one column value, like this:
var rowFeatureName = db.AfrRules.SingleOrDefault(r => r.FeatureName == FeatureName);
But how can I do the same thing based on two columns?
Thanks

Try an AND operation:
var rowFeatureName = db.AfrRules.Where(r => r.FeatureName == FeatureName &&
    r.FeatureName2 == FeatureName).FirstOrDefault();
or this for an OR operation:
var rowFeatureName = db.AfrRules.Where(r => r.FeatureName == FeatureName ||
    r.FeatureName2 == FeatureName).FirstOrDefault();

C# - Linq to combine (or) join two datatables into one

I'm having a problem getting the correct data from two DataTables into one using LINQ in C#.
My DataTables' data comes from reading an Excel file (not from a DB).
I have tried the LINQ query below, but the returned row count is not what I want (my goal is to retrieve all the data; I'm checking the row count only because it is an easy way to verify the result).
In dt1, I have 2645 records.
In dt2, I have 2600 records.
The returned row count is 2600 (it looks like it is applying right-join logic).
var v1 = from d1 in dt1.AsEnumerable()
         from d2 in dt2.AsEnumerable()
             .Where(x => x.Field<string>(X_ITEM_CODE) == d1.Field<string>(X_NO)
                      || x.Field<string>(X_ITEM_KEY) == d1.Field<string>(X_NO))
         select dt1.LoadDataRow(new object[]
         {
             // I use the indexer shortcut instead of Field<string> for testing purposes.
             d1[X_NO],
             d2[X_ITEM_CODE] == null ? "" : d2[X_ITEM_CODE],
             d2[X_ITEM_KEY] == null ? "" : d2[X_ITEM_KEY],
             d2[X_COST],
             d2[X_DESC],
             d2[X_QTY] == null ? 0 : d2[X_QTY]
         }, false);
dt1 = v1.CopyToDataTable();
Console.WriteLine(dt1.Rows.Count);
I tried to use 'join', but my problem is that the X_NO value can be in either X_ITEM_CODE or X_ITEM_KEY, and I can only put one condition in ON xxx equals yyy.
I would still like to use 'join' if it can handle my condition above. Please give me some guidance. Thanks.
[Additional Info]
I already tried a foreach loop + dt1.Select(xxxx) + dt1.Rows.Add(xxx). It works well but takes around 2 minutes to complete the job.
I'm looking for a faster way, and the LINQ code I tried above seems faster than my foreach loop, so I want to give LINQ a chance.
For demo purposes, I only put a few columns in the example above; my actual table has 12 columns.
I was afraid my post would become very long if I included my foreach loop, so I skipped it when I first posted this question.
Anyway, below is the code and sample data. If you can edit and think it is too long, kindly take out any unnecessary or unrelated lines.
DataRow[] drs = null;
DataRow drO = null;
foreach (DataRow drY in dt2.Rows)
{
    drs = null;
    drs = dt1.Select(X_NO + "='" + drY[X_ITEM_KEY] + "' OR " + X_NO + "='" + drY[X_ITEM_CODE] + "'");
    if (drs.Length > 0)
    {
        // drs.Length will always be 1 here because there are no duplicates.
        drs[0][X_ITEM_CODE] = drY[X_ITEM_CODE];
        drs[0][X_ITEM_KEY] = drY[X_ITEM_KEY];
        drs[0][X_COST] = clsD.GetInt(drY[X_COST]); // If null, return 0.
        drs[0][X_DESC] = clsD.GetStr(drY[X_DESC]); // If null, return "".
        drs[0][X_QTY] = clsD.GetInt(drY[X_QTY]);
    }
    else
    {
        // Not found in ITEM CODE or KEY, add it.
        drO = dtOutput.NewRow();
        drO[X_ITEM_CODE] = drY[X_ITEM_CODE];
        drO[X_ITEM_KEY] = drY[X_ITEM_KEY];
        drO[X_COST] = clsD.GetInt(drY[X_COST]);
        drO[X_DESC] = clsD.GetStr(drY[X_DESC]);
        drO[X_QTY] = clsD.GetInt(drY[X_QTY]);
        dt1.Rows.Add(drO);
    }
}
// Note: I haven't put the else condition above into my LINQ test yet.
// Without the else condition, my dt1 will still have the same record count.
[dt1 data]
X_NO,X_ITEM_CODE,X_ITEM_KEY,COST,DESC,QTY,....
AA060210A,,,,,,....
AB060220A,,,,,....
AC060230A,,,,,....
AD060240A,,,,,....
[dt2 data]
X_ITEM_CODE,X_ITEM_KEY,COST,DESC,QTY
AA060210A,AA060211A,100.00,PART1,10000
AB060221A,AB060220A,120.00,PART2,500
AC060232A,AC060230A,150.00,PART3,100
AD060240A,AD060243A,4.50,PART4,15250
[Update 2]
I tried the 'join' below and it returns nothing. So, can I assume join will not help either?
var vTemp1 = from d1 in dt1.AsEnumerable()
             join d2 in dt2.AsEnumerable()
             on 1 equals 1
             where (d1[X_NO] == d2[X_ITEM_CODE] || d1[X_NO] == d2[X_ITEM_KEY])
             select dt1.LoadDataRow(new object[]
             {
                 d1[X_NO],
                 d2[X_ITEM_CODE] == null ? "" : d2[X_ITEM_CODE],
                 d2[X_ITEM_KEY] == null ? "" : d2[X_ITEM_KEY],
                 d2[X_COST],
                 d2[X_DESC],
                 d2[X_QTY] == null ? 0 : d2[X_QTY]
             }, false);
Console.WriteLine(vTemp1.Count()); // returns zero.

The LINQ join operator supports only equijoins, so it cannot express your OR condition. And a LINQ query that builds a Cartesian product and filters it with where will not give you any performance improvement.
What you really need (LINQ or not) is a fast lookup by the dt1[X_NO] field. Since, as you said, it is unique, you can build and use a dictionary for that:
var dr1ByXNo = dt1.AsEnumerable().ToDictionary(dr => dr.Field<string>(X_NO));
and then modify your process like this:
DataRow dr0; // the matching dt1 row, if one is found
foreach (DataRow drY in dt2.Rows)
{
    if (dr1ByXNo.TryGetValue(drY.Field<string>(X_ITEM_KEY), out dr0) ||
        dr1ByXNo.TryGetValue(drY.Field<string>(X_ITEM_CODE), out dr0))
    {
        dr0[X_ITEM_CODE] = drY[X_ITEM_CODE];
        dr0[X_ITEM_KEY] = drY[X_ITEM_KEY];
        dr0[X_COST] = clsD.GetInt(drY[X_COST]); // If null, return 0.
        dr0[X_DESC] = clsD.GetStr(drY[X_DESC]); // If null, return "".
        dr0[X_QTY] = clsD.GetInt(drY[X_QTY]);
    }
    else
    {
        // Not found in ITEM CODE or KEY, add it.
        drO = dtOutput.NewRow();
        drO[X_ITEM_CODE] = drY[X_ITEM_CODE];
        drO[X_ITEM_KEY] = drY[X_ITEM_KEY];
        drO[X_COST] = clsD.GetInt(drY[X_COST]);
        drO[X_DESC] = clsD.GetStr(drY[X_DESC]);
        drO[X_QTY] = clsD.GetInt(drY[X_QTY]);
        dt1.Rows.Add(drO);
    }
}
Since you are adding new records to dt1 during the process, depending on your requirements you might need to add the following at the end of the else block (after the dt1.Rows.Add(drO); line):
dr1ByXNo.Add(drO.Field<string>(X_NO), drO);
I didn't include it because I don't see your code setting the X_NO field of the new record, so as written the call above would throw an exception (the new row has no usable key).

How to sort a list after AddRange?

I'm new to C#, SQL and LINQ. I have two lists, one "dataTransactions" (fuel from gas stations) and a similar one "dataTransfers" (fuel from slip tanks).
They each access a different table from SQL and get combined later.
List<FuelLightTruckDataSource> data = new List<FuelLightTruckDataSource>();
using (SystemContext ctx = new SystemContext())
{
    List<FuelLightTruckDataSource> dataTransactions
        = ctx.FuelTransaction
            .Where(tx => DbFunctions.TruncateTime(tx.DateTime) >= from.Date && DbFunctions.TruncateTime(tx.DateTime) <= to.Date
                //&& tx.AssetFilled.AssignedToEmployee.Manager
                && tx.AssetFilled.AssignedToEmployee != null
                //&
                && tx.AssetFilled.AssetType.Code == "L"
                && (tx.FuelProductType.FuelProductClass.Code == "GAS" || tx.FuelProductType.FuelProductClass.Code == "DSL"))
            .GroupBy(tx => new { tx.AssetFilled, tx.DateTime, tx.FuelProductType.FuelProductClass, tx.FuelCard.FuelVendor, tx.City, tx.Volume, tx.Odometer }) // Added tx.Volume to have individual transactions
            .Select(g => new FuelLightTruckDataSource()
            {
                Asset = g.FirstOrDefault().AssetFilled,
                Employee = g.FirstOrDefault().AssetFilled.AssignedToEmployee,
                ProductClass = g.FirstOrDefault().FuelProductType.FuelProductClass,
                Vendor = g.FirstOrDefault().FuelCard.FuelVendor,
                FillSource = FuelFillSource.Transaction,
                Source = "Fuel Station",
                City = g.FirstOrDefault().City.ToUpper(),
                Volume = g.FirstOrDefault().Volume,
                Distance = g.FirstOrDefault().Odometer,
                Date = g.FirstOrDefault().DateTime
            })
            .ToList();
In the end, I use
data.AddRange(dataTransactions);
data.AddRange(dataTransfers);
to put the two lists together and generate a fuel consumption report.
Both lists are individually sorted by Date, but after AddRange the dataTransfers entries are just appended to the end, so the combined list is no longer sorted by Date. How do I sort the combined result by date again after calling AddRange?

Try this:
data = data.OrderBy(d => d.Date).ToList();
Or if you want to order descending:
data = data.OrderByDescending(d => d.Date).ToList();

You can call List<T>.Sort with a Comparison<T> delegate.
https://msdn.microsoft.com/en-us/library/w56d4y5z(v=vs.110).aspx
Example:
data.Sort(delegate(FuelLightTruckDataSource x, FuelLightTruckDataSource y)
{
    // your sort logic here: return a negative number, zero, or a positive number
    // when x sorts before, equal to, or after y.
});
Advantage: this sorts the list in place and doesn't create a new list, as OrderBy(...).ToList() does. It's a small thing, but it matters to some people, especially in performance- and memory-sensitive situations.
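
To make the comparison delegate concrete, here is a small self-contained sketch. FuelEntry is a hypothetical stand-in for FuelLightTruckDataSource (whose real definition isn't shown in the question), and I'm assuming Date is a plain DateTime.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for FuelLightTruckDataSource.
class FuelEntry
{
    public DateTime Date { get; set; }
    public string Source { get; set; }
}

static class FuelSorting
{
    // Sorts the combined list in place, oldest entry first.
    public static void SortByDate(List<FuelEntry> data)
    {
        // DateTime.CompareTo returns a negative number, zero, or a positive
        // number, which is exactly what Comparison<T> expects.
        data.Sort((x, y) => x.Date.CompareTo(y.Date));
    }
}
```

One thing to keep in mind: List<T>.Sort is documented as an unstable sort, so entries with exactly the same Date may not keep their original relative order, whereas OrderBy is stable. If that order matters for the report, OrderBy(...).ToList() may be the safer choice.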

Obtaining multiple fields with LINQ group statement

I'm using the following grouping within my LINQ statement.
I have figured out how to obtain the maximum date from the 'notes' table, but I am struggling to find an efficient way to get the 'NoteText' property of the same record (marked with ???? in both places).
group new { t1, notes } by new
{
    t1.Opportunity_Title
} into g
let latestNoteDate = g.Max(uh => uh.notes.Date)
let latestNote = g.Max(uh => uh.notes.NoteText) // ???? needs to be the latest note for the record above
select new PipelineViewModel
{
    LastNoteDate = latestNoteDate,
    LastNote = latestNote, // ????
}).Take(howMany);

Perhaps like this:
let latestNote = latestNoteDate == null ? null :
    g.First(x => x.notes.Date == latestNoteDate).notes.NoteText
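
Another way to think about it is to pick the whole latest record per group and then read any field from it, instead of computing the max date first and searching for it again. The sketch below shows that idea with LINQ to Objects only; the Note class and its property names are hypothetical, and I haven't checked how a given LINQ provider would translate the same shape to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a row in the 'notes' table joined to its opportunity.
class Note
{
    public string OpportunityTitle { get; set; }
    public DateTime Date { get; set; }
    public string NoteText { get; set; }
}

static class LatestNotes
{
    // For each opportunity title, returns the text of its most recent note.
    public static Dictionary<string, string> ByTitle(IEnumerable<Note> notes)
    {
        return notes
            .GroupBy(n => n.OpportunityTitle)
            .ToDictionary(
                g => g.Key,
                // Order each group newest first and take that record,
                // so the date and the text come from the same row.
                g => g.OrderByDescending(n => n.Date).First().NoteText);
    }
}
```

Because the record itself is selected, LastNoteDate and LastNote could both be read from it and cannot get out of sync.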

Using LINQ to SQL to perform an update/set

So, if I run this query directly or through db.ExecuteCommand(), everything works fine:
update Market..Area set EndDate = NULL where ID = 666 and NID =1 and Code = 36003
However, I can't seem to do this in LINQ to SQL. I've tried a few different methods that all seem like they should work; here is an example of one:
var s = db.Area.Single(s => s.ID == 666 && s.Code == 36003 && s.NID == 1);
s.EndDate = null;
db.SubmitChanges();
I don't know what else to try to get this working.
EDIT
I am only trying to edit ONE item.

Is there a primary key defined on the Area table?
LINQ to SQL will not update a table that has no primary key defined. (And, as far as I can remember, it will fail silently.)

Do you want to update more than one item? Even if not, you can write something like:
IQueryable<Area> iArea =
    from s in db.Area
    where s.ID == 666 && s.Code == 36003 && s.NID == 1
    select s;
iArea.ToList().ForEach(item => { item.EndDate = null; });
db.SubmitChanges();

There is no built-in method for doing batch updates. But you can pick some batch extensions from this blog.

Your syntax appears to be correct. The only other thing I can think of that could cause the failure is if you are trying to do multiple updates within the same data context. Try this:
using (DataContext db = new DataContext())
{
    var s = db.Area.Single(s => s.ID == 666 && s.Code == 36003 && s.NID == 1);
    s.EndDate = null;
    db.SubmitChanges();
}
