Check if I have already inserted a particular record - c#

I am going through a massive list of business objects and inserting them into the database using Linq-to-Sql.
Some of the business objects contain a payment method record (cheque, credit card, etc.).
When it comes to adding a payment method, I want to check that I haven't already added it, because otherwise it will complain when I come to submit my changes.
if (!context.PaymentMethods.Any(paymentMethod => paymentMethod.PaymentMethodID == iPaymentMethod.PaymentMethodID))
{
    PaymentMethod method = new PaymentMethod();
    method.PaymentMethodID = iPaymentMethod.PaymentMethodID;
    // etc...
    context.PaymentMethods.InsertOnSubmit(method);
}
This doesn't work, I presume because Any is checking the database and not the list of objects I am about to insert on submit.
I know I can maintain my own list to check whether the records have already been added, but to save a lot of hassle, I was wondering if there is a tidy LINQ way to do this. Is there any way to check context.PaymentMethods to see whether a record has already been added?

A possible solution would be to check the ChangeSet of the Context:
Func<PaymentMethod, bool> f =
    paymentMethod => paymentMethod.PaymentMethodID == iPaymentMethod.PaymentMethodID;

if (!context.PaymentMethods.Any(f) &&
    !context.GetChangeSet().Inserts.OfType<PaymentMethod>().Any(f))
{
    // Submit
}
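A variation on the same idea, just as a sketch (it needs System.Linq and System.Linq.Expressions): keeping the predicate as an Expression lets the database check run as SQL, while Compile() gives you the in-memory delegate for the pending inserts:
// Same check, but with an Expression so the database side still runs as SQL
// (passing a plain Func<T, bool> to an IQueryable forces client-side evaluation).
Expression<Func<PaymentMethod, bool>> predicate =
    paymentMethod => paymentMethod.PaymentMethodID == iPaymentMethod.PaymentMethodID;

bool alreadyThere =
    context.PaymentMethods.Any(predicate)                  // translated to SQL
    || context.GetChangeSet().Inserts
              .OfType<PaymentMethod>()
              .Any(predicate.Compile());                    // pending InsertOnSubmit entries, in memory

if (!alreadyThere)
{
    // safe to InsertOnSubmit the new PaymentMethod here
}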

Try this:
context.PaymentMethods.Where(paymentMethod => paymentMethod.PaymentMethodID == iPaymentMethod.PaymentMethodID).Count() == 0

Related

C# : how to declare a variable for an entity table?

I'm using C# Entity Framework to select records from a database table. Based on selection criteria, I will use different select statements.
if ( condition1 )
{
    var records = _context.Member.Select( .... )
}
else if (condition2)
{
    var records = _context.Member.Select( .....)
}
......
And then, I need to make some decisions depending on whether there are records and process those records.
if (records != null)
{
}
else if (....)
The compiler complains that "records" does not exist in the current context. I think this happens because records is declared inside an if block. I don't want to do the second step inside the if block, because the processing is the same and it is quite lengthy.
I've tried declaring records outside the block first, but I don't know what type to use. So how do I declare a variable to hold the records returned from Entity Framework?
Thanks.
Edit: After reading the comments and answer, I think I know where my confusion lies. If my select returns a new anonymous object, what should my type be?
var records = _context.Member
.Select(x => new {Name = x.name, Address = x.address} );
When I hover over the Select, it says:
Returns: An IQueryable<out T> that contains elements from the input sequence that satisfy the condition specified by predicate.
Types: ‘a is new { .... }
As it's an anonymous object, what should I state as the type for it?
Thanks again for the great help.
What data type is records? Find that out and let's call it T.
If you are using Visual Studio, just hover over the Select method. A tooltip will pop up with information about the method, including the return type before the method name.
Then write this code:
T records = null; // or some kind of empty collection
if ( condition1 )
{
    records = _context.Member.Select( .... )
}
else if (condition2)
{
    records = _context.Member.Select( .....)
}
The reason you ran into this problem is that 'records' is defined only within the scope of the '{}' curly braces. By moving it up as I showed, it lives in the same scope where you make the decisions:
if (records != null)
{
}
else if (....)
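For the anonymous-type case in the edit, one common workaround (just a sketch; MemberSummary is a hypothetical class, not something from the original question) is to project into a named class, so the variable can be declared before the if/else:
// A small DTO to replace the anonymous type (hypothetical name).
public class MemberSummary
{
    public string Name { get; set; }
    public string Address { get; set; }
}

IQueryable<MemberSummary> records = null;
if (condition1)
{
    records = _context.Member
        .Select(x => new MemberSummary { Name = x.name, Address = x.address });
}
else if (condition2)
{
    records = _context.Member
        .Select(x => new MemberSummary { Name = x.name, Address = x.address });
}

if (records != null)
{
    // process the records here, outside the if/else blocks
}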

How to improve the performance of Entity Framework Code?

In the project, I need to call an external API based on time, so for one day I may need to call the API 24 times, one call per hour. The API returns an XML file with 6 fields. I need to insert this data into a table. On average, each hour has about 20,000 rows of data.
The table has these 6 columns:
col1, col2, col3, col4, col5, col6
When all 6 columns are the same, we consider the rows to be the same, and we should not insert duplicates.
I'm using C# and Entity Framework for this:
foreach (XmlNode node in nodes)
{
    try
    {
        count++;
        CallData data = new CallData();
        ...
        // get all data and set in 'data'

        // check whether in database already
        var q = ctx.CallDatas.Where(x => x.col1 == data.col1
                                      && x.col2 == data.col2
                                      && x.col3 == data.col3
                                      && x.col4 == data.col4
                                      && x.col5 == data.col5
                                      && x.col6 == data.col6
                                    ).Any();
        if (q)
        {
            // exists in database, skip
            // log info
        }
        else
        {
            string key = $"{data.col1}|{data.col2}|{data.col3}|{data.col4}|{data.col5}|{data.col6}";
            // check whether in current chunk already
            if (dic.ContainsKey(key))
            {
                // in current chunk, skip
                // log info
            }
            else
            {
                // insert
                ctx.CallDatas.Add(data);
                // update dic
                dic.Add(key, true);
            }
        }
    }
    catch (Exception ex)
    {
        // log error
    }
}
Logger.InfoFormat("Saving changes ...");
if (ctx.ChangeTracker.HasChanges())
{
    await ctx.SaveChangesAsync();
}
Logger.InfoFormat("Saving changes ... Done.");
The code works fine. However, we will need to run this code over data from the past several months. The issue is that the code runs slowly, since for each row it needs to check whether the row already exists.
Are there any suggestions to improve the performance?
Thanks
You don't show the code for when the context is created or its life-cycle. I'm inclined to point you to your indexes on the table. If these aren't primary keys then you might see the performance issue there. If you are doing full table scans, it will get progressively slower. With that said, there are two separate ways to handle this:
The EF native way: you can explicitly create a new connection on each iteration (avoiding change tracking for all entries, which reduces the progressive slowdown). Also, your save is async but your Any statement is sync; using async for that as well might help take some pressure off the current thread if it's waiting.
// Start your context scope closer to the data call: if the loop is long
// running you could be building up tracked changes in the cache, and this
// prevents that situation.
using (YourEntity ctx = new YourEntity())
{
    CallData data = new CallData();
    if (await ctx.CallDatas.Where(x => x.col1 == data.col1
                                    && x.col2 == data.col2
                                    && x.col3 == data.col3
                                    && x.col4 == data.col4
                                    && x.col5 == data.col5
                                    && x.col6 == data.col6
                                  ).AnyAsync())
    {
        // exists in database, skip
        // log info
    }
    else
    {
        string key = $"{data.col1}|{data.col2}|{data.col3}|{data.col4}|{data.col5}|{data.col6}";
        // check whether in current chunk already
        if (dic.ContainsKey(key))
        {
            // in current chunk, skip
            // log info
        }
        else
        {
            // insert
            ctx.CallDatas.Add(data);
            await ctx.SaveChangesAsync();
            // update dic
            dic.Add(key, true);
        }
    }
}
Optional way: look into inserting the data using a bulk operation via a stored procedure. 20k rows is trivial, and you can still use Entity Framework for that as well. See https://stackoverflow.com/a/9837927/1558178
I have created my own version of this (customized for my specific needs) and have found that it works well and gives more control over bulk inserts.
I have used this approach to insert 100k records at a time. My duplicate-checking logic lives in the stored procedure, which gives me better control and reduces the over-the-wire cost to a single write with no reads. This should take just a second or two to execute, assuming your stored procedure is optimized.
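A rough sketch of that approach (the staging table dbo.CallDataStaging, the procedure dbo.MergeCallData, and the DataTable are assumptions for illustration, not part of the linked answer): bulk-load the hour's rows with SqlBulkCopy, then let the procedure insert only the rows that do not already exist:
// Needs System.Data and System.Data.SqlClient.
// 'table' is a DataTable with columns col1..col6 built from the parsed XML.
using (var conn = new SqlConnection(connectionString))
{
    await conn.OpenAsync();

    using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.CallDataStaging" })
    {
        await bulk.WriteToServerAsync(table);
    }

    // dbo.MergeCallData would do the duplicate check server-side, e.g. an
    // INSERT ... SELECT ... WHERE NOT EXISTS against CallDatas, then clear the staging table.
    using (var cmd = new SqlCommand("dbo.MergeCallData", conn) { CommandType = CommandType.StoredProcedure })
    {
        await cmd.ExecuteNonQueryAsync();
    }
}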
Different approach:
Save all rows, duplicates included - that should be very efficient.
When you read data from the table, use DISTINCT over all fields.
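In EF, that read could look roughly like this (a sketch, assuming the same ctx.CallDatas set and column names from the question); Distinct() over a projection translates to SELECT DISTINCT:
// De-duplicate at read time instead of at insert time.
var distinctRows = ctx.CallDatas
    .Select(x => new { x.col1, x.col2, x.col3, x.col4, x.col5, x.col6 })
    .Distinct()
    .ToList();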
For raw, bulk operations like this I would consider avoiding EF entities and context tracking, and merely executing SQL through the context:
var sql = $"IF NOT EXISTS(SELECT 1 FROM CallDates WHERE Col1={data.Col1} AND Col2={data.Col2} AND Col3={data.Col3} AND Col4={data.Col4} AND Col5={data.Col5} AND Col6={data.Col6}) INSERT INTO CallDates(Col1,Col2,Col3,Col4,Col5,Col6) VALUES ({data.Col1},{data.Col2},{data.Col3},{data.Col4},{data.Col5},{data.Col6})";
context.Database.ExecuteSqlCommand(sql);
This does away with the extra checks and logging; it is effectively raw SQL with duplicate detection.
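One caveat with the interpolated string above: if any of the columns are strings or dates, the generated SQL will be missing quotes, and it is open to injection. A parameterized variant of the same statement (a sketch, assuming EF6's Database.ExecuteSqlCommand and System.Data.SqlClient) would be:
var sql = @"IF NOT EXISTS(SELECT 1 FROM CallDates
                          WHERE Col1=@c1 AND Col2=@c2 AND Col3=@c3
                            AND Col4=@c4 AND Col5=@c5 AND Col6=@c6)
            INSERT INTO CallDates(Col1,Col2,Col3,Col4,Col5,Col6)
            VALUES (@c1,@c2,@c3,@c4,@c5,@c6)";

context.Database.ExecuteSqlCommand(sql,
    new SqlParameter("@c1", data.Col1),
    new SqlParameter("@c2", data.Col2),
    new SqlParameter("@c3", data.Col3),
    new SqlParameter("@c4", data.Col4),
    new SqlParameter("@c5", data.Col5),
    new SqlParameter("@c6", data.Col6));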

Performance of Related tables in calculated properties

Looking to see if there is a better way to do this.
I am using DB-first and have a table called Items. Below is a calculated property defined on a partial class that extends it, and it uses related tables to derive the result. This technically works fine. I like the ease of using it, the fact that all this business logic is defined once in the domain, and that you can use complex code to derive the results.
The only issue I am concerned with is performance when pulling back multiple records. Using SQL Profiler, I can see that if you pull back 50 rows of Item, it will execute an additional query to retrieve the Work Order Details, in this case 50 times! I am not sure why it is not doing a join instead of 50 additional reads. And I have more than one calculated property like this going out to multiple tables, each doing an explicit read per row, which is slow.
The result of pulling back 50 rows from the Item table is 2,735 reads from the database, as indicated by SQL Profiler! I am not that familiar with SQL Profiler, so maybe I am misinterpreting something, but I know it is doing a lot of DB reads.
Why doesn't it do a join instead of an explicit read to the related tables for each row in Items?
What is "Best Practice" to accomplish this? Is there a better way?
[Display(Name = "Qty Allocated")]
public decimal QtyAllocated
{
    get
    {
        if (this.TrackInventory)
        {
            var inProcessNonRemnantWorkOrderDetails = this.WorkOrderDetails.Where(wod =>
                new[]
                {
                    (int)WorkOrderStatus.Created,
                    (int)WorkOrderStatus.Released,
                    (int)WorkOrderStatus.InProcess
                }.Contains(wod.WorkOrderHeader.StatusId)
                && wod.EstimatedQuantity >= 1 // Don't count remnants as allocated
            );

            var inProcessRemnantWorkOrderDetails = this.WorkOrderDetails.Where(wod =>
                new[]
                {
                    (int)WorkOrderStatus.Created,
                    (int)WorkOrderStatus.Released,
                    (int)WorkOrderStatus.InProcess
                }.Contains(wod.WorkOrderHeader.StatusId)
                && wod.EstimatedQuantity > 0 && wod.EstimatedQuantity < 1 // gets just remnants
            );

            decimal qtyAllocated =
                this.WorkOrderDetails == null
                    ? 0
                    : inProcessNonRemnantWorkOrderDetails.Sum(a => (a.EstimatedQuantity - a.ActualQuantity));

            if (qtyAllocated == 0 && inProcessRemnantWorkOrderDetails.Any())
            {
                qtyAllocated = 0.1M;
            }

            return qtyAllocated;
        }
        else
        {
            return 0;
        }
    }
}
Aron was correct. When I eager load the related entities by using the Include() method in my query, there is only 1 hit to the database.
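For reference, the eager-loading change could look roughly like this (a sketch; context.Items and the navigation property names are assumed from the description above):
// using System.Data.Entity;  (needed for the lambda-based Include overload)
var items = context.Items
    .Include(i => i.WorkOrderDetails.Select(wod => wod.WorkOrderHeader))
    .Take(50)
    .ToList(); // one query with joins; QtyAllocated then evaluates against loaded data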

Why can't I update data into database using LINQ to SQL?

I am trying to update data while I am reading it from the database, as shown below. But after the whole thing finishes, the data didn't get updated. (My table has a primary key.)
static LinqMPISMPPCalenderDataContext DBCalender;
DBCalender = new LinqMPISMPPCalenderDataContext(connectionString);
var ExceptionPeriod = DBCalender.Table_ExceptionPeriods
    .Where(table => table.StartDate <= Date && table.FinishDate >= Date && table.CalenderID == CalenderID)
    .Single();
Table_ExceptionPeriod TblException = null;
TblException = ExceptionPeriod;
TblException.StartDate = ExceptionPeriod.StartDate.AddDays(1);
DBCalender.SubmitChanges();
Once you've got your object from the DB via your .Single() call you should be able to just set properties on it and call SubmitChanges(). There's no need for the TblException stuff. So ...
static LinqMPISMPPCalenderDataContext DBCalender;
DBCalender = new LinqMPISMPPCalenderDataContext(connectionString);
var ExceptionPeriod = DBCalender.Table_ExceptionPeriods
    .Where(table => table.StartDate <= Date && table.FinishDate >= Date && table.CalenderID == CalenderID)
    .Single();
ExceptionPeriod.StartDate = ExceptionPeriod.StartDate.AddDays(1);
DBCalender.SubmitChanges();
There doesn't seem to be anything logically wrong with the code; as Antony says, you could reduce the number of lines.
I'd step through the code line by line and, just before the submit line, check whether the StartDate has actually changed.
The only things I can imagine going wrong are some kind of transaction rollback in the database, or you're not looking at the record you think you are.

Linq update query - Is there no pretty way?

I want to update my database using a LINQ2SQL query.
However this seems for some reason to be a very ugly task compared to the otherwise lovely LINQ code.
The query needs to update two tables.
tbl_subscription
(
id int,
sub_name nvarchar(100),
sub_desc nvarchar(500),
and so on.
)
tbl_subscription2tags
(
sub_id (FK to tbl_subscription)
tag_id (FK to a table called tbl_subscription_tags)
)
Now, down in my update function, I send a tbl_subscription entity with the tags and everything.
I can't find a pretty way to update my database.
I can only find ugly examples where I suddenly have to map all the attributes.
There must be a smart way to perform this. Please help.
A C# example if possible.
I have tried this with no effect:
public void UpdateSubscription(tbl_subscription subscription)
{
    db.tbl_subscriptions.Attach(subscription);
    db.Refresh(System.Data.Linq.RefreshMode.OverwriteCurrentValues, subscription);
    db.SubmitChanges(System.Data.Linq.ConflictMode.FailOnFirstConflict);
}
Source for this code is here:
http://skyeyefive.spaces.live.com/blog/cns!6B6EB6E6694659F2!516.entry
Why not just make the changes to the objects and perform a SubmitChanges on the DataContext?
using (MyDataContext dc = new MyDataContext("ConnectionString"))
{
    foreach (var foo in dc.foo2)
    {
        foo.prop1 = 1;
    }
    dc.SubmitChanges();
}
Otherwise you need to tell us more about the lifecycle of the object you want to manipulate
edit: forgot to wrap in brackets for using
Unless I'm misunderstanding your situation, I think that citronas is right.
The best and easiest way that I've found to update database items through LINQ to SQL is the following:
Obtain the item you want to change from the data context
Change whatever values you want to update
Call the SubmitChanges() method of the data context.
Sample Code
The sample code below assumes that I have a data context named DBDataContext that connects to a database with a Products table that has ID and Price columns. Also, a productID variable contains the ID of the record you want to update.
using (var db = new DBDataContext())
{
    // Step 1 - get the item from the data context
    var product = db.Products.Where(p => p.ID == productID).SingleOrDefault();
    if (product == null) // Error checking
    {
        throw new ArgumentException();
    }

    // Step 2 - change whatever values you want to update
    product.Price = 100;

    // Step 3 - submit the changes
    db.SubmitChanges();
}
I found out that you can use "Attach", as seen in my question, to update a table, but apparently not the child tables. So I just used a few Attach calls and it worked without having to map each attribute by hand!
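What that can look like in LINQ to SQL, as a sketch of the pattern described (the table and navigation property names here are guesses based on the schema above): attach the parent and its child rows, refresh so the context has original values to diff against, then submit:
public void UpdateSubscription(tbl_subscription subscription)
{
    db.tbl_subscriptions.Attach(subscription);
    db.tbl_subscription2tags.AttachAll(subscription.tbl_subscription2tags);

    // KeepCurrentValues loads the database values as the "original" state while
    // keeping the values on the detached entity, so SubmitChanges sees the diff.
    db.Refresh(System.Data.Linq.RefreshMode.KeepCurrentValues, subscription);

    db.SubmitChanges();
}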
