Checking duplication using LINQ in a collection - C#

I have a function which inserts a record into the database, and I want to make sure there are no duplicate entries. The function first checks whether a query string parameter is present: if it is, it acts in edit mode, otherwise in insert mode. There is a service method that returns the records currently in the database. I need to check for duplicates based on two columns before inserting.
myService mfService = new myService();
myFlow mf = new myFlow();
if (!string.IsNullOrEmpty(Request["myflowid"]))
{
    mf = mfService.Getmyflow(Convert.ToInt32(Request["myflowid"]));
}
int workcount = 0;
int.TryParse(txtWorkCount.Text, out workcount);
mf.Name = txtName.Text.Trim();
mf.Description = txtDescription.Text.Trim();
mf.FunctionCode = txtFunctioneCode.Text.Trim();
mf.FunctionType = txtFunctioneType.Text.Trim();
mf.WorkCount = workcount;
if (mf.WorkFlowId == 0)
{
    mf.SortOrder = 0;
    mf.Active = true;
    mf.RecordDateTime = DateTime.Now;
    message = "Saved Successfully";
}
else
{
    _editMode = true;
    message = "Update Successfully";
}
int myflowId = mfService.AddEditmyflow(mf);
I want to check for duplicates based on FunctionType and FunctionCode. Another method, mfService.Getmyflows(), returns the records currently in the database.
How can I check for duplicates using LINQ?

First of all, what database do you use? Many databases support upsert behavior (update or insert, depending on whether the data was found). For example, MERGE in MS SQL Server, MERGE in Oracle, INSERT ... ON DUPLICATE KEY UPDATE in MySQL, and so on. This could be the preferred solution, since an upsert is usually an atomic operation.
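To illustrate, here is a minimal sketch of an upsert from C# against SQL Server; the MyFlows table and its columns are hypothetical placeholders, not from your code:

// Minimal sketch: atomic upsert via T-SQL MERGE (assumes SQL Server and a
// hypothetical MyFlows table keyed by FunctionType + FunctionCode).
using System.Data.SqlClient;

public static void UpsertMyFlow(string connectionString, string functionType, string functionCode, string name)
{
    const string sql = @"
MERGE MyFlows AS target
USING (SELECT @FunctionType AS FunctionType, @FunctionCode AS FunctionCode) AS source
    ON target.FunctionType = source.FunctionType
   AND target.FunctionCode = source.FunctionCode
WHEN MATCHED THEN
    UPDATE SET Name = @Name
WHEN NOT MATCHED THEN
    INSERT (FunctionType, FunctionCode, Name)
    VALUES (@FunctionType, @FunctionCode, @Name);";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@FunctionType", functionType);
        cmd.Parameters.AddWithValue("@FunctionCode", functionCode);
        cmd.Parameters.AddWithValue("@Name", name);
        conn.Open();
        cmd.ExecuteNonQuery(); // insert-or-update in one atomic statement
    }
}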
In your particular case, do you use transactions? Are you sure no one will insert data after you have checked for duplicates but before you insert your record? Example:
#1 thread               #2 thread
look for duplicates
...                     look for duplicates
no duplicates found     ...
...                     no duplicates found
insert data_1
                        insert data_1
This will end up with exactly the duplicates you are trying to avoid.
According to your code, you are populating the data from the GUI and adding only one item.
If you have access to the myService code, you could add a method that queries a single item by your two columns, instead of fetching all items via mfService.Getmyflows() and searching through that collection in your code. That would be more performant (especially if you have indexes on those columns) and more memory efficient.
And finally, checking whether an element already exists in a collection is easy:
var alreadyExist = mfService.Getmyflows()
.Any(x => x.Column1 == value1 && x.Column2 == value2);
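Adapted to the fields in your question, the check might look like this sketch (in edit mode you likely want to exclude the record being edited, assuming WorkFlowId is its key):

// Sketch: duplicate check on the two columns before insert.
bool isDuplicate = mfService.Getmyflows()
    .Any(x => x.FunctionType == mf.FunctionType
           && x.FunctionCode == mf.FunctionCode
           && x.WorkFlowId != mf.WorkFlowId); // skip self in edit mode

if (!isDuplicate)
{
    int myflowId = mfService.AddEditmyflow(mf);
}
else
{
    message = "A record with the same FunctionType and FunctionCode already exists.";
}

Note that the race condition described above still applies; a unique constraint on the two columns is the only airtight guard.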

AMO get partitions where data are processed but not indexes

I am writing a script that returns all unprocessed partitions within a measure group using the following command:
objMeasureGroup.Partitions.Cast<Partition>().Where(x => x.State != AnalysisState.Processed)
After doing some experiments, it looks like this property only indicates whether the data is processed and says nothing about the indexes.
After searching for hours, I didn't find any method to list the partitions where the data is processed but the indexes are not.
Any suggestions?
Environment:
SQL Server 2014
SSAS multidimensional cube
The script is written within an SSIS package / Script Task
First, ProcessIndexes is an incremental operation, so if you run it twice the second time will be pretty quick because there is nothing to do. I would therefore recommend just running it on the cube and not worrying about whether it was previously run; for completeness, that could look like the sketch below. However, if you do need to analyze the current state, read on.
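A minimal sketch of the "just run it" option using AMO (server and object names are placeholders matching the code further down):

// Sketch: run ProcessIndexes unconditionally; since it is incremental,
// this is cheap when there is nothing left to build.
Microsoft.AnalysisServices.Server s = new Microsoft.AnalysisServices.Server();
s.Connect("Data Source=localhost");
Microsoft.AnalysisServices.Cube c = s.Databases.GetByName("YourDatabaseName").Cubes.GetByName("YourCubeName");
c.Process(Microsoft.AnalysisServices.ProcessType.ProcessIndexes);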
The best way (only way I know of) to distinguish whether ProcessIndexes has been run on a partition is to study the DISCOVER_PARTITION_STAT and DISCOVER_PARTITION_DIMENSION_STAT DMVs as seen below.
The DISCOVER_PARTITION_STAT DMV returns one row per aggregation with the rowcount. The first row of that DMV has a blank aggregation name and represents the rowcount of the lowest level data processed in that partition.
The DISCOVER_PARTITION_DIMENSION_STAT DMV can tell you whether indexes are processed, and which range of values for each dimension attribute is in this partition (by internal IDs, so not super easy to interpret). We assume at least one dimension attribute is set to be optimized so it will be indexed.
You will also need to add a reference to Microsoft.AnalysisServices.AdomdClient to simplify running these DMVs:
string sDatabaseName = "YourDatabaseName";
string sCubeName = "YourCubeName";
string sMeasureGroupName = "YourMeasureGroupName";
Microsoft.AnalysisServices.Server s = new Microsoft.AnalysisServices.Server();
s.Connect("Data Source=localhost");
Microsoft.AnalysisServices.Database db = s.Databases.GetByName(sDatabaseName);
Microsoft.AnalysisServices.Cube c = db.Cubes.GetByName(sCubeName);
Microsoft.AnalysisServices.MeasureGroup mg = c.MeasureGroups.GetByName(sMeasureGroupName);
Microsoft.AnalysisServices.AdomdClient.AdomdConnection conn = new Microsoft.AnalysisServices.AdomdClient.AdomdConnection(s.ConnectionString);
conn.Open();
foreach (Microsoft.AnalysisServices.Partition p in mg.Partitions)
{
    Console.Write(p.Name + " - " + p.State + " - ");
    var restrictions = new Microsoft.AnalysisServices.AdomdClient.AdomdRestrictionCollection();
    restrictions.Add("DATABASE_NAME", db.Name);
    restrictions.Add("CUBE_NAME", c.Name);
    restrictions.Add("MEASURE_GROUP_NAME", mg.Name);
    restrictions.Add("PARTITION_NAME", p.Name);
    var dsAggs = conn.GetSchemaDataSet("DISCOVER_PARTITION_STAT", restrictions);
    var dsIndexes = conn.GetSchemaDataSet("DISCOVER_PARTITION_DIMENSION_STAT", restrictions);
    if (dsAggs.Tables[0].Rows.Count == 0)
        Console.WriteLine("ProcessData not run yet");
    else if (dsAggs.Tables[0].Rows.Count > 1)
        Console.WriteLine("aggs processed");
    else if (p.AggregationDesign == null || p.AggregationDesign.Aggregations.Count == 0)
    {
        bool bIndexesBuilt = false;
        foreach (System.Data.DataRow row in dsIndexes.Tables[0].Rows)
        {
            if (Convert.ToBoolean(row["ATTRIBUTE_INDEXED"]))
            {
                bIndexesBuilt = true;
                break;
            }
        }
        if (bIndexesBuilt)
            Console.WriteLine("indexes have been processed. no aggs defined");
        else
            Console.WriteLine("no aggs defined. need to run ProcessIndexes on this partition to build indexes");
    }
    else
        Console.WriteLine("need to run ProcessIndexes on this partition to process aggs and indexes");
}
I am posting this answer as additional information to @GregGalloway's excellent answer.
After searching for a while, the only way I found to tell whether a partition's aggregations and indexes are processed is using DISCOVER_PARTITION_STAT and DISCOVER_PARTITION_DIMENSION_STAT.
I found an article posted by Darren Gosbell describing the whole process:
SSAS: Are my Aggregations processed?
In the article above the author provided two methods:
using XMLA
One way you can find this out is with an XMLA Discover call to the DISCOVER_PARTITION_STAT rowset, but that returns the results in a big lump of XML which is not as easy to read as a tabular result set.
example
<Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
  <RequestType>DISCOVER_PARTITION_STAT</RequestType>
  <Restrictions>
    <RestrictionList>
      <DATABASE_NAME>Adventure Works DW</DATABASE_NAME>
      <CUBE_NAME>Adventure Works</CUBE_NAME>
      <MEASURE_GROUP_NAME>Internet Sales</MEASURE_GROUP_NAME>
      <PARTITION_NAME>Internet_Sales_2003</PARTITION_NAME>
    </RestrictionList>
  </Restrictions>
  <Properties>
    <PropertyList>
    </PropertyList>
  </Properties>
</Discover>
using DMV queries
If you have SSAS 2008, you can use the new DMV feature to query this same rowset and return a tabular result.
example
SELECT *
FROM SystemRestrictSchema($system.discover_partition_stat
,DATABASE_NAME = 'Adventure Works DW 2008'
,CUBE_NAME = 'Adventure Works'
,MEASURE_GROUP_NAME = 'Internet Sales'
,PARTITION_NAME = 'Internet_Sales_2003')
Similar posts:
How to find out using AMO if aggregation exists on partition?
Detect aggregation processing state with AMO?

How to Modify a List of Records in Entity Framework

Using Entity Framework, I need to retrieve a list of entities, manipulate that list based on some conditions, and then save the final list to the context.
Like this:
class Sample
{
    public int id;
    public int value;
}
var sampleList = db.samples.ToList();
// Add some records to sampleList
sampleList.Add(new Sample() { value = 10 });
// Change the value of some records in sampleList
sampleList[0].value = 5;
db.SaveChanges();
Records added to the list are not tracked and inserted into the DB, but the changed values are updated.
Strange behavior of EF! Any explanation?
Thanks!
Hmm, based on your script, what should have been done is something like this:
//var sampleList = db.samples.ToList();
// Add some records
Sample sampInsertObject = new Sample() { value = 10 };
db.samples.Add(sampInsertObject);
db.SaveChanges(); // execute save so that the context will execute "INSERT"
/*
EF also executes SCOPE_IDENTITY() after the insert to retrieve the
primary key value from the database into sampInsertObject.
*/
// Change the value of some records in sampleList
var sampUpdateList = db.samples.ToList();
if (sampUpdateList.Count != 0)
{
    // Get a specific object from the sample list
    Sample sampUpdateObject = sampUpdateList[0];
    sampUpdateObject.value = 5;
    db.SaveChanges(); // execute save so that the context will execute "UPDATE"
}
Let me first answer the easier question of why your new records were not saved. You cannot just add objects to an in-memory list and call SaveChanges; you need to either Add or Update them through the context:
db.Add(sample); // If sample does not exist
db.Update(sample); //If sample already exists and you are updating it
db.SaveChanges();
As for modifying the list and saving that back to the context: I believe you will have to iterate over the list itself and Add, Delete, or Update each Sample object in it.
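The underlying reason is that entities returned by db.samples.ToList() are still tracked by the context's change tracker, so edits to them are picked up by SaveChanges, while objects added only to the detached List<Sample> are invisible to the context. A minimal sketch of the distinction, assuming a samples set as in the question:

var sampleList = db.samples.ToList();

// Tracked: this entity came from the context, so the change
// tracker sees the modification and SaveChanges issues an UPDATE.
sampleList[0].value = 5;

// NOT tracked: this only mutates the in-memory List<Sample>.
// The context never hears about newSample.
var newSample = new Sample() { value = 10 };
sampleList.Add(newSample);

// Tracked: registering the entity with the context makes
// SaveChanges issue an INSERT for it.
db.samples.Add(newSample);

db.SaveChanges();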

Linq to Sql General Help - Insert Statement

I am currently trying to create a new order (shown below) in a web service, and then send that data to insert a new row into the database. For some reason my DBML / DataContext does not allow me to use InsertOnSubmit.
Any ideas? I haven't used LINQ to SQL in about 7 months.
Thanks in advance.
[WebMethod]
public string InsertOrderToDatabases()
{
    // Start data contexts ------
    DataContext db = new DataContext(System.Configuration.ConfigurationManager.AppSettings["RainbowCMSConnectionString"]);
    DataContext dcSqlOES = new DataContext(System.Configuration.ConfigurationManager.AppSettings["OESConnectionString"]);
    // Get table from local database
    Table<Schedule> Schedule = db.GetTable<Schedule>();
    // Find last order number in databases
    var lastOrderNumber = from lOrder in Schedule
                          orderby lOrder.templ_idn descending
                          select lOrder.templ_idn;
    int firstOrderID;
    var firstOrder = lastOrderNumber.FirstOrDefault();
    firstOrderID = firstOrder.Value + 1;
    qrOrder qrOrd = new qrOrder
    {
        .... data in here creating a new order
    };
    // TODO: fix below with an insert on submit
    if (qrOrd != null)
    {
        // **Schedule.InsertOnSubmit(qrOrd);**
    }
    //db.GetTable<Schedule>().InsertOnSubmit(qrOrd);
    try
    {
        // Submit the changes to the database
        db.SubmitChanges();
        return "Orders were sent to the databases.";
    }
    catch (Exception)
    {
        return "Submitting the orders failed.";
    }
}
Based on your response, it appears that you are using the wrong table, or perhaps the wrong data type. Notice that when you declare your Schedule variable, you declare it as type Table<Schedule>, which means it should contain Schedule entities, not qrOrder entities.
Table<TEntity>.InsertOnSubmit expects the table's strongly typed entity to be passed in. In your case it is expecting Web_Service.Schedule, but you are trying to pass in a qrOrder.
Schedule.InsertOnSubmit(qrOrd);
That line will not submit the changes through the connected entity table. Try this:
db.Schedule.InsertOnSubmit(qrOrd);
db.SubmitChanges();
You can also try:
db.GetTable(typeof(Schedule)).InsertOnSubmit(qrOrd);
Or
db.GetTable(qrOrd.GetType()).InsertOnSubmit(qrOrd);
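Note that all of these variants still pass a qrOrder to a Schedule table, which is the type mismatch called out above. A sketch of a consistent insert, assuming the DBML also maps qrOrder as an entity (hypothetical here):

// Sketch: InsertOnSubmit must be called on the table whose entity
// type matches the object being inserted.
Table<qrOrder> orders = db.GetTable<qrOrder>();

qrOrder qrOrd = new qrOrder
{
    // ... populate the new order here
};

orders.InsertOnSubmit(qrOrd); // queue the INSERT
db.SubmitChanges();           // execute it against the database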

How to retrieve keywords within thousands of records from SQL Server 2008 fast?

I am using the query functionality of an entity collection in C#, and it takes a long time to load the related records back from SQL Server 2008. Is there any faster way to do this? This is the query function I use:
public void SearchProducts()
{
    // Filter by search string array (searchArray)
    List<string> prodId = new List<string>();
    foreach (string src in searchArray)
    {
        StoreProductCollection prod = new StoreProductCollection();
        prod.Query.Where(prod.Query.StptName.ToLower() == src.ToLower() && prod.Query.StptDeleted.IsNull());
        prod.Query.Select(prod.Query.StptName, prod.Query.StptPrice, prod.Query.StptImage, prod.Query.StptStoreProductID);
        // prod.Query.es.Top = 4;
        prod.Query.Load();
        if (prod.Count > 0)
        {
            foreach (StoreProduct stpt in prod)
            {
                if (!prodId.Contains(stpt.StptStoreProductID.ToString().Trim()))
                {
                    prodId.Add(stpt.StptStoreProductID.ToString().Trim());
                    productObjectsList.Add(stpt);
                }
            }
        }
    }
}
You're hitting the database once per searchArray item, which is very inefficient.
You might get better performance like this (I have no way of testing it, so give it a shot):
public void SearchProducts()
{
    // Filter by search string array (searchArray)
    List<string> prodId = new List<string>();
    StoreProductCollection prod = new StoreProductCollection();
    // Notice that your foreach() is gone.
    // Replace this:
    //   prod.Query.Where(prod.Query.StptName.ToLower() == src.ToLower() && prod.Query.StptDeleted.IsNull());
    // with this (or something similar; the point is, you should call .Load() exactly once):
    prod.Query.Where(prod.Query.StptDeleted.IsNull() && searchArray.Any(srcArrayString => prod.Query.StptName.ToLower() == srcArrayString.ToLower()));
    prod.Query.Select(prod.Query.StptName, prod.Query.StptPrice, prod.Query.StptImage, prod.Query.StptStoreProductID);
    // prod.Query.es.Top = 4;
    prod.Query.Load();
    // ... rest of your code follows.
}
Given a List<string> searchArray containing lowercased words:
public void SearchProducts()
{
    // Filter by search string array (searchArray)
    List<string> prodId = new List<string>();
    StoreProductCollection prod = new StoreProductCollection();
    prod.Query.Where(searchArray.Contains(prod.Query.StptName.ToLower()) && prod.Query.StptDeleted.IsNull());
    prod.Query.Select(prod.Query.StptName, prod.Query.StptPrice, prod.Query.StptImage, prod.Query.StptStoreProductID);
    // prod.Query.es.Top = 4;
    prod.Query.Load();
    if (prod.Count > 0)
    {
        foreach (StoreProduct stpt in prod)
        {
            if (!prodId.Contains(stpt.StptStoreProductID.ToString().Trim()))
            {
                prodId.Add(stpt.StptStoreProductID.ToString().Trim());
                productObjectsList.Add(stpt);
            }
        }
    }
}
This way you have only one query for all words.
First of all, put an index on the StptName column.
Second, if you need even better performance, write a stored procedure in SQL to do your querying, and map it with Entity Framework.
Let me know if you need an explanation of how to do any of the above.
A couple more micro-optimizations you can do if you don't want to write a stored procedure:
Store src.ToLower() in a temporary variable, and then compare prod.Query.StptName.ToLower() to it, as in the sketch below.
By default, SQL Server queries are case insensitive, so check whether that's the case for your database; if so, you can drop the ToLower calls altogether. You can change case sensitivity through collation.
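A sketch of that first micro-optimization, applied to the original per-word loop (EntitySpaces-style query API as in the question):

foreach (string src in searchArray)
{
    // Lower the search term once, instead of calling src.ToLower()
    // inside the query expression on every comparison.
    string srcLower = src.ToLower();

    StoreProductCollection prod = new StoreProductCollection();
    prod.Query.Where(prod.Query.StptName.ToLower() == srcLower
                     && prod.Query.StptDeleted.IsNull());
    prod.Query.Load();
    // ... collect results as before.
}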
EDIT:
To create an index:
Open the table designer in SQL Server Management Studio.
Right click anywhere and select Indexes/Keys.
Click Add.
Under Columns, add StptName.
Under Is Unique, specify whether StptName is unique or not.
Under Type, select "Index".
That's all!
As for mapping stored procedures - here's a nice tutorial:
http://www.robbagby.com/entity-framework/entity-framework-modeling-select-stored-procedures/
(You can jump straight to the "Map in the Select Stored Procedure" Section).

bulk insert and update with ADO.NET Entity Framework

I am writing a small application that does a lot of feed processing. I want to use LINQ EF for this, as speed is not an issue: it is a single-user app and, in the end, will only be used once a month.
My question revolves around the best way to do bulk inserts using LINQ EF.
After parsing the incoming data stream I end up with a List of values. Since the end user may try to import some duplicate data, I would like to "clean" the data during the insert rather than reading all the records, doing a for loop, rejecting records, and then finally importing the remainder.
This is what I am currently doing:
DateTime minDate = dataTransferObject.Min(c => c.DoorOpen);
DateTime maxDate = dataTransferObject.Max(c => c.DoorOpen);
using (LabUseEntities myEntities = new LabUseEntities())
{
    var recCheck = myEntities.ImportDoorAccess.Where(a => a.DoorOpen >= minDate && a.DoorOpen <= maxDate).ToList();
    if (recCheck.Count > 0)
    {
        foreach (ImportDoorAccess ida in recCheck)
        {
            // FirstOrDefault, since First() would throw when no match exists
            DoorAudit da = dataTransferObject.Where(a => a.DoorOpen == ida.DoorOpen && a.CardNumber == ida.CardNumber).FirstOrDefault();
            if (da != null)
                da.DoInsert = false;
        }
    }
    ImportDoorAccess newIDA;
    foreach (DoorAudit newDoorAudit in dataTransferObject)
    {
        if (newDoorAudit.DoInsert)
        {
            newIDA = new ImportDoorAccess
            {
                CardNumber = newDoorAudit.CardNumber,
                Door = newDoorAudit.Door,
                DoorOpen = newDoorAudit.DoorOpen,
                Imported = newDoorAudit.Imported,
                RawData = newDoorAudit.RawData,
                UserName = newDoorAudit.UserName
            };
            myEntities.AddToImportDoorAccess(newIDA);
        }
    }
    myEntities.SaveChanges();
}
I am also getting this error:
System.Data.UpdateException was unhandled
Message="Unable to update the EntitySet 'ImportDoorAccess' because it has a DefiningQuery and no element exists in the element to support the current operation."
Source="System.Data.SqlServerCe.Entity"
What am I doing wrong?
Any pointers are welcome.
You can do multiple inserts this way.
I've seen the exception you're getting in cases where the model (EDMX) is not set up correctly. Either you don't have a primary key (an EntityKey in EF terms) on that table, or the designer has tried to guess what the EntityKey should be. In the latter case, you'll see two or more properties in the EDM designer with keys next to them.
Make sure the ImportDoorAccess table has a single primary key and refresh the model.
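As a side note on the duplicate-cleaning step (unrelated to the exception): the per-record Where(...).FirstOrDefault() scan can be replaced by a set lookup. A sketch, assuming DoorOpen plus CardNumber uniquely identify a record as in your matching logic:

// Sketch: build a lookup of existing (DoorOpen, CardNumber) pairs once,
// then flag duplicates with O(1) set membership tests.
var existingKeys = new HashSet<string>(
    recCheck.Select(ida => ida.DoorOpen.ToString("o") + "|" + ida.CardNumber));

foreach (DoorAudit da in dataTransferObject)
{
    string key = da.DoorOpen.ToString("o") + "|" + da.CardNumber;
    da.DoInsert = !existingKeys.Contains(key);
}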
