AMO: get partitions where data is processed but not indexes - C#

I am writing a script that returns all unprocessed partitions within a measure group using the following command:
objMeasureGroup.Partitions.Cast<Partition>().Where(x => x.State != AnalysisState.Processed)
After doing some experiments, it looks like this property only indicates whether the data is processed; it says nothing about the indexes.
After searching for hours, I didn't find any method to list the partitions where data is processed but indexes are not.
Any suggestions?
Environment:
SQL Server 2014
SSAS multidimensional cube
The script is written within an SSIS package / Script task

First, ProcessIndexes is an incremental operation. So if you run it twice the second time will be pretty quick because there is nothing to do. So I would recommend just running it on the cube and not worrying about whether it was previously run. However if you do need to analyze the current state then read on.
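For reference, a minimal AMO sketch of that "just run it" approach (assuming c is the Cube object obtained as in the code further down; this line is my illustration, not part of the original answer):
// ProcessIndexes is incremental: partitions whose indexes and aggregations
// are already built are skipped, so re-running it is cheap.
c.Process(Microsoft.AnalysisServices.ProcessType.ProcessIndexes);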
The best way (only way I know of) to distinguish whether ProcessIndexes has been run on a partition is to study the DISCOVER_PARTITION_STAT and DISCOVER_PARTITION_DIMENSION_STAT DMVs as seen below.
The DISCOVER_PARTITION_STAT DMV returns one row per aggregation with the rowcount. The first row of that DMV has a blank aggregation name and represents the rowcount of the lowest level data processed in that partition.
The DISCOVER_PARTITION_DIMENSION_STAT DMV can tell you whether indexes are processed and which range of values for each dimension attribute is in this partition (by internal IDs, so not super easy to interpret). We assume at least one dimension attribute is set to be optimized so it will be indexed.
To simplify running these DMVs, you will also need to add a reference to Microsoft.AnalysisServices.AdomdClient:
string sDatabaseName = "YourDatabaseName";
string sCubeName = "YourCubeName";
string sMeasureGroupName = "YourMeasureGroupName";

Microsoft.AnalysisServices.Server s = new Microsoft.AnalysisServices.Server();
s.Connect("Data Source=localhost");
Microsoft.AnalysisServices.Database db = s.Databases.GetByName(sDatabaseName);
Microsoft.AnalysisServices.Cube c = db.Cubes.GetByName(sCubeName);
Microsoft.AnalysisServices.MeasureGroup mg = c.MeasureGroups.GetByName(sMeasureGroupName);

Microsoft.AnalysisServices.AdomdClient.AdomdConnection conn = new Microsoft.AnalysisServices.AdomdClient.AdomdConnection(s.ConnectionString);
conn.Open();

foreach (Microsoft.AnalysisServices.Partition p in mg.Partitions)
{
    Console.Write(p.Name + " - " + p.State + " - ");

    var restrictions = new Microsoft.AnalysisServices.AdomdClient.AdomdRestrictionCollection();
    restrictions.Add("DATABASE_NAME", db.Name);
    restrictions.Add("CUBE_NAME", c.Name);
    restrictions.Add("MEASURE_GROUP_NAME", mg.Name);
    restrictions.Add("PARTITION_NAME", p.Name);

    var dsAggs = conn.GetSchemaDataSet("DISCOVER_PARTITION_STAT", restrictions);
    var dsIndexes = conn.GetSchemaDataSet("DISCOVER_PARTITION_DIMENSION_STAT", restrictions);

    if (dsAggs.Tables[0].Rows.Count == 0)
        Console.WriteLine("ProcessData not run yet");
    else if (dsAggs.Tables[0].Rows.Count > 1)
        Console.WriteLine("aggs processed");
    else if (p.AggregationDesign == null || p.AggregationDesign.Aggregations.Count == 0)
    {
        bool bIndexesBuilt = false;
        foreach (System.Data.DataRow row in dsIndexes.Tables[0].Rows)
        {
            if (Convert.ToBoolean(row["ATTRIBUTE_INDEXED"]))
            {
                bIndexesBuilt = true;
                break;
            }
        }
        if (bIndexesBuilt)
            Console.WriteLine("indexes have been processed. no aggs defined");
        else
            Console.WriteLine("no aggs defined. need to run ProcessIndexes on this partition to build indexes");
    }
    else
        Console.WriteLine("need to run ProcessIndexes on this partition to process aggs and indexes");
}

I am posting this answer as additional information to @GregGalloway's excellent answer.
After searching for a while, the only way I found to know whether a partition's indexes are processed is using DISCOVER_PARTITION_STAT and DISCOVER_PARTITION_DIMENSION_STAT.
I found an article posted by Darren Gosbell describing the whole process:
SSAS: Are my Aggregations processed?
In the article above the author provided two methods:
using XMLA
One way to find out is an XMLA discover call to the DISCOVER_PARTITION_STAT rowset, but that returns the results in a big lump of XML which is not as easy to read as a tabular result set.
example
<Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
  <RequestType>DISCOVER_PARTITION_STAT</RequestType>
  <Restrictions>
    <RestrictionList>
      <DATABASE_NAME>Adventure Works DW</DATABASE_NAME>
      <CUBE_NAME>Adventure Works</CUBE_NAME>
      <MEASURE_GROUP_NAME>Internet Sales</MEASURE_GROUP_NAME>
      <PARTITION_NAME>Internet_Sales_2003</PARTITION_NAME>
    </RestrictionList>
  </Restrictions>
  <Properties>
    <PropertyList>
    </PropertyList>
  </Properties>
</Discover>
using DMV queries
If you have SSAS 2008, you can use the new DMV feature to query this same rowset and return a tabular result.
example
SELECT *
FROM SystemRestrictSchema($system.discover_partition_stat
,DATABASE_NAME = 'Adventure Works DW 2008'
,CUBE_NAME = 'Adventure Works'
,MEASURE_GROUP_NAME = 'Internet Sales'
,PARTITION_NAME = 'Internet_Sales_2003')
Similar posts:
How to find out using AMO if aggregation exists on partition?
Detect aggregation processing state with AMO?

Related

Checking duplication using LINQ in collection

I have a function which inserts a record in the database. I want to make sure that there are no duplicate entries in the database. The function first checks if there is a query string parameter. If there is, then it acts in edit mode; otherwise insert mode. There is a function which can return the currently added records in the database. I need to check for duplication based on two columns before insertion into the database.
myService = new myService();
myFlow mf = new myFlow();
if (!string.IsNullOrEmpty(Request["myflowid"]))
{
    mf = myService.Getmyflow(Convert.ToInt32(Request["myflowid"]));
}
int workcount = 0;
int.TryParse(txtWorkCount.Text, out workcount);
mf.Name = txtName.Text.Trim();
mf.Description = txtDescription.Text.Trim();
mf.FunctionCode = txtFunctioneCode.Text.Trim();
mf.FunctionType = txtFunctioneType.Text.Trim();
mf.WorkCount = workcount;
if (mf.WorkFlowId == 0)
{
    mf.SortOrder = 0;
    mf.Active = true;
    mf.RecordDateTime = DateTime.Now;
    message = "Saved Successfully";
}
else
{
    _editMode = true;
    message = "Update Successfully";
}
int myflowId = mfService.AddEditmyflow(mf);
I want to check for duplication based on FunctionType and FunctionCode. Another function, mfService.Getmyflows(), can return the currently added records in the database.
How can I check for duplication using LINQ?
First of all, what database do you use? Many databases support upsert behavior (update or insert depending on whether the data was found or not), for example MERGE in MS SQL, MERGE in Oracle, INSERT ... ON DUPLICATE KEY UPDATE in MySQL, and so on. This could be the preferred solution: an upsert is usually an atomic operation.
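For illustration only, the whole check-and-insert can be pushed into a single server-side MERGE statement (a sketch assuming an ObjectContext-era EF context with ExecuteStoreCommand; the MyFlows table and column names are made-up placeholders):
// Hypothetical upsert: the duplicate check and the insert happen in one
// statement on the server (add WITH (HOLDLOCK) on the target for full race safety).
context.ExecuteStoreCommand(
    @"MERGE MyFlows AS target
      USING (SELECT {0} AS FunctionCode, {1} AS FunctionType) AS source
      ON target.FunctionCode = source.FunctionCode
         AND target.FunctionType = source.FunctionType
      WHEN NOT MATCHED THEN
          INSERT (FunctionCode, FunctionType)
          VALUES (source.FunctionCode, source.FunctionType);",
    mf.FunctionCode, mf.FunctionType);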
In your particular case, do you use transactions? Are you sure no one will insert data after you have checked for duplicates but before you have inserted your record? Example:
#1 thread              #2 thread
look for duplicates
...                    look for duplicates
no duplicates found    ...
                       no duplicates found
insert data_1
                       insert data_1
This will end up with exactly the duplicates you are trying to avoid.
According to your code, you are populating data from a GUI and adding only one item.
If you have access to the myService code, you could add a method that queries an item by your two columns, instead of querying all items via mfService.Getmyflows() and searching through that collection inside your code. It would be more performant (especially if you have indexes on those columns) and more memory efficient.
And finally, checking whether a single matching element exists inside a collection is easily done:
var alreadyExist = mfService.Getmyflows()
.Any(x => x.Column1 == value1 && x.Column2 == value2);
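If you go the service-method route, a minimal sketch might look like this (the context and entity names are illustrative assumptions, not from the original code):
// Hypothetical method inside myService: checks the two columns in the
// database instead of materializing every row into memory first.
public bool MyflowExists(string functionCode, string functionType)
{
    return context.MyFlows.Any(x => x.FunctionCode == functionCode
                                 && x.FunctionType == functionType);
}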

Trying to get a COMPLETE list of AWS Route53 records. How to get more than the first 100?

I am trying to list the Route53 records that we have, but I am failing to get more than the first 100 results! How can I list ALL of them? How can I also filter the results to list only those with a specific RecordType?
This is the code I tried running, but I fail to get a complete list…:
string recordList = "";
int ii = 0;
ListResourceRecordSetsResponse result = r53client.ListResourceRecordSets(request);
System.Windows.Forms.MessageBox.Show(result.ListResourceRecordSetsResult.ResourceRecordSets.Count + " records!");
while (result.ListResourceRecordSetsResult.ResourceRecordSets.Count > 0)
{
    foreach (var recordSet in result.ListResourceRecordSetsResult.ResourceRecordSets)
    {
        if (recordSet.Type == "CNAME")
        {
            foreach (var resourceRecord in recordSet.ResourceRecords)
            {
                recordList += resourceRecord.Value + "\n";
                ii++;
                // set first record to get next, as the last one we already got!
                request.StartRecordName = resourceRecord.Value;
            }
        }
    }
    result = r53client.ListResourceRecordSets(request);
}
From Github: Can you get more than 100 ListResourceRecordSets:
If the response says IsTruncated is true, then you need to pass the values of NextRecordType, NextRecordName, and NextRecordIdentifier back to a new call.
From AWS Command-Line Interface (CLI) documentation for list-resource-record-sets:
To retrieve all the records in a HostedZone, first pause any processes making calls to ChangeResourceRecordSets. Initially call list-resource-record-sets without a Name and Type to get the first page of record sets. For subsequent calls, set Name and Type to the NextName and NextType values returned by the previous response.
The documentation for Class ListResourceRecordSetsRequest gives the same guidance.
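Putting those pieces together, here is a hedged sketch of a complete listing loop (written against the v3-style AWS SDK for .NET, where the response exposes ResourceRecordSets and the Next* markers directly; verify the property names against the SDK version you actually use):
using System.Collections.Generic;
using System.Linq;
using Amazon.Route53;
using Amazon.Route53.Model;

var client = new AmazonRoute53Client();
var request = new ListResourceRecordSetsRequest { HostedZoneId = "YOUR_ZONE_ID" };
var cnames = new List<ResourceRecordSet>();

ListResourceRecordSetsResponse response;
do
{
    response = client.ListResourceRecordSets(request);

    // The API cannot filter by record type alone, so filter each page client-side.
    cnames.AddRange(response.ResourceRecordSets.Where(r => r.Type == RRType.CNAME));

    // Resume exactly where this page ended, per the documentation quoted above.
    request.StartRecordName = response.NextRecordName;
    request.StartRecordType = response.NextRecordType;
    request.StartRecordIdentifier = response.NextRecordIdentifier;
} while (response.IsTruncated);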

Entity Framework and whole word matching - paging and filtering results

Background:
I am using ASP.NET MVC4, SQL Server 2008 R2, and Entity Framework 5 for a website.
The site accepts a delimited list of keywords to search database content on. It also needs to page the results to the user (currently 100 results per page).
This was going along smoothly until it was requested that the keyword searching be done not with partial matching, but with whole word matching.
The problem
Performing the whole word match AFTER I already have the results back means that I might not have query.Pagesize results to show - which messes up the UI paging. Of the 100 partial matches from SQL Server on the first page, 20 may end up being removed by the whole word processing.
I am currently building my query using LINQ and doing an AND search on the keywords like so:
// Start with all the MyItems
var results = UnitOfWork.MyItemRepository.GetAll();
// Loop the keywords to AND them together
foreach (var keyword in query.Keywords)
{
    var keywordCopy = keyword;
    // Look for a hit on the keyword in the MyItem
    results = results.Where(x => x.Title.Contains(keywordCopy));
}
And later on getting the total number of results, paging, and executing the query:
var totalCount = results.Count();
// Page the results
results = results.Skip((query.Page - 1) * query.Pagesize).Take(query.Pagesize);
...
// Finalize the query and execute it
var list = results.ToList();
Because I need whole word matching and not partial matching, I process the keywords into a regex and remove non-matches from list.
var keywordsRegexPattern = "^" + string.Concat(query.Keywords.Select(keyword => string.Format(@"(?=.*\b{0}\b)", Regex.Escape(keyword))));
foreach (var item in list.ToList())
{
    var searchableData = some combined string of item data
    // See if the keywords are whole word matched in the combined content
    var isMatch = Regex.IsMatch(searchableData, keywordsRegexPattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    // If not a match, remove the item from the results
    if (!isMatch)
    {
        list.Remove(item);
    }
}
// Change list into custom list of paged items for the UI
var pagedResult = new PagedList<MyItem>(list, query.Page, query.Pagesize, totalCount);
return pagedResult;
Question
Does anyone know of a way to do whole word matching with EF and do result paging?
Ideas I've come up with but don't like:
Chunk the results. 100 results back, 20 partial keyword matches removed, go get another 20, repeat. This could result in doing multiple queries when getting all the data at once would have been faster. It also means it would be stealing potential results from the next page which would have to be tracked with some sort of offset.
Get ALL the rows back (no SQL paging), then process and page in C#. This seems bad to get all the results back every time.
Well, I see two alternatives (I may be missing something easier, but anyway).
Either you use string.Contains(keyword), retrieve all the corresponding data from the db, then filter with exact matching and do the paging on the enumerated result (so you hopefully get "not too many results" from the db).
The other way:
foreach (var keyword in query.Keywords)
{
    // add space at start and end of keyword for Contains
    var containsKeyword = string.Format(" {0} ", keyword);
    // add space at end only for StartsWith
    var startsWithKeyword = string.Format("{0} ", keyword);
    // add space at start only for EndsWith
    var endsWithKeyword = string.Format(" {0}", keyword);
    // Look for a hit on the keyword in the MyItem
    results = results.Where(x => x.Title.Contains(containsKeyword) || x.Title.StartsWith(startsWithKeyword) || x.Title.EndsWith(endsWithKeyword));
}
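One caveat (my observation, not part of the original answer): a Title consisting of exactly one keyword has no surrounding spaces, so none of the three checks fires. An equality test inside the same loop closes that gap:
// Capture the loop variable before using it in the lambda,
// and also match a title that is the keyword alone.
var exactKeyword = keyword;
results = results.Where(x => x.Title.Contains(containsKeyword)
    || x.Title.StartsWith(startsWithKeyword)
    || x.Title.EndsWith(endsWithKeyword)
    || x.Title == exactKeyword);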

Newbie performance issue with foreach ...need advice

This section simply reads from an Excel spreadsheet. This part works fine with no performance issues.
IEnumerable<ImportViewModel> so = data.Select(row => new ImportViewModel
{
    PersonId = row.Field<string>("person_id"),
    ValidationResult = ""
}).ToList();
Before I pass the model to a View I want to set ValidationResult, so I have this piece of code. If I comment it out, the model is passed to the view quickly. When I use the foreach it takes over a minute. If I hardcode a value for item.PersonId then it runs quickly. I know I'm doing something wrong, I'm just not sure where to start and what the best practice is that I should be following.
foreach (var item in so)
{
    if (db.Entity.Any(w => w.ID == item.PersonId))
    {
        item.ValidationResult = "Successful";
    }
    else
    {
        item.ValidationResult = "Error: ";
    }
}
return View(so.ToList());
You are now performing a database call per item in your list. This is really hard on your database and thus on your performance. Instead, iterate through your Excel result, gather all the users, and select them in one query. Make a list from this query result (otherwise the query is performed every time you access the list). Then perform a match between the result list and your Excel data.
You need to do something like this:
var ids = so.Select(i => i.PersonId).Distinct().ToList();
// Hitting the database just this once to get all matching user ids
var usersIds = db.Entity.Where(u => ids.Contains(u.ID)).Select(u => u.ID).ToList();
foreach (var item in so)
{
    if (usersIds.Contains(item.PersonId))
    {
        item.ValidationResult = "Successful";
    }
    else
    {
        item.ValidationResult = "Error: ";
    }
}
return View(so.ToList());
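A small refinement on top of that (my suggestion, not from the original answer): if the id list is large, loading the ids into a HashSet makes each lookup O(1) instead of scanning a List:
// HashSet.Contains is O(1); List.Contains walks the whole list per item.
var usersIds = new HashSet<string>(
    db.Entity.Where(u => ids.Contains(u.ID)).Select(u => u.ID));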

bulk insert and update with ADO.NET Entity Framework

I am writing a small application that does a lot of feed processing. I want to use LINQ to EF for this as speed is not an issue; it is a single user app and, in the end, will only be used once a month.
My question revolves around the best way to do bulk inserts using LINQ to EF.
After parsing the incoming data stream I end up with a List of values. Since the end user may end up trying to import some duplicate data, I would like to "clean" the data during insert rather than reading all the records, doing a for loop, rejecting records, and then finally importing the remainder.
This is what I am currently doing:
DateTime minDate = dataTransferObject.Min(c => c.DoorOpen);
DateTime maxDate = dataTransferObject.Max(c => c.DoorOpen);
using (LabUseEntities myEntities = new LabUseEntities())
{
    var recCheck = myEntities.ImportDoorAccess.Where(a => a.DoorOpen >= minDate && a.DoorOpen <= maxDate).ToList();
    if (recCheck.Count > 0)
    {
        foreach (ImportDoorAccess ida in recCheck)
        {
            DoorAudit da = dataTransferObject.Where(a => a.DoorOpen == ida.DoorOpen && a.CardNumber == ida.CardNumber).FirstOrDefault();
            if (da != null)
                da.DoInsert = false;
        }
    }
    ImportDoorAccess newIDA;
    foreach (DoorAudit newDoorAudit in dataTransferObject)
    {
        if (newDoorAudit.DoInsert)
        {
            newIDA = new ImportDoorAccess
            {
                CardNumber = newDoorAudit.CardNumber,
                Door = newDoorAudit.Door,
                DoorOpen = newDoorAudit.DoorOpen,
                Imported = newDoorAudit.Imported,
                RawData = newDoorAudit.RawData,
                UserName = newDoorAudit.UserName
            };
            myEntities.AddToImportDoorAccess(newIDA);
        }
    }
    myEntities.SaveChanges();
}
I am also getting this error:
System.Data.UpdateException was unhandled
Message="Unable to update the EntitySet 'ImportDoorAccess' because it has a DefiningQuery and no <InsertFunction> element exists in the <ModificationFunctionMapping> element to support the current operation."
Source="System.Data.SqlServerCe.Entity"
What am I doing wrong?
Any pointers are welcome.
You can do multiple inserts this way.
I've seen the exception you're getting in cases where the model (EDMX) is not set up correctly. You either don't have a primary key (EntityKey in EF terms) on that table, or the designer has tried to guess what the EntityKey should be. In the latter case, you'll see two or more properties in the EDM Designer with keys next to them.
Make sure the ImportDoorAccess table has a single primary key and refresh the model.
