Page results from Azure Application Insights Analytics API - c#

Is it possible to "page" the results from the Analytics API?
If I use the following Query (via http POST)
{
"query":"customEvents | project customDimensions.FilePath, timestamp
| where timestamp > now(-100d) | order by timestamp desc | limit 25"
}
I get up to 10,000 results back in one result set. Is there any way to use something similar to the $skip option that the events API has, like "SKIP 75 TAKE 25", to get the 4th page of results?

[edit: this answer is now out of date; a row_number function has been added to the query language. This answer is left for historical purposes in case anyone runs into strange queries that look like it]
Not easily.
If you can use the /events OData query path instead of the /query path, that supports paging, but it doesn't really support custom queries like yours.
To get something like paging, you need to make a complicated query: use summarize and makelist to invent a rowNum field in your query, then use mvexpand to re-expand the lists, and then filter by rowNum. It's pretty complicated and unintuitive, something like:
customEvents
| project customDimensions.FilePath, timestamp
| where timestamp > now(-100d)
| order by timestamp desc
// squishes things down to 1 row where each column is huge list of values
| summarize filePath=makelist(customDimensions.FilePath, 1000000)
, timestamp=makelist(timestamp, 1000000)
// make up a row number, not sure this part is correct
, rowNum = range(1,count(strcat(filePath,timestamp)),1)
// expands the single rows into real rows
| mvexpand filePath,timestamp,rowNum limit 1000000
| where rowNum > 0 and rowNum <= 100 // you'd change these values to page
I believe there's already a request on the App Insights UserVoice to support paging operators in the query language.
The other assumption here is that the data in the underlying table isn't changing while you're doing this work. If new data appears between your calls, like:
1. give me rows 0-99
2. 50 new rows appear
3. give me rows 100-199
then step 3 is actually giving you back 50 duplicate rows that you just got in step 1.

There's a more correct way to do this now, using new operators that were added to the query language since my previous answer.
The two operators are serialize and row_number().
serialize ensures the data is in a shape and order that works with row_number(); some existing operators, like order by, already produce serialized data.
There are also prev() and next() functions that can get values from the previous or next rows in a serialized result.
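With those operators, the original query can be paged directly. A sketch (the page math assumes 25 rows per page; order by already produces a serialized result, so an explicit serialize isn't needed here):

```kusto
customEvents
| project filePath = customDimensions.FilePath, timestamp
| where timestamp > now(-100d)
| order by timestamp desc
| extend rowNum = row_number()
// 4th page of 25 rows: rows 76-100
| where rowNum > 75 and rowNum <= 100
```

The same caveat as before applies: if rows are added between calls, page boundaries shift.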

Related

Is it possible to compare two fields in the same IEnumerable using Linq?

In researching this question, I found numerous answers for comparing two different lists. That is not my scenario. I have an IEnumerable of a class with several fields, and I need to filter where one field is greater than another field in the same list.
I can envision many uses for that type of comparison, but I am keeping things very simple in this example.
To give you a better context, here's a simple table made in T-SQL.
T-SQL code:
create table #GetMia1ASummaryBar (
Id int not null identity(1,1),
ShrCreditRate Float null,
NonShrCreditRate Float null
);
insert into #GetMia1ASummaryBar(ShrCreditRate,NonShrCreditRate)
values (null,1.5),(2.5,0.75),(2,2),(1,null);
-- to see the entire table
select * from #GetMia1ASummaryBar;
-- to filter where the first field is greater than the second
select * from #GetMia1ASummaryBar t where t.ShrCreditRate>t.NonShrCreditRate;
drop table #GetMia1ASummaryBar;
Using Linq, I would like to be able to do what I can do very easily in T-SQL: select * from #GetMia1ASummaryBar t where t.ShrCreditRate>t.NonShrCreditRate;
Along those lines, I tried this.
// select where first field is greater than second field
var list = repo.GetMia1ASummaryBar(campus)
.Where(l => l.ShrCreditRate > l.NonShrCreditRate);
While I received no compile errors, I received no records where I should have received at least one.
So instead of this,
Id ShrCreditRate NonShrCreditRate
----------- ---------------------- ----------------------
1 NULL 1.5
2 2.5 0.75
3 2 2
4 1 NULL
I'd like to filter to receive this.
Id ShrCreditRate NonShrCreditRate
----------- ---------------------- ----------------------
2 2.5 0.75
I'm really trying to avoid creating a separate list populated by a for-each loop, which would be a last resort. Is there a simple way to do the type of Linq comparison I am trying to make?
Thanks to everyone who contributed in the comments. The short story is that this syntax is indeed valid.
// select where first field is greater than second field
var list = repo.GetMia1ASummaryBar(campus)
.Where(l => l.ShrCreditRate > l.NonShrCreditRate);
The list was empty because of an underlying dependency that had a filter on it. I uncovered this unexpected behavior in an integration test, which once more shows the value of an integration test. (My unit test didn't uncover the unexpected behavior.)
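For what it's worth, the lifted comparison on nullable values behaves in memory just like the T-SQL above: any comparison involving null is false. A minimal self-contained sketch (the SummaryBar type here is a stand-in for the repo's actual model):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class SummaryBar
{
    public int Id { get; set; }
    public double? ShrCreditRate { get; set; }
    public double? NonShrCreditRate { get; set; }
}

class Demo
{
    static void Main()
    {
        var rows = new List<SummaryBar>
        {
            new SummaryBar { Id = 1, ShrCreditRate = null, NonShrCreditRate = 1.5 },
            new SummaryBar { Id = 2, ShrCreditRate = 2.5,  NonShrCreditRate = 0.75 },
            new SummaryBar { Id = 3, ShrCreditRate = 2,    NonShrCreditRate = 2 },
            new SummaryBar { Id = 4, ShrCreditRate = 1,    NonShrCreditRate = null },
        };

        // Lifted > on double?: false whenever either side is null,
        // so only Id = 2 passes, matching the T-SQL result.
        var filtered = rows.Where(r => r.ShrCreditRate > r.NonShrCreditRate).ToList();

        foreach (var r in filtered)
            Console.WriteLine(r.Id);
    }
}
```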

Azure Search Index - Find the list of items

We are using a Search index to run one of our APIs. The data in the index is populated by Azure Functions that pull data from the database. We can see that the number of records in the database and in the Search service differ. Is there any way to get the list of keys in the Search service so that we can compare it with the database and see which keys are missing?
Regards,
John
The Azure Search query API is designed for search/filter scenarios, it doesn't offer an efficient way to traverse through all documents.
That said, you can do this reasonably by scanning the keys in order: if you have a field in your index (the key field or another one) that's both filterable and sortable, you can use $select to pull only the keys for each document, 1000 at a time, ordered by that field. After you retrieve the first 1000, don't do $skip (which will limit you to 100,000), instead use a filter that uses greater-than against the field, using the highest value you saw in the previous response. This will allow you to traverse the whole set at reasonable performance, although doing it 1000 at a time will take time.
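That key-scan loop can be sketched as follows. searchKeysAfter here is a hypothetical wrapper around a single search call that returns up to 1000 keys strictly greater than the given value, ordered ascending; the real call would set $select, $orderby, and the greater-than $filter accordingly:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical one-page call: up to 1000 keys where key > afterKey
// (or the first 1000 keys when afterKey is null), ordered ascending.
Func<string, IReadOnlyList<string>> searchKeysAfter = /* wraps the search API */ null;

var allKeys = new List<string>();
string lastKey = null;
while (true)
{
    IReadOnlyList<string> page = searchKeysAfter(lastKey);
    if (page.Count == 0) break;       // no more documents
    allKeys.AddRange(page);
    lastKey = page[page.Count - 1];   // highest key seen so far
}
// allKeys can now be diffed against the database's keys.
```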
You can try searching with "*", then use $orderby and $filter to get all the data, as in the following example.
I use metadata_storage_last_modified as the filter field.
offset   --> skip      action
0        --> 0
100,000  --> 100,000   getLastTime
101,000  --> 0         useLastTime
200,000  --> 99,000    useLastTime
201,000  --> 100,000   useLastTime & getLastTime
202,000  --> 0         useLastTime
Because the $skip limit is 100k, we can calculate skip by
AzureSearchSkipLimit = 100k
AzureSearchTopLimit = 1k
skip = offset % (AzureSearchSkipLimit + AzureSearchTopLimit)
If the total search count is larger than AzureSearchSkipLimit, then apply
orderby = "metadata_storage_last_modified desc"
When skip reaches AzureSearchSkipLimit, get the metadata_storage_last_modified time from the end of the data, and use it as the filter for the next 100k of the search:
filter = metadata_storage_last_modified lt ${metadata_storage_last_modified}
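The arithmetic above can be sketched like this (variable names are made up; only the modulo logic is the point):

```csharp
const int AzureSearchSkipLimit = 100_000;   // $skip cap per filter window
const int AzureSearchTopLimit = 1_000;      // $top cap per request

// One filter value covers a 101k window: 100k skippable plus the final 1k page.
int skip = offset % (AzureSearchSkipLimit + AzureSearchTopLimit);

// When skip hits the cap, take metadata_storage_last_modified from the end of
// that page and start the next window with:
//   $orderby = metadata_storage_last_modified desc
//   $filter  = metadata_storage_last_modified lt {lastModified}
bool needNewFilterValue = skip == AzureSearchSkipLimit;
```

Checking against the table: offset 100,000 gives skip 100,000 (record the time), 101,000 gives 0 (use it), 200,000 gives 99,000, and 202,000 wraps back to 0.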

How OrderBy happens in linq when all the records have the same value?

When OrderBy is applied to a datetime column where many rows share the same value, I get different results on different hits from LINQ to SQL.
Let us say some 15 records have the same datetime in one of their fields, and pagination is applied to those 15 records with a per-page limit of 10. Say 10 records came back on the first run for page 1. Then for page 2 I do not get the remaining 5 records, but instead get 5 of the records that were already on page 1.
Question:
How do the OrderBy, Skip, and Take functions work, and why this discrepancy in the result?
LINQ does not play a role in how the ordering is applied to the underlying data source. LINQ itself is simply an enumerating extension. As per your comment to your question, you are asking how MSSQL applies ordering in a query.
In MSSQL (and most other RDBMS), the ordering on identical values is dependent on the underlying implementation and configuration of the RDBMS. The ordered result for such values can be perceived as random, and can change between identical queries. This does not mean you will see a difference, but you cannot rely on the data to be returned in a specific order.
This has been asked and answered before on SO, here.
This is also described in the community addon comments in this MSDN article.
No ordering is applied beyond that specified in the ORDER BY clause. If all rows have the same value, they can be returned in whatever order is fastest. That's especially evident when a query is executed in parallel.
This means that you can't use paging on results ordered by non-unique values. Each time you make a call the order can change.
In such cases you need to add tie-breaker columns that ensure unique ordering values, e.g. the ID of a product: ORDER BY Date, ProductID
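In LINQ terms, that tie-breaker is a ThenBy on a unique column (db and the entity/property names here are hypothetical):

```csharp
// Deterministic paging: the unique ProductId breaks ties between equal Dates,
// so Skip/Take always carve the same boundaries.
var page = db.Products
    .OrderBy(p => p.Date)
    .ThenBy(p => p.ProductId)
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .ToList();
```

Without the ThenBy, rows sharing a Date may be returned in a different order on each call, which is exactly the duplicated-page symptom described above.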

SQL Linq .Take() latest 20 rows from HUGE database, performance-wise

I'm using Entity Framework 6 and I make LINQ queries from an ASP.NET server to an Azure SQL database.
I need to retrieve the latest 20 rows that satisfy a certain condition.
Here's a rough example of my query
using (PostHubDbContext postHubDbContext = new PostHubDbContext())
{
    DbGeography location = DbGeography.FromText(string.Format("POINT({1} {0})", latitude, longitude));

    IQueryable<Post> postQueryable =
        from postDbEntry in postHubDbContext.PostDbEntries
        orderby postDbEntry.Id descending
        where postDbEntry.OriginDbGeography.Distance(location) < (DistanceConstant)
        select new Post(postDbEntry);

    postQueryable = postQueryable.Take(20);

    IOrderedQueryable<Post> postOrderedQueryable = postQueryable.OrderBy(Post => Post.DatePosted);
    return postOrderedQueryable.ToList();
}
The question is, what if I literally have a billion rows in my database. Will that query brutally select millions of rows which meet the condition then get 20 of them ? Or will it be smart and realise that I only want 20 rows hence it will only select 20 rows ?
Basically how do I make this query work efficiently with a database that has a billion rows ?
According to http://msdn.microsoft.com/en-us/library/bb882641.aspx, the Take() function has deferred streaming execution, as does the select statement. This means it should be equivalent to TOP 20 in SQL, and SQL Server will fetch only 20 rows from the database.
This link: http://msdn.microsoft.com/en-us/library/bb399342(v=vs.110).aspx shows that Take has a direct translation in LINQ-to-SQL.
So the only performance gains to be made are in the database. As @usr suggested, you can use indexes to increase performance. Storing the table in sorted order also helps a lot (which is likely your case, as you sort by Id).
Why not try it? :) You can inspect the SQL and see what it generates, then look at the execution plan for that SQL and see if it scans the entire table.
Check out this question for more details
How do I view the SQL generated by the Entity Framework?
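In EF6 there are two quick ways to do that, assuming a context and query shaped like the ones above:

```csharp
// Log every command EF sends to the database:
postHubDbContext.Database.Log = Console.Write;

// Or get the SQL for a specific IQueryable directly:
string sql = postQueryable.ToString();
Console.WriteLine(sql);
```

Run the resulting SQL in SSMS with "Include Actual Execution Plan" enabled to see whether it does a TOP 20 over an index seek or a full scan.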
This will be hard to get really fast. You want an index to give you the sort order on Id but you want a different (spatial) index to provide you with efficient filtering. It is not possible to create an index that fulfills both goals efficiently.
Assume both indexes exist:
If the filter is very selective expect SQL Server to "select" all rows where this filter is true, then sorting them, then giving you the top 20. Imagine there are only 21 rows that pass the filter - then this strategy is clearly very efficient.
If the filter is not at all selective SQL Server will rather traverse the table ordered by Id, test each row it comes by and outputs the first 20. Imagine that the filter applies to all rows - then SQL Server can just output the first 20 rows it sees. Very fast.
So for 100% or 0% selectivity the query will be fast. In between there are nasty mixtures. If your selectivity falls in between, this question requires further thought; you probably need more than a clever indexing strategy. You need app changes.
Btw, we don't need an index on DatePosted. The sorting by DatePosted is only done after limiting the set to 20 rows. We don't need an index to sort 20 rows.

Recursive Linq query - Person > Manager > Dept

Say I have a dataset such as this:
PersonId | ManagerId | DepartmentId
========================================
    1    |   null    |      1
    2    |    1      |      1
    3    |    1      |      2
    4    |    2      |      1
and so on.
I am looking for a Linq query which, given a ManagerId and a set of DepartmentIds, will give me all relevant PersonIds. The query should return all PersonIds under a manager, all the way down the tree, not just those immediately under that manager.
Here's what I've tried so far: http://pastebin.com/zF9dq6wj
Thanks!
Chris.
Using Linq, there's no automatic way to do this (that I've ever heard of) without multiple trips to the database. As such, it's really no different from any other recursive call structure, and you can choose between recursive method calls, a while loop with a System.Collections.Queue (or Stack) of ids, etc. If your backend database is SQL Server 2008 or higher, you can make use of its recursive query capabilities, but you'll have to call a sproc to do it, as Linq won't be able to make the translation itself.
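The Queue-based variant can be sketched like this, walking one management level per pass (people is a hypothetical source exposing PersonId, ManagerId, and DepartmentId, as in the table above):

```csharp
var result = new List<int>();
var queue = new Queue<int>();
queue.Enqueue(managerId);

while (queue.Count > 0)
{
    int current = queue.Dequeue();
    foreach (var person in people.Where(x => x.ManagerId == current))
    {
        // Only collect people in the requested departments...
        if (departmentIds.Contains(person.DepartmentId))
            result.Add(person.PersonId);

        // ...but keep walking through everyone, so reports of people in
        // other departments aren't lost.
        queue.Enqueue(person.PersonId);
    }
}
```

Each Dequeue/Where pass is one trip to the database when people is an IQueryable, which is the "multiple trips" cost mentioned above.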
You can't do recursive queries in Linq2SQL or Linq2Entities. I would suggest writing a view with a CTE and adding that to your DataContext file.
