Elasticsearch PUT documents in strange order - c#

How do I set Elasticsearch to add new documents (indexed with the PUT method) at the end (or at the beginning)?
Right now it adds new ones sometimes at the end, sometimes in the middle.
I know I can sort results by some field, but I want to see them in the browser simply ordered by time added, without any additional parameters.

ES is simply a document store, i.e. there's no inherent insertion order. I'd simply add ?sort=yourdatefield:desc to your URL and you're all set.
If you don't specify a sort field, results are sorted by score, which defaults to 1.0, so the order is effectively undefined.
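As a minimal sketch of that suggestion, without any client library (the index name "myindex" and the date field "created" are assumptions, not from the question), you could append the sort parameter to the search URL yourself:

// Hypothetical example: query Elasticsearch over HTTP and ask it to sort by a
// timestamp field, newest first, instead of relying on any "insertion order".
// using System.Net.Http;
var url = "http://localhost:9200/myindex/_search?sort=created:desc&size=100";
using (var http = new HttpClient())
{
    string json = http.GetStringAsync(url).Result;   // hits come back ordered by "created"
    // ... deserialize/display json as needed ...
}

The same URL pasted into a browser gives the same ordering, which matches the "just see them in the browser" requirement.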

Related

How can I implement pagination when I want to order by a dynamically calculated property?

I have an endpoint that returns a list of "Order" objects for a commerce app. Each order can have a number of items attached, each with its own cost; because there is no guarantee that the items' individual cost will not change until the order is placed, the Total Cost for each order is being calculated on the server and then appended to the returned results.
I am now trying to add pagination to this endpoint, and it is my understanding that pagination should always be left to the database when possible. I also get that in order for pagination to work properly, the results must be ordered in the same manner for every page.
You can probably guess where I'm having an issue: how can I implement pagination here when the value I'd like to order by is only available AFTER I receive my results?
I could simply retrieve all the results in the query and perform the pagination on the server using Skip() and Take(), but that seems incredibly inefficient and probably not the best solution. Any advice on what I could do, besides disabling ordering by that specific property entirely?
By using row_number() you can assign a SerialNo to every row, ordered by the calculated total.
You can then take a from/to range as input and fetch the data with
SerialNo BETWEEN from AND to.
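A sketch of the same idea in LINQ to Entities (the context, entity, and property names here are assumptions): compute the total inside the query so the database orders by it and slices the rows itself. Entity Framework typically translates Skip/Take into ROW_NUMBER- or OFFSET-based SQL, which is the "SerialNo between from and to" described above.

// Hypothetical sketch: page in the database, ordered by a computed total.
int pageNumber = 2, pageSize = 25;

var page = db.Orders
    .Select(o => new
    {
        Order = o,
        // nullable cast so orders with no items sum to 0 instead of throwing
        TotalCost = o.Items.Sum(i => (decimal?)i.Cost) ?? 0m
    })
    .OrderBy(x => x.TotalCost)
    .ThenBy(x => x.Order.Id)                  // stable tie-breaker so pages don't overlap
    .Skip((pageNumber - 1) * pageSize)
    .Take(pageSize)
    .ToList();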

Extract unique list of fields from matching documents

I am new to Lucene, so maybe I have misunderstood something about how it works.
I have indexed a few hundred thousand documents with many string fields. For example, suppose we have 5 string fields (named A, B, C, D, E) and the first 3 (A, B, C) are indexed, leaving the last two (D, E) unindexed and only stored in the document. Values in each field may be duplicated; for example, assume that field A is used to store names, and the name 'Richard' appears many times.
When I run a query I look for each term in each field; now, for example, suppose I get 3K documents that match my query.
Is it possible to get a list of unique (distinct) values for each field without scanning and grouping the results? I am particularly interested in this because I apply a limit to the documents I actually read, but I would like a complete list of the unique values in each field of the matching documents (including the documents I don't read).
If this is possible, can I apply this logic even to the unindexed fields (D, E)?
When you run the search, it will return all the documents that match the query conditions. On that result you can do highlighting (which will slow the process down), and you can do something like pagination to return the results in pages if you want.
The highlighter has many methods you can use (depending on which version of Lucene you are using; I am talking here about the latest version, 4.8.0), like GetBestTextFragments(), which takes a parameter called maxNumberFragments. If you set that parameter to 1, it will return only one value from that particular field even if multiple values match the query.
I am not sure if that answers your question, but I hope it helps. Regarding the unindexed fields, I don't think you can do that (although I have never tried it).
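A rough sketch of the highlighter approach described above, in the Lucene.NET 4.8-style API (the query, searcher, reader, analyzer, and the field name "A" are assumptions, not from the question):

// Hypothetical example: pull a single best-matching fragment per field.
// using Lucene.Net.Search.Highlight;
var scorer = new QueryScorer(query);
var highlighter = new Highlighter(scorer);

var doc = searcher.Doc(scoreDoc.Doc);
string fieldText = doc.Get("A");
var stream = TokenSources.GetAnyTokenStream(reader, scoreDoc.Doc, "A", analyzer);

// maxNumFragments = 1: only one fragment comes back for this field,
// even if several of its values match the query.
TextFragment[] fragments = highlighter.GetBestTextFragments(stream, fieldText, false, 1);

Note that this only surfaces one matching value per document per field; it does not produce a distinct list of values across all matching documents.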

Assign unique numbers in the database when saving the form

I have a form where you can create an order; when you save it, the app checks the database (Oracle) for the last order number and assigns the next one to the currently saved order. What I found is that if two users save a new order at the same time, or a few seconds apart, my app is unable (because of the connection speed) to assign different numbers to the two newly created orders. The problem is that both check the last assigned number at the same time, so both orders get the same number.
I have some ideas, but all of them have advantages and disadvantages:
1. Have the system wait a few seconds and re-check the order number when the user saves the order. But if both users saved at the same time, the re-checks will also run at the same time, and I guess I will end up with the same problem.
2. Have the system check the order numbers (a check is run every time the treeview is refreshed), see if one has been duplicated, and then let the user know via some highlight in the treeview. But if any documents are assigned to the order before the check, I will end up with documents whose file name and contents carry a different number from the order they are assigned to.
3. Have the system check all order numbers periodically and give one of the duplicates a new order number. But here I have the same problem with the documents as in #2, and it might also cause performance issues.
4. Assign the order number when a user requests a new order, not when he saves it. I could combine this with solution #1 and re-check whether the number is already used in the database, then reassign a new one. Once again, if documents get assigned in the meantime, someone has to go fix them.
One way of possibly stopping documents from being assigned to duplicates is to let the user enter only part of the information, save or apply it, run the re-check from #1, and only then, if nothing is found, allow the user to add documents. This could be combined with any of the ideas above, but I don't want to delay the user's work while the numbers are being checked.
If you see any improvements to the ideas above, or if you have new ones, please let me know.
I need to find the best solution, one that affects the user's current workflow as little as possible.
If your order ID is just a number, you can use an Oracle sequence.
CREATE SEQUENCE order_id;
And before you save the record get a new order number.
SELECT order_id.NEXTVAL FROM DUAL;
See also Oracle/PLSQL: Sequences (Autonumber)
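On the C# side, a minimal sketch of pulling the next number from that sequence before saving (this assumes the ODP.NET provider, Oracle.DataAccess.Client; your app may use a different Oracle provider):

// Hypothetical example: the sequence hands out a unique number per call,
// so two users saving at the same moment can never receive the same value.
// using Oracle.DataAccess.Client;  (connectionString is assumed to exist)
using (var conn = new OracleConnection(connectionString))
using (var cmd = new OracleCommand("SELECT order_id.NEXTVAL FROM DUAL", conn))
{
    conn.Open();
    decimal nextOrderNumber = Convert.ToDecimal(cmd.ExecuteScalar());
    // ... save the order using nextOrderNumber ...
}

Sequence values can have gaps (for example, if a save is cancelled), but they are guaranteed to be unique, which is what matters here.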

Retrieve random DB record

What is the best way to retrieve "X" number of random records using Entity Framework (EF5, if it's relevant)? The value of "X" will be set based on where this is used.
Is there a method for doing this built into EF, or is it best to pull down a result set and then use a C# random number function to pick the records? Or is there a method I'm not thinking of?
On the off chance that it's relevant: I have a table that stores images which I use for different purposes (there is a FK to an image type table). The images I use in the carousel on the homepage are what I want to add some variety to; consequently, how "random" it is doesn't matter much to me. I'm just trying to get away from the same six or so pictures always being displayed. (Also, I'm not really interested in debating/discussing storing images in a table vs. local storage.)
The solution needs to use EF via a LINQ statement. If this isn't directly possible, I may end up doing something similar to what #cmd recommended in the comments. This would most likely be a matter of retrieving a record count, testing the PK to make sure the resulting object isn't null, and building a list of X of the objects' PKs to pass to the front end. The carousel lazy-loads the images, so I don't actually need the image itself when building the list the carousel will use.
Can you just add an ORDER BY RAND() clause to your query?
See this related question: MySQL: Alternatives to ORDER BY RAND()
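In LINQ form, a commonly used equivalent (not a built-in EF API, and the context/entity names here are assumptions) is to order by a new GUID, which LINQ to Entities translates to NEWID() on SQL Server, so both the shuffling and the Take happen in the database:

// Hypothetical sketch: let the database pick X random rows.
int x = 6;
int carouselTypeId = 1;                            // hypothetical image-type id
var randomImages = context.Images
    .Where(i => i.ImageTypeId == carouselTypeId)   // assumed FK filter
    .OrderBy(i => Guid.NewGuid())                  // translated to ORDER BY NEWID()
    .Take(x)
    .ToList();

Ordering a whole table randomly can be slow on large tables (that is what the linked question is about), but for a small image table it is usually fine.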

Performance issue with accessing Microsoft.Office.Core.DocumentProperties

I have an Excel COM add-in which reads the CustomDocumentProperties section of a workbook.
This is how I access a particular entry from the CustomDocumentProperties section:
DocumentProperties docProperties = (DocumentProperties)xlWorkbook.CustomDocumentProperties;
docProperty = docProperties[propName];
The problem is that when the CustomDocumentProperties collection contains more than 8000 entries, the performance of this code is really bad. I ran a CPU profiler and it showed that the following line takes more than a minute:
docProperty = docProperties[propName];
Does anyone know how to improve the performance of accessing DocumentProperties?
Thanks!
I doubt that there is anything that you could do to improve the performance of the document properties. I believe that it is implemented as a simple list -- not as a dictionary or hash table. In fact, I don't believe that the list is sorted, so with 8000 entries, on average half of them, or 4000, would have to be accessed in order to find the property that you are looking for.
You might consider not using the CustomDocumentProperties as a dictionary. Instead, you might try putting all 8000 of your entries into a custom dictionary, serializing it, and then adding the entire serialized dictionary to the CustomDocumentProperties as a single entry. So to use it, you would access the CustomDocumentProperties, deserialize the dictionary, and then use it repeatedly. When done, if there were any changes to the dictionary, you would have to re-serialize it and save it back to the CustomDocumentProperties, which you would probably only want to do once -- for example, just before saving your workbook. (You might want to put code to re-serialize and save your custom dictionary to the CustomDocumentProperties within the Workbook.BeforeSave event.)
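A rough sketch of that single-entry approach (the property name, the tab/newline-delimited format, and the surrounding variables other than xlWorkbook are assumptions; any serializer would do):

// Hypothetical example: store one serialized dictionary instead of 8000 entries.
// using System.Collections.Generic; using System.Linq; using Microsoft.Office.Core;
var all = new Dictionary<string, string>();
// ... fill 'all' with the key/value pairs currently stored as individual properties ...
string blob = string.Join("\n", all.Select(kv => kv.Key + "\t" + kv.Value));

DocumentProperties props = (DocumentProperties)xlWorkbook.CustomDocumentProperties;
props.Add("SerializedProperties", false, MsoDocProperties.msoPropertyTypeString, blob);

// Reading it back: deserialize once, then do all lookups in memory.
string stored = (string)((DocumentProperty)props["SerializedProperties"]).Value;
var lookup = stored.Split('\n')
                   .Select(line => line.Split('\t'))
                   .ToDictionary(parts => parts[0], parts => parts[1]);

Note that an individual custom document property value may be subject to a length limit, so check that the serialized string fits before committing to this layout.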
