I'm trying to build a standalone application that creates a custom report for Encompass360 without needing to put certain fields into the reporting database.
So far I have only found one way to do it, but it is extremely slow (much slower than a normal report within Encompass takes when retrieving data outside of the reporting database). It takes almost 2 minutes to pull the data for 5 loans doing this:
int count = 5;
StringList fields = new StringList();
fields.Add("Fields.317");
fields.Add("Fields.3238");
fields.Add("Fields.313");
fields.Add("Fields.319");
fields.Add("Fields.2");

// lstLoans.Items contains the string location of the loans (i.e. "My Pipeline\Dave#6")
foreach (LoanIdentity loanID in lstLoans.Items)
{
    string[] loanIdentifier = loanID.ToString().Split('\\');
    Loan loan = Globals.Session.Loans.Folders[loanIdentifier[0]].OpenLoan(loanIdentifier[1]);

    bool fundingPlus = true; // if milestone == funding || shipping || suspended || completion;
    if (!fundingPlus)
        continue;

    bool oneIsChecked = false;
    LogMilestoneEvents msEvents = loan.Log.MilestoneEvents;
    DateTime date;
    MilestoneEvent ms = null; // better way to do this probably
    if (checkBox4.Checked)
    {
        ms = msEvents.GetEventForMilestone("Completion");
        if (ms.Completed)
        {
            oneIsChecked = true;
        }
    }
    else if (checkBox3.Checked)
    {
        ms = msEvents.GetEventForMilestone("Suspended");
        if (ms.Completed)
        {
            oneIsChecked = true;
        }
    }
    else if (checkBox2.Checked)
    {
        ms = msEvents.GetEventForMilestone("Shipping");
        if (ms.Completed)
        {
            oneIsChecked = true;
        }
    }
    else if (checkBox1.Checked)
    {
        ms = msEvents.GetEventForMilestone("Funding");
        if (ms.Completed)
        {
            oneIsChecked = true;
        }
    }
    if (!oneIsChecked)
        continue;

    string LO = loan.Fields["317"].FormattedValue;
    string LOid = loan.Fields["3238"].FormattedValue;
    string city = loan.Fields["313"].FormattedValue;
    string address = loan.Fields["319"].FormattedValue;
    string loanAmount = loan.Fields["2"].FormattedValue;
    if (loanAmount == "")
    {
        Console.WriteLine(LO);
        continue;
    }

    int numLoans = 1;
    addLoanFieldToListView(LO, numLoans, city, address, loanAmount);

    if (--count == 0)
        break;
}
I haven't been able to figure out how to use any of the pipeline methods to retrieve data outside the reporting database, but when all of the fields I am looking for are in the reporting database, it takes only a couple of seconds to retrieve the contents of hundreds of loans using these tools:
session.Reports.SelectReportingFieldsForLoans(loanGUIDs, fields);
session.Loans.QueryPipeline(selectedDate, PipelineSortOrder.None);
session.Loans.OpenPipeline(PipelineSortOrder.None);
What would really help me is a simple example of retrieving data outside of the reporting database with the Encompass SDK that doesn't take longer than it ought to.
Note: I am aware I can add the fields that aren't currently in the reporting database to it, so that is not the answer I am looking for.
Note #2: Encompass360 doesn't have its own tag; if somebody knows of better tags for the subject at hand, please add them.
I use the SelectFields method on Loans to retrieve loan field data that is not in the reporting database in Encompass. It is very fast compared to opening loans one by one, but the results are returned as strings, so some parsing is required to get the values into their native types. Below is the example from the documentation for using this method.
using System;
using System.IO;
using EllieMae.Encompass.Client;
using EllieMae.Encompass.BusinessObjects;
using EllieMae.Encompass.Query;
using EllieMae.Encompass.Collections;
using EllieMae.Encompass.BusinessObjects.Loans;

class LoanReader
{
    public static void Main()
    {
        // Open the session to the remote server
        Session session = new Session();
        session.Start("myserver", "mary", "maryspwd");

        // Build the query criterion for all loans that were opened this year
        DateFieldCriterion dateCri = new DateFieldCriterion();
        dateCri.FieldName = "Loan.DateFileOpened";
        dateCri.Value = DateTime.Now;
        dateCri.Precision = DateFieldMatchPrecision.Year;

        // Perform the query to get the IDs of the loans
        LoanIdentityList ids = session.Loans.Query(dateCri);

        // Create a list of the specific fields we want to print from each loan.
        // In this case, we'll select the Loan Amount and Interest Rate.
        StringList fieldIds = new StringList();
        fieldIds.Add("2"); // Loan Amount
        fieldIds.Add("3"); // Rate

        // For each loan, select the desired fields
        foreach (LoanIdentity id in ids)
        {
            // Select the field values for the current loan
            StringList fieldValues = session.Loans.SelectFields(id.Guid, fieldIds);

            // Print out the returned values
            Console.WriteLine("Fields for loan " + id.ToString());
            Console.WriteLine("Amount: " + fieldValues[0]);
            Console.WriteLine("Rate: " + fieldValues[1]);
        }

        // End the session to gracefully disconnect from the server
        session.End();
    }
}
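Since SelectFields returns every value as a string, a little parsing is needed to get the values back into native types. A rough sketch continuing the loop above (assumption: the values come back as plain numbers; if they arrive formatted with "$" or "%", strip those characters before parsing):

// fieldValues[0] is field "2" (Loan Amount), fieldValues[1] is field "3" (Rate).
decimal loanAmount;
decimal rate;
bool hasAmount = decimal.TryParse(fieldValues[0], out loanAmount);
bool hasRate = decimal.TryParse(fieldValues[1], out rate);
if (hasAmount && hasRate)
{
    Console.WriteLine("Parsed amount {0} at rate {1}", loanAmount, rate);
}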
You would benefit greatly from adding these fields to the reporting DB and using an RDB query instead. Internally, Encompass has to open and parse loan files when you read fields that are not in the RDB, which is slow, whereas fields in the RDB are fetched with a simple SELECT query, which is very fast. This tool lets you quickly check which fields are in the RDB so you can plan your query as well as any RDB updates: https://www.encompdev.com/Products/FieldExplorer
You query the RDB via Session.Loans.QueryPipeline() very similarly to your use of the loan Query. Here's a good source-code example (in VB): https://www.encompdev.com/Products/AlertCounterFieldPlugin
I have more than 15,000 POCO elements stored in a Redis List. I'm using ServiceStack to save and get them. However, I'm not pleased with the response times when I load them into a grid. From what I've read, it would be better to store these objects in a hash - but unfortunately I could not find a good example for my case :(
This is the method I use to get them into my grid:
public IEnumerable<BookingRequestGridViewModel> GetAll()
{
    try
    {
        var redisManager = new RedisManagerPool(Global.RedisConnector);
        using (var redis = redisManager.GetClient())
        {
            var redisEntities = redis.As<BookingRequestModel>();
            var result = redisEntities.Lists["BookingRequests"].GetAll().Select(z => new BookingRequestGridViewModel
            {
                CreatedDate = z.CreatedDate,
                DropOffBranchName = z.DropOffBranch != null ? z.DropOffBranch.Name : string.Empty,
                DropOffDate = z.DropOffDate,
                DropOffLocationName = z.DropOffLocation != null ? z.DropOffLocation.Name : string.Empty,
                Id = z.Id.Value,
                Number = z.Number,
                PickupBranchName = z.PickUpBranch != null ? z.PickUpBranch.Name : string.Empty,
                PickUpDate = z.PickUpDate,
                PickupLocationName = z.PickUpLocation != null ? z.PickUpLocation.Name : string.Empty
            }).OrderBy(z => z.Id);
            return result;
        }
    }
    catch (Exception ex)
    {
        return null;
    }
}
Note that I use redisEntities.Lists["BookingRequests"].GetAll(), which is causing the performance issues (I would like to use just redisEntities.Lists["BookingRequests"], but then I lose the latest updates in the grid after editing).
I would like to know whether saving them in a list is a good approach, because a fast grid is very important to me (paging currently takes about 1 second, which is huge).
Please advise!
Firstly, you should not create a new Redis client manager such as a RedisManagerPool instance each time; there should be only one singleton RedisManagerPool instance in your app, from which all clients are resolved.
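For example, a minimal sketch of holding a single pool for the whole app (Global.RedisConnector is reused from the question; the static holder class is just one way to do it):

public static class RedisConfig
{
    // One pool for the entire application; resolve all clients from this.
    public static readonly IRedisClientsManager Manager =
        new RedisManagerPool(Global.RedisConnector);
}

// Usage inside GetAll():
using (var redis = RedisConfig.Manager.GetClient())
{
    var redisEntities = redis.As<BookingRequestModel>();
    // ...
}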
Otherwise I would rethink your data access strategy: downloading 15K items in one batch is not ideal. You can create indexes by storing ids in Sets, or you could store items in a sorted set scored by a value you can page against, such as an incrementing id, e.g.:
var redisEntities = redis.As<BookingRequestModel>();
var bookings = redisEntities.SortedSets["bookings"];

foreach (var item in new BookingRequestModel[0]) // your source items here
{
    redisEntities.AddItemToSortedSet(bookings, item, item.Id);
}
That way you will be able to fetch them in batches, e.g:
var batch = bookings.GetRangeByLowestScore(fromId, toId, skip, take);
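For instance, a grid page could then be fetched by id range (a rough sketch; pageSize and lastSeenId are assumed names):

// Fetch the next page of bookings; ids were used as the sort score above.
int pageSize = 50;   // assumed page size
long lastSeenId = 0; // id of the last row on the previous page
var page = bookings.GetRangeByLowestScore(lastSeenId + 1, double.MaxValue, 0, pageSize);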
I have an export job migrating data from an old database into a new database. The problem I'm having is that the user population is around 3 million and the job takes a very long time to complete (15+ hours). The machine I am using only has 1 processor so I'm not sure if threading is what I should be doing. Can someone help me optimize this code?
static void ExportFromLegacy()
{
    DateTime exportStart = DateTime.Now; // used for the total running time below

    var usersQuery = _oldDb.users.Where(x =>
        x.status == "active");

    int BatchSize = 1000;
    var errorCount = 0;
    var successCount = 0;
    var batchCount = 0;

    // Using MoreLinq's Batch for sequences
    // https://www.nuget.org/packages/MoreLinq.Source.MoreEnumerable.Batch
    foreach (IEnumerable<users> batch in usersQuery.Batch(BatchSize))
    {
        Console.WriteLine(String.Format("Batch count at {0}", batchCount));
        batchCount++;

        foreach (var user in batch)
        {
            try
            {
                var userData = _oldDb.userData.Where(x =>
                    x.user_id == user.user_id).ToList();

                if (userData.Count > 0)
                {
                    // Insert into table
                    var newData = new newData()
                    {
                        UserId = user.user_id // shortened code for brevity.
                    };
                    _db.newUserData.Add(newData);
                    _db.SaveChanges();

                    // Insert item(s) into table
                    foreach (var item in userData.items)
                    {
                        if (!_db.userDataItems.Any(x => x.id == item.id))
                        {
                            var newItem = new Item()
                            {
                                UserId = user.user_id, // shortened code for brevity.
                                DataId = newData.id    // id from object created above
                            };
                            _db.userDataItems.Add(newItem);
                        }
                        _db.SaveChanges();
                        successCount++;
                    }
                }
            }
            catch (Exception ex)
            {
                errorCount++;
                Console.WriteLine(String.Format("Error saving changes for user_id: {0} at {1}.", user.user_id.ToString(), DateTime.Now));
                Console.WriteLine("Message: " + ex.Message);
                Console.WriteLine("InnerException: " + ex.InnerException);
            }
        }
    }
    Console.WriteLine(String.Format("End at {0}...", DateTime.Now));
    Console.WriteLine(String.Format("Successful imports: {0} | Errors: {1}", successCount, errorCount));
    Console.WriteLine(String.Format("Total running time: {0}", (DateTime.Now - exportStart).ToString(@"hh\:mm\:ss")));
}
Unfortunately, the major issue is the number of database round-trips.
You make a round-trip:
For every user, to retrieve the user data by user id from the old database
For every user, to save the user data in the new database
For every user, to save each user data item in the new database
So if you have 3 million users, and every user has an average of 5 user data items, it means you make at least 3M + 3M + 15M = 21 million database round-trips, which is insane.
The only way to dramatically improve the performance is to reduce the number of database round-trips.
Batch - Retrieve user data by id
You can quickly reduce the number of database round-trips by retrieving all the user data for a batch at once, and since you don't have to track the entities, use "AsNoTracking()" for even more performance gains.
var list = batch.Select(x => x.user_id).ToList();
var userDatas = _oldDb.userData
.AsNoTracking()
.Where(x => list.Contains(x.user_id))
.ToList();
foreach(var userData in userDatas)
{
....
}
This change alone should already save you a few hours.
Batch - Save Changes
Every time you save a user data record or item, you perform a database round-trip.
Disclaimer: I'm the owner of the project Entity Framework Extensions
This library allows you to perform:
BulkSaveChanges
BulkInsert
BulkUpdate
BulkDelete
BulkMerge
You can either call BulkSaveChanges at the end of the batch, or build a list to insert and use BulkInsert directly for even more performance.
You will, however, have to use a navigation relation to the newData instance instead of using the ID directly.
foreach (IEnumerable<users> batch in usersQuery.Batch(BatchSize))
{
    // Retrieve all user data for the batch at once.
    var list = batch.Select(x => x.user_id).ToList();
    var userDatas = _oldDb.userData
        .AsNoTracking()
        .Where(x => list.Contains(x.user_id))
        .ToList();

    // Create lists used for BulkInsert
    var newDatas = new List<newData>();
    var newDataItems = new List<Item>();

    foreach (var userData in userDatas)
    {
        // newDatas.Add(newData);
        // newDataItem.OwnerData = newData;
        // newDataItems.Add(newDataItem);
    }

    _db.BulkInsert(newDatas);
    _db.BulkInsert(newDataItems);
}
EDIT: Answer to the subquestion
One of the properties of a newDataItem is the id of newData (e.g. newDataItem.newDataId), so newData would have to be saved first in order to generate its id. How would I BulkInsert if there is a dependency on another object?
You must instead use navigation properties. With a navigation property you never have to specify the parent id; you set the parent object instance instead.
public class UserData
{
    public int UserDataID { get; set; }
    // ... properties ...
    public List<UserDataItem> Items { get; set; }
}

public class UserDataItem
{
    public int UserDataItemID { get; set; }
    // ... properties ...
    public UserData OwnerData { get; set; }
}

var userData = new UserData();
var userDataItem = new UserDataItem();

// Use the navigation property to set the parent.
userDataItem.OwnerData = userData;
Tutorial: Configure One-to-Many Relationship
Also, I don't see a BulkSaveChanges in your example code. Would that
have to be called after all the BulkInserts?
BulkInsert inserts directly into the database. You don't have to call "SaveChanges" or "BulkSaveChanges"; once you invoke the method, it's done ;)
Here is an example using BulkSaveChanges:
foreach (IEnumerable<users> batch in usersQuery.Batch(BatchSize))
{
    // Retrieve all user data for the batch at once.
    var list = batch.Select(x => x.user_id).ToList();
    var userDatas = _oldDb.userData
        .AsNoTracking()
        .Where(x => list.Contains(x.user_id))
        .ToList();

    // Create lists used for BulkSaveChanges
    var newDatas = new List<newData>();
    var newDataItems = new List<Item>();

    foreach (var userData in userDatas)
    {
        // newDatas.Add(newData);
        // newDataItem.OwnerData = newData;
        // newDataItems.Add(newDataItem);
    }

    var context = new UserContext();
    context.userDatas.AddRange(newDatas);
    context.userDataItems.AddRange(newDataItems);
    context.BulkSaveChanges();
}
BulkSaveChanges is slower than BulkInsert because it has to use some internal Entity Framework methods, but it is still way faster than SaveChanges.
In the example, I create a new context for every batch to avoid memory issues and gain some performance. If you reuse the same context for all batches, you will end up with millions of tracked entities in the ChangeTracker, which is never a good idea.
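For instance, a short sketch of scoping one context per batch (UserContext and the newDatas/newDataItems lists are the names used in the example above):

foreach (IEnumerable<users> batch in usersQuery.Batch(BatchSize))
{
    // Dispose the context after each batch so the ChangeTracker never grows large.
    using (var context = new UserContext())
    {
        // ... build newDatas / newDataItems for this batch ...
        context.userDatas.AddRange(newDatas);
        context.userDataItems.AddRange(newDataItems);
        context.BulkSaveChanges();
    }
}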
Entity Framework is a very bad choice for importing large amounts of data. I know this from personal experience.
That being said, I found a few ways to optimize things when I tried to use it in the same way you are.
The Context will cache objects as you add them, and the more inserts you do, the slower future inserts will get. My solution was to limit each context to about 500 inserts before I disposed of that instance and created a new one. This boosted performance significantly.
I was able to make use of multiple threads to increase performance, but you will have to be very careful about resource contention. Each thread will definitely need its own Context; don't even think about trying to share it between threads. My machine had 8 cores, so threading will probably not help you as much; with a single core I doubt it will help you at all.
Turn off change tracking with AutoDetectChangesEnabled = false; change tracking is incredibly slow. Unfortunately this means you have to modify your code to make all changes directly through the context. No more Entity.Property = "Some Value"; it becomes something like Context.Entry(entity).Property(e => e.Property).CurrentValue = "Some Value"; (or something close to that, I don't remember the exact syntax), which makes the code ugly.
Any queries you do should definitely use AsNoTracking.
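A small sketch of the last two points, assuming an EF6-style DbContext (UserContext and userDataItems are names borrowed from the code above):

using (var context = new UserContext())
{
    // Change tracking off: adds no longer trigger expensive DetectChanges scans.
    context.Configuration.AutoDetectChangesEnabled = false;

    // Read-only lookups should not be tracked either.
    var existingItemIds = context.userDataItems
        .AsNoTracking()
        .Select(x => x.id)
        .ToList();

    // ... add entities, then save ...
    context.SaveChanges();
}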
With all that, I was able to cut a ~20 hour process down to about 6 hours, but I still don't recommend using EF for this. It was an extremely painful project due almost entirely to my poor choice of EF to add data. Please use something else... anything else...
I don't want to give the impression that EF is a bad data access library; it is great at what it was designed to do. Unfortunately, this is not what it was designed for.
I can think of a few options.
1) You can get a small speed increase by moving your _db.SaveChanges() below your foreach() closing bracket:
foreach (...){
}
successCount += _db.SaveChanges();
2) Add items to a list, and then add the list to the context:
List<ObjClass> list = new List<ObjClass>();
foreach (...)
{
list.Add(new ObjClass() { ... });
}
_db.newUserData.AddRange(list);
successCount += _db.SaveChanges();
3) If it's a big amount of data, save in bunches:
List<ObjClass> list = new List<ObjClass>();
int cnt=0;
foreach (...)
{
list.Add(new ObjClass() { ... });
if (++cnt % 100 == 0) // bunches of 100
{
_db.newUserData.AddRange(list);
successCount += _db.SaveChanges();
list.Clear();
// Optional if a HUGE amount of data
if (cnt % 1000 == 0)
{
_db = new MyDbContext();
}
}
}
// Don't forget that!
_db.newUserData.AddRange(list);
successCount += _db.SaveChanges();
list.Clear();
4) If it's TOO big, consider using bulk inserts. There are a few examples on the internet and a few free libraries around.
Ref: https://blogs.msdn.microsoft.com/nikhilsi/2008/06/11/bulk-insert-into-sql-from-c-app/
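For example, a minimal SqlBulkCopy sketch in the spirit of that reference (the connection string, destination table, and columns are assumptions):

// Requires System.Data and System.Data.SqlClient.
// Bulk-copy rows from a DataTable straight into SQL Server.
var table = new DataTable();
table.Columns.Add("UserId", typeof(int));
table.Columns.Add("DataId", typeof(int));
// ... fill table.Rows from the current batch ...

using (var bulk = new SqlBulkCopy(connectionString)) // assumed connection string
{
    bulk.DestinationTableName = "dbo.userDataItems";  // assumed table name
    bulk.WriteToServer(table);
}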
With most of these options you lose some control over error handling, as it is difficult to know which record failed.
I'm using the query function of an entity collection in C#, and it takes a long time to load the related records back from SQL Server 2008. Is there any faster way to do this? This is the query function I use:
public void SearchProducts()
{
    // Filter by search string array (searchArray)
    List<string> prodId = new List<string>();

    foreach (string src in searchArray)
    {
        StoreProductCollection prod = new StoreProductCollection();
        prod.Query.Where(prod.Query.StptName.ToLower() == src.ToLower() && prod.Query.StptDeleted.IsNull());
        prod.Query.Select(prod.Query.StptName, prod.Query.StptPrice, prod.Query.StptImage, prod.Query.StptStoreProductID);
        // prod.Query.es.Top = 4;
        prod.Query.Load();

        if (prod.Count > 0)
        {
            foreach (StoreProduct stpt in prod)
            {
                if (!prodId.Contains(stpt.StptStoreProductID.ToString().Trim()))
                {
                    prodId.Add(stpt.StptStoreProductID.ToString().Trim());
                    productObjectsList.Add(stpt);
                }
            }
        }
    }
}
You're hitting the database once per searchArray item; this is very wrong.
You might get better performance like this (I have no way of testing it, so give it a shot):
public void SearchProducts()
{
    // Filter by search string array (searchArray)
    List<string> prodId = new List<string>();
    StoreProductCollection prod = new StoreProductCollection();

    // Notice that your foreach() is gone
    // replace this
    // prod.Query.Where(prod.Query.StptName.ToLower() == src.ToLower() && prod.Query.StptDeleted.IsNull());
    // with this (or something similar: the point is, you should call .Load() exactly once)
    prod.Query.Where(prod.Query.StptDeleted.IsNull() && searchArray.Any(srcArrayString => prod.Query.StptName.ToLower() == srcArrayString.ToLower()));
    prod.Query.Select(prod.Query.StptName, prod.Query.StptPrice, prod.Query.StptImage, prod.Query.StptStoreProductID);
    // prod.Query.es.Top = 4;
    prod.Query.Load();

    // ... rest of your code follows.
}
Given a List<string> searchArray containing lowercased words:
public void SearchProducts()
{
    // Filter by search string array (searchArray)
    List<string> prodId = new List<string>();
    StoreProductCollection prod = new StoreProductCollection();

    prod.Query.Where(searchArray.Contains(prod.Query.StptName.ToLower()) && prod.Query.StptDeleted.IsNull());
    prod.Query.Select(prod.Query.StptName, prod.Query.StptPrice, prod.Query.StptImage, prod.Query.StptStoreProductID);
    // prod.Query.es.Top = 4;
    prod.Query.Load();

    if (prod.Count > 0)
    {
        foreach (StoreProduct stpt in prod)
        {
            if (!prodId.Contains(stpt.StptStoreProductID.ToString().Trim()))
            {
                prodId.Add(stpt.StptStoreProductID.ToString().Trim());
                productObjectsList.Add(stpt);
            }
        }
    }
}
This way you have only one query for all words.
First of all, put an index on the StptName column.
Second, if you need even better performance, write a stored procedure in SQL to do your querying, and map it with Entity Framework.
Let me know if you need explanation on how to do any of the above.
A couple more micro-optimizations you can do if you don't want to write a Stored Procedure:
Store src.ToLower() in a temporary variable, and then compare prod.Query.StptName.ToLower() to it (see the sketch after this list).
By default, SQL Server queries are case insensitive, so check if that's the case, and if so, you can get rid of the ToLower altogether. You can change case sensitivity through Collation.
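A sketch of the first micro-optimization, applied to the original loop:

foreach (string src in searchArray)
{
    string srcLower = src.ToLower(); // lower the search term once per iteration
    StoreProductCollection prod = new StoreProductCollection();
    prod.Query.Where(prod.Query.StptName.ToLower() == srcLower && prod.Query.StptDeleted.IsNull());
    // ... Select / Load as before ...
}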
EDIT:
To create an Index:
Open the table designer in SQL Server Managment Studio.
Right click anywhere and select Indexes/Keys.
Click Add.
Under Columns add StptName.
Under Is Unique specify whether StptName is unique or not.
Under type select "index".
That's all!
As for mapping stored procedures - here's a nice tutorial:
http://www.robbagby.com/entity-framework/entity-framework-modeling-select-stored-procedures/
(You can jump straight to the "Map in the Select Stored Procedure" Section).
This section simply reads from an Excel spreadsheet. This part works fine with no performance issues.
IEnumerable<ImportViewModel> so = data.Select(row => new ImportViewModel
{
    PersonId = (row.Field<string>("person_id")),
    ValidationResult = ""
}).ToList();
Before I pass the model to a View I want to set ValidationResult, so I have this piece of code. If I comment it out, the model is passed to the view quickly. When I use the foreach, it takes over a minute. If I hardcode a value for item.PersonId, it runs quickly. I know I'm doing something wrong; I'm just not sure where to start or what best practice I should be following.
foreach (var item in so)
{
if (db.Entity.Any(w => w.ID == item.PersonId))
{
item.ValidationResult = "Successful";
}
else
{
item.ValidationResult = "Error: ";
}
}
return View(so.ToList());
You are currently performing a database call per item in your list. This is really hard on your database and thus on your performance. Instead, iterate through your Excel result, gather all the user ids, and select them in one query. Make a list from that query result (otherwise the query is executed every time you access it). Then match the result list against your Excel data.
You need to do something like this:
var ids = so.Select(i=>i.PersonId).Distinct().ToList();
// Hitting Database just for this time to get all Users Ids
var usersIds = db.Entity.Where(u=>ids.Contains(u.ID)).Select(u=>u.ID).ToList();
foreach (var item in so)
{
if (usersIds.Contains(item.PersonId))
{
item.ValidationResult = "Successful";
}
else
{
item.ValidationResult = "Error: ";
}
}
return View(so.ToList());
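As a small follow-up (my own suggestion, not part of the answer above): materializing the ids into a HashSet makes each Contains check O(1), which helps once the spreadsheet grows:

// Same query as above, but stored in a HashSet for O(1) lookups.
var usersIds = new HashSet<string>(
    db.Entity.Where(u => ids.Contains(u.ID)).Select(u => u.ID));

foreach (var item in so)
{
    item.ValidationResult = usersIds.Contains(item.PersonId) ? "Successful" : "Error: ";
}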
Hi all, I would like to use MySqlTransaction in my application. I have a doubt about how it works: as per my requirement, I have to delete different values from the database.
The process I am following is this: assume I have 2 EmpIDs, where each EmpID can hold multiple values. I store the corresponding values for each EmpID in a Dictionary and then save them to a list keyed by that EmpID.
Assume the list elements are as follows:
For EmpID 1 I will have 1, 2. I check this list against the maximum value from the database; if it matches, I want to delete this EmpID from the database.
For EmpID 2 I will have 1, 2, but in my database the maximum value is 3, so this one fails. I would then like to roll back the previously deleted item.
Is it possible to do this with a transaction? If so, can anyone help me solve it?
Sample code:
if (findMax(lst, iEmpID))
{
    obj.delete("storeprocname"); // this will occur when my list has the maximum value
}
else
{
    // Here I would like to roll back the previous delete, referring to the delete method in the class file
}
My sample code:
if (findMaxPayPeriodID(lstPayPeriodID, iEmpIDs)) // Assume the max pay period exists the first time; how do I roll back when it fails the second time?
{
    if (findSequence(lstPayPeriodID)) // Assume this is also true the first time
    {
        for (int ilstPayperiodID = 0; ilstPayperiodID < lstPayPeriodID1.Count; ilstPayperiodID++)
        {
            oAdmin.Payperiodnumber = (int)lstPayPeriodID1[ilstPayperiodID];
            for (int ilistPayYear = iPayYearcnt; ilistPayYear < lstPayYear1.Count; ilistPayYear++)
            {
                oAdmin.PayYear = (int)lstPayYear1[ilistPayYear];
                iPayYearcnt++;
                break;
            }
            for (int ilistDateTime = idtcnt; ilistDateTime < lstDateTime1.Count; ilistDateTime++)
            {
                idtcnt++;
                oAdmin.PaymentDate = lstDateTime1[ilistDateTime];
                break;
            }
        }
        if (oAdmin.deletePayRoll(oSqlTran))
        {
            oMsg.Message = "Deleted Successfully";
            oMsg.AlertMessageBox(out m_locallblMessage);
            Page.Controls.Add(m_locallblMessage);

            oAdmin.FedTaxID = ddlFedTaxID.SelectedValue;
            oAdmin.PayFrequency = ddlPaymentType.SelectedValue.ToString();
            mlocal_strStoredProcName = "uspSearchPayRoll";
            oAdmin.getPayRollDetails(out mlocal_ds, mlocal_strStoredProcName);

            //grdPayroll.Visible = true;
            grdPayroll.DataSource = mlocal_ds;
            grdPayroll.DataBind();

            if (mlocal_ds != null)
            {
                btnDelete.Visible = true;
            }
            else
                btnDelete.Visible = false;
        }
        lstPayPeriodID.Clear();
        lstDateTime.Clear();
        lstPayYear.Clear();
        iPayIDcnt = 0;
        iPayYearcnt = 0;
        idtcnt = 0;
    }
    else
    {
        // rollback should be done
    }
}
You don't provide enough information - especially since it seems you will use a stored procedure for the delete operation, all bets are off...
The only option I can think of is to make sure you first find the maximum EmpID not from one list BUT from all lists... then just check that against the DB and act accordingly...
This way the DB will only be hit twice (once for the check and once for the delete/stored procedure)... which is definitely better in terms of scaling, etc.
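To answer the "is it possible with a transaction" part: yes - run all the deletes inside one MySqlTransaction and roll it back if any check fails. A minimal sketch (the connection string, stored procedure name, parameter name, and empIdsToDelete list are assumptions):

// Requires MySql.Data (MySqlConnection/MySqlCommand/MySqlTransaction) and System.Data.
using (var conn = new MySqlConnection(connectionString))
{
    conn.Open();
    using (MySqlTransaction tran = conn.BeginTransaction())
    {
        try
        {
            foreach (int empId in empIdsToDelete) // list built after the max-value checks
            {
                using (var cmd = new MySqlCommand("storeprocname", conn, tran))
                {
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.Parameters.AddWithValue("@EmpID", empId);
                    cmd.ExecuteNonQuery();
                }
            }
            tran.Commit();   // every delete succeeded
        }
        catch
        {
            tran.Rollback(); // any failure undoes all the deletes in this transaction
            throw;
        }
    }
}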