c# Lambda and grouping - c#

trying to get my head around using Lambda expressions to fetch data from my database.
Say I have a table that looks a bit like this (notice the spaces and casing):
name, count:
iPhone 4, 15
iphone 4, 2
iPhone4, 8
If I try to find items by name (using StartsWith()), I only want to fetch the result with the highest count, independent of casing and spaces. So searches for "iphone4" "i p h o n e 4", "iPhone4" sholud all return the "iPhone 4"-record

If you have a MS Sql Server 2005+ the following would work for your stated example:
var inputString = "iPhone 4";
var token = inputString.ToLower().Replace(" ", "");
var tokenizedQuery = DataContext.Devices.Select(d => new { Device = d, Token = d.Name.ToLower().Replace(" ", "") });
var filteredQuery = tokenizedQuery.Where(d => d.Token == token);
var resultsQuery = filteredQuery.Select(d => d.Device).OrderByDescending(d => d.Count);
var result = resultsQuery.FirstOrDefault();
Here is what is going on:
You are creating a tokenized version of your input string by lower-casing it and then removing spaces.
Then you are creating a pseudo-column on your table to create a similar token column
Filter your results based on this token
Finally, select only the record with the highest count
However it is very important that you realize the ToLower() and Replace() methods are being translated to T-SQL commands that run on the sql server and not in your app. This means should you need more sophisticated tokenizing routines, or you are not using MS SQL this may not work!
As others have noted, you may want to clean up your design somewhat. You are essentially storing a key or search keyword that can have many permutations. Doing the tokenizing in a query is not portable or performant, so you should ideally store the tokenized version of this string in its own column. Alternatively, look into Full Text Indexes, as they may also address your problem (again, if using MSSQL).

Let's assume that you have a Collapse string extension, which is not hard to write. One thing you'll note is that there won't be a mapping from this to SQL so the final filtering will have to be done in LINQ to Objects. You might be able to make the DB query more efficient by doing partial filtering (i.e., on iphone), then complete the filtering in memory.
db.Table.ToList().Where( t => t.Name.Collapse().StartsWith( searchString.Collapse() )
.OrderByDescending( t => t.Count )
.Take( 1 );
Where Collapse is
public static class StringExtensions
{
public static string Collapse( this string source )
{
if (string.IsNullOrWhiteSpace( source ))
{
return string.Empty;
}
var builder = new StringBuilder();
foreach (char c in source)
{
if (!char.IsWhiteSpace( c ))
{
builder.Append( c );
}
}
return builder.ToString();
}
}
Note: you'd be better off sanitizing your database if possible AND you really want these to map to the same thing.

Related

Execute custom SQL before running FromSqlRaw in Entity Framework Core 6 or above in SQL Server

I only need it to work for SQL Server. This is an example. The question is about a general approach.
There is a nice extension method from https://entityframework-extensions.net called WhereBulkContains. It is, sort of, great, except that the code of the methods in this library is obfuscated and they do not produce valid SQL when .ToQueryString() is called on IQueryable<T> with these extension methods applied.
Subsequently, I can't use such methods in production code as I am not "allowed" to trust such code due to business reasons. Sure, I can write tons of tests to ensure that WhereBulkContains works as expected, except that there are some complicated cases where the performance of WhereBulkContains is well below stellar, whereas properly written SQL works in a blink of an eye. And (read above), since the code of this library is obfuscated, there is no way to figure out what's wrong there without spending a large amount of time. We would've bought the license (as this is not a freeware) if the library weren't obfuscated. All together that basically kills the library for our purposes.
This is where it gets interesting. I can easily create and populate a temporary table, e.g. (I have a table called EFAgents with an int PK called AgentId in the database):
private string GetTmpAgentSql(IEnumerable<int> agentIds) => #$"
drop table if exists #tmp_Agents;
create table #tmp_Agents (AgentId int not null, primary key clustered (AgentId asc));
{(agentIds
.Chunk(1_000)
.Select(e => $#"
insert into #tmp_Agents (AgentId)
values
({e.JoinStrings("), (")});
")
.JoinStrings(""))}
select 0 as Result
";
private const string AgentSql = #"
select a.* from EFAgents a inner join #tmp_Agents t on a.AgentID = t.AgentId";
where GetContext returns EF Core database context and JoinStrings comes from Unity.Interception.Utilities and then use it as follows:
private async Task<List<EFAgent>> GetAgents(List<int> agentIds)
{
var tmpSql = GetTmpAgentSql(agentIds);
using var ctx = GetContext();
// This creates a temporary table and populates it with the ids.
// This is a proprietary port of EF SqlQuery code, but I can post the whole thing if necessary.
var _ = await ctx.GetDatabase().SqlQuery<int>(tmpSql).FirstOrDefaultAsync();
// There is a DbSet<EFAgent> called Agents.
var query = ctx.Agents
.FromSqlRaw(AgentSql)
.Join(ctx.Agents, t => t.AgentId, a => a.AgentId, (t, a) => a);
var sql = query.ToQueryString() + Environment.NewLine;
// This should provide a valid SQL; https://entityframework-extensions.net does NOT!
// WriteLine - writes to console or as requested. This is irrelevant to the question.
WriteLine(sql);
var result = await query.ToListAsync();
return result;
}
So, basically, I can do what I need in two steps:
using var ctx = GetContext();
// 1. Create a temp table and populate it - call GetTmpAgentSql.
...
// 2. Build the join starting from `FromSqlRaw` as in example above.
This is doable, half-manual, and it is going to work.
The question is how to do that in one step, e.g., call:
.WhereMyBulkContains(aListOfIdConstraints, whateverElseIsneeded, ...)
and that's all.
I am fine if I need to pass more than one parameter in each case in order to specify the constraints.
To clarify the reasons why do I need to go into all these troubles. We have to interact with a third party database. We don't have any control of the schema and data there. The database is large and poorly designed. That resulted in some ugly EFC LINQ queries. To remedy that, some of that ugliness was encapsulated into a method, which takes IQueryable<T> (and some more parameters) and returns IQueryable<T>. Under the hood this method calls WhereBulkContains. I need to replace this WhereBulkContains by, call it, WhereMyBulkContains, which would be able to provide correct ToQueryString representation (for debugging purposes) and be performant. The latter means that SQL should not contain in clause with hundreds (and even sometimes thousands) of elements. Using inner join with a [temp] table with a PK and having an index on the FK field seem to do the trick if I do that in pure SQL. But, ... I need to do that in C# and effectively in between two LINQ method calls. Refactoring everything is also not an option because that method is used in many places.
Thanks a lot!
I think you really want to use a Table Valued Parameter.
Creating an SqlParameter from an enumeration is a little fiddly, but not too difficult to get right;
CREATE TYPE [IntValue] AS TABLE (
Id int NULL
)
private IEnumerable<SqlDataRecord> FromValues(IEnumerable<int> values)
{
var meta = new SqlMetaData(
"Id",
SqlDbType.Int
);
foreach(var value in values)
{
var record = new SqlDataRecord(
meta
);
record.SetInt32(0, value);
yield return record;
}
}
public SqlParameter ToIntTVP(IEnumerable<int> values){
return new SqlParameter()
{
TypeName = "IntValue",
SqlDbType = SqlDbType.Structured,
Value = FromValues(values)
};
}
Personally I would define a query type in EF Core to represent the TVP. Then you can use raw sql to return an IQueryable.
public class IntValue
{
public int Id { get; set; }
}
modelBuilder.Entity<IntValue>(e =>
{
e.HasNoKey();
e.ToView("IntValue");
});
IQueryable<IntValue> ToIntQueryable(DbContext ctx, IEnumerable<int> values)
{
return ctx.Set<IntValue>()
.FromSqlInterpolated($"select * from {ToIntTVP(values)}");
}
Now you can compose the rest of your query using Linq.
var ids = ToIntQueryable(ctx, agentIds);
var query = ctx.Agents
.Where(a => ids.Any(i => i.Id == a.Id));
I would propose to use linq2db.EntityFrameworkCore (note that I'm one of the creators). It has built-in temporary tables support.
We can create simple and reusable function which filters records of any type:
public static class HelperMethods
{
private class KeyHolder<T>
{
[PrimaryKey]
public T Key { get; set; } = default!;
}
public static async Task<List<TEntity>> GetRecordsByIds<TEntity, TKey>(this IQueryable<TEntity> query, IEnumerable<TKey> ids, Expression<Func<TEntity, TKey>> keyFunc)
{
var ctx = LinqToDBForEFTools.GetCurrentContext(query) ??
throw new InvalidOperationException("Query should be EF Core query");
// based on DbContext options, extension retrieves connection information
using var db = ctx.CreateLinqToDbConnection();
// create temporary table and BulkCopy records into that table
using var tempTable = await db.CreateTempTableAsync(ids.Select(id => new KeyHolder<TKey> { Key = id }), tableName: "temporaryIds");
var resultQuery = query.Join(tempTable, keyFunc, t => t.Key, (q, t) => q);
// we use ToListAsyncLinqToDB to avoid collission with EF Core async methods.
return await resultQuery.ToListAsyncLinqToDB();
}
}
Then we can rewrite your function GetAgents to the following:
private async Task<List<EFAgent>> GetAgents(List<int> agentIds)
{
using var ctx = GetContext();
var result = await ctx.Agents.GetRecordsByIds(agentIds, a => a.AgentId);
return result;
}

How to fetch data from MongoDB collection in C# using Regular Expression?

I am using MongoDB.Drivers nuget package in my MVC (C#) web application to communication with MongoDB database. Now, I want to fetch data based on specific column and it's value. I used below code to fetch data.
var findValue = "John";
var clientTest1 = new MongoClient("mongodb://localhost:XXXXX");
var dbTest1 = clientTest1.GetDatabase("Temp_DB");
var empCollection = dbTest1.GetCollection<Employee>("Employee");
var builder1 = Builders<Employee>.Filter;
var filter1 = builder1.Empty;
var regexFilter = new BsonRegularExpression(findValue, "i");
filter1 = filter1 & builder1.Regex(x => x.FirstName, regexFilter);
filter1 = filter1 & builder1.Eq(x => x.IsDeleted,false);
var collectionObj = await empCollection.FindAsync(filter1);
var dorObj = collectionObj.FirstOrDefault();
But, the above code is performing like query.
It means it is working as (select * from Employee where FirstName like '%John%') I don't want this. I want to fetch only those data whose FirstName value should match exact. (like in this case FirstName should equal John).
How can I perform this, can anyone provide me suggestions on this.
Note: I used new BsonRegularExpression(findValue, "i") to make search case-insensitive.
Any help would be highly appreciated.
Thanks
I would recommend storing a normalized version of your data, and index/search upon that. It will likely be considerably faster than using regex. Sure, you'll eat up a little more storage space by including "john" alongside "John", but your data access will be faster since you would just be able to use a standard $eq query.
If you insist on regex, I recommend using ^ (start of line) and $ (end of line) around your search term. Remember though, that you should escape your find value so that its contents isn't treated as RegEx.
This should work:
string escapedFindValue = System.Text.RegularExpressions.Regex.Escape(findValue);
new BsonRegularExpression(string.Format("^{0}$", escapedFindValue), "i");
Or if you're using a newer framework version, you can use string interpolation:
string escapedFindValue = System.Text.RegularExpressions.Regex.Escape(findValue);
new BsonRegularExpression($"^{escapedFindValue}$", "i");

Any .NET class that makes a query string given key value pairs or the like

Is there a class in the .NET framework which, if given a sequence/enumerable of key value pairs (or anything else I am not rigid on the format of this) can create a query string like this:
?foo=bar&gar=har&print=1
I could do this trivial task myself but I thought I'd ask to save myself from re-inventing the wheel. Why do all those string gymnastics when a single line of code can do it?
You can use System.Web.HttpUtility.ParseQueryString to create an empty System.Web.HttpValueCollection, and use it like a NameValueCollection.
Example:
var query = System.Web.HttpUtility.ParseQueryString(string.Empty);
query ["foo"] = "bar";
query ["gar"] = "har";
query ["print"] = "1";
var queryString = query.ToString(); // queryString is 'foo=bar&gar=har&print=1'
There's nothing built in to the .NET framework, as far as I know, though there are a lot of almosts.
System.Web.HttpRequest.QueryString is a pre-parsed NameValueCollection, not something that can output a querystring. System.NetHttpWebRequest expects you to pass a pre-formed URI, and System.UriBuilder has a Query property, but again, expects a pre-formed string for the entire query string.
However, running a quick search for "querystringbuilder" shows a couple of implementations for this out in the web that could serve. One such is this one by Brad Vincent, which gives you a simple fluent interface:
//take an existing string and replace the 'id' value if it exists (which it does)
//output : "?id=5678&user=tony"
strQuery = new QueryString("id=1234&user=tony").Add("id", "5678", true).ToString();
And, though not exactly very elegant, I found a method in RestSharp, as suggested by #sasfrog in the comment to my question. Here's the method.
From RestSharp-master\RestSharp\Http.cs
private string EncodeParameters()
{
var querystring = new StringBuilder();
foreach (var p in Parameters)
{
if (querystring.Length > 0)
querystring.Append("&");
querystring.AppendFormat("{0}={1}", p.Name.UrlEncode(), p.Value.UrlEncode());
}
return querystring.ToString();
}
And again, not very elegant and not really what I would have been expecting, but hey, it gets the job done and saves me some typing.
I was really looking for something like Xi Huan's answer (marked the correct answer) but this works as well.
There's nothing built into the .NET Framework that preserves the order of the query parameters, AFAIK. The following helper does that, skips null values, and converts values to invariant strings. When combined with a KeyValueList, it makes building URIs pretty easy.
public static string ToUrlQuery(IEnumerable<KeyValuePair<string, object>> pairs)
{
var q = new StringBuilder();
foreach (var pair in pairs)
if (pair.Value != null) {
if (q.Length > 0) q.Append('&');
q.Append(pair.Key).Append('=').Append(WebUtility.UrlEncode(Convert.ToString(pair.Value, CultureInfo.InvariantCulture)));
}
return q.ToString();
}

Possible SQL-Injection with user defined functions and entity framework?

My ASP.NET MVC 4 application uses MS-SQL user defined functions to do a fulltext search.
I followed this post and created following code:
in Model Class:
if (suchstring.Trim() != "")
{
//search for each piece separated by space:
var such = suchstring.Split(' ');
int index = 0;
foreach (string teil in such)
{
index++;
if (teil.Trim() != "")
{
res = res.Join(db.udf_FirmenSucheMultiple(string.Format("\"{0}*\"", teil), index), l => l.ID, s => s.KEY, (l, s) => l);
}
}
}
Mapping function:
[EdmFunction("TQCRMEntities", "udf_AnsprechpartnerFirmaSuche")]
public virtual IQueryable<udf_AnsprechpartnerFirmaSuche_Result> udf_AnsprechpartnerFirmaSucheMultiple(string keywords, int index)
{
string param_name = String.Format("k_{0}", index);
var keywordsParameter = keywords != null ?
new ObjectParameter(param_name, keywords) :
new ObjectParameter(param_name, typeof(string));
return ((IObjectContextAdapter)this).
ObjectContext.CreateQuery<udf_AnsprechpartnerFirmaSuche_Result>(
String.Format("[TQCRMEntities].[udf_AnsprechpartnerFirmaSuche](#{0})", param_name), keywordsParameter);
}
SQL User defined function:
create function udf_AnsprechpartnerFirmaSuche
(#keywords nvarchar(4000))
returns table
as
return (select [KEY], [rank] from containstable(AnsprechpartnerFirma, *, #keywords, LANGUAGE 1031))
If I try to search for " I get a 500 Server Error (Syntaxerror from the SQLServer).
My question is if my app is vulnerable to SQL injections and how I should protect against them.
Is it save to just remove * and " from the input?
From http://msdn.microsoft.com/en-us/library/ms189760.aspx
CONTAINSTABLE is used in the FROM clause of a Transact-SQL SELECT statement and is referenced as if it were a regular table name. .It performs a SQL Server full-text search on full-text indexed columns containing character-based data types.
If you read Quassnoi's answer with regard to searching the full-text index for double quotes:
Punctuation is ignored. Therefore, CONTAINS(testing, "computer failure") matches a row with the value, "Where is my computer? Failure to find it would be expensive."
Documentation can be found here.
See his answer for an alternative using the LIKE operator.
To answer your questions:
My question is if my app is vulnerable to SQL injections and how I should protect against them.
You are using parameters properly in your UDF. It should be safe from SQL injection.
Is it save to just remove * and " from the input?
No. Never try to blacklist characters in an attempt to prevent SQL injection. You will almost certainly fail.
See OWASP SQL Injection Prevention for details.

MongoDB C# Driver multiple field query

Using the MongoDB C# driver How can I include more than one field in the query (Im using vb.net)
I know how to do (for name1=value1)
Dim qry = Query.EQ("name1","value1")
How can I modify this query so I can make it find all documents where name1=value1 and name2=value2?
( Similar to )
db.collection.find({"name1":"value1","name2":"value2"})
I wanted to search a text in different fields and Full Text Search doesn't work for me even after wasting so much time. so I tried this.
var filter = Builders<Book>.Filter.Or(
Builders<Book>.Filter.Where(p=>p.Title.ToLower().Contains(queryText.ToLower())),
Builders<Book>.Filter.Where(p => p.Publisher.ToLower().Contains(queryText.ToLower())),
Builders<Book>.Filter.Where(p => p.Description.ToLower().Contains(queryText.ToLower()))
);
List<Book> books = Collection.Find(filter).ToList();
You can use:
var arrayFilter = Builders<BsonDocument>.Filter.Eq("student_id", 10000)
& Builders<BsonDocument>.Filter.Eq("scores.type", "quiz");
Reference: https://www.mongodb.com/blog/post/quick-start-csharp-and-mongodb--update-operation
And doesn't always do what you want (as I found was the case when doing a not operation on top of an and). You can also create a new QueryDocument, as shown below. This is exactly the equivalent of what you were looking for.
Query.Not(new QueryDocument {
{ "Results.Instance", instance },
{ "Results.User", user.Email } }))

Categories

Resources