But let's say I have an integer weight where, for example, an element with weight 10 has a 10 times higher probability of being selected than an element with weight 1.
var ws = db.WorkTypes
.Where(e => e.HumanId != null && e.SeoPriority != 0)
.OrderBy(e => /*????*/ * e.SeoPriority)
.Select(e => new
{
DescriptionText = e.DescriptionText,
HumanId = e.HumanId
})
.Take(take).ToArray();
How do I solve getting random records in LINQ when I need the result to be weighted?
I need something like Random Weighted Choice in T-SQL, but in LINQ, and not limited to getting just one record.
If I didn't have the weighted requirement, I'd use the NEWID approach; can I adapt it in some way?
partial class DataContext
{
[Function(Name = "NEWID", IsComposable = true)]
public Guid Random()
{
throw new NotImplementedException();
}
}
...
var ws = db.WorkTypes
.Where(e => e.HumanId != null && e.SeoPriority != 0)
.OrderBy(e => db.Random())
.Select(e => new
{
DescriptionText = e.DescriptionText,
HumanId = e.HumanId
})
.Take(take).ToArray();
My first idea was the same as Ron Klein's - create a weighted list and select randomly from that.
Here's a LINQ extension method to create the weighted list from the normal list, given a lambda function that knows the weight property of the object.
Don't worry if you don't get all the generics stuff right away... The usage below should make it clearer:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
public class Item
{
public int Weight { get; set; }
public string Name { get; set; }
}
public static class Extensions
{
public static IEnumerable<T> Weighted<T>(this IEnumerable<T> list, Func<T, int> weight)
{
foreach (T t in list)
for (int i = 0; i < weight(t); i++)
yield return t;
}
}
class Program
{
static void Main(string[] args)
{
List<Item> list = new List<Item>();
list.Add(new Item { Name = "one", Weight = 5 });
list.Add(new Item { Name = "two", Weight = 1 });
Random rand = new Random(0);
list = list.Weighted<Item>(x => x.Weight).ToList();
for (int i = 0; i < 20; i++)
{
int index = rand.Next(list.Count());
Console.WriteLine(list.ElementAt(index).Name);
}
Console.ReadLine();
}
}
}
As you can see from the output, the results are both random and weighted as you require.
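One caveat with the expansion approach: the intermediate list grows with the total weight, so very large weights multiply memory usage. As a sketch of an alternative (reusing the `Item` class from the example above; the extension name `PickWeighted` is my own, not from the original answer), you can roll a number in `[0, totalWeight)` and walk the cumulative weights instead:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Weight { get; set; }
    public string Name { get; set; }
}

public static class WeightedPick
{
    // Pick one element with probability proportional to its weight,
    // without materialising a weight-expanded copy of the list.
    public static T PickWeighted<T>(this IList<T> list, Func<T, int> weight, Random rand)
    {
        int total = list.Sum(weight);
        int roll = rand.Next(total);   // uniform in [0, total)
        foreach (T t in list)
        {
            roll -= weight(t);         // each element owns a band of size weight(t)
            if (roll < 0) return t;
        }
        throw new InvalidOperationException("empty list or non-positive weights");
    }
}
```

With the `one`/`two` items above (weights 5 and 1), `list.PickWeighted(x => x.Weight, rand)` returns `one` about five times as often as `two`, giving the same distribution as the expanded list.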
I'm assuming that the weight is an integer. Here's an approach which joins to a dummy table to increase the row-count per the weight; first, let's prove it just in TSQL:
SET NOCOUNT ON
--DROP TABLE [index]
--DROP TABLE seo
CREATE TABLE [index] ([key] int not null) -- names for fun ;-p
CREATE TABLE seo (url varchar(10) not null, [weight] int not null)
INSERT [index] values(1) INSERT [index] values(2)
INSERT [index] values(3) INSERT [index] values(4)
INSERT [index] values(5) INSERT [index] values(6)
INSERT [index] values(7) INSERT [index] values(8)
INSERT [index] values(9) INSERT [index] values(10)
INSERT [seo] VALUES ('abc',1) INSERT [seo] VALUES ('def',2)
INSERT [seo] VALUES ('ghi',1) INSERT [seo] VALUES ('jkl',3)
INSERT [seo] VALUES ('mno',1) INSERT [seo] VALUES ('mno',1)
INSERT [seo] VALUES ('pqr',2)
DECLARE @count int, @url varchar(10)
SET @count = 0
DECLARE @check_rand TABLE (url varchar(10) not null)
-- test it lots of times to check distribution roughly matches weights
WHILE @count < 11000
BEGIN
SET @count = @count + 1
SELECT TOP 1 @url = [seo].[url]
FROM [seo]
INNER JOIN [index] ON [index].[key] <= [seo].[weight]
ORDER BY NEWID()
-- this to check distribution
INSERT @check_rand VALUES (@url)
END
SELECT ISNULL(url, '(total)') AS [url], COUNT(1) AS [hits]
FROM @check_rand
GROUP BY url WITH ROLLUP
ORDER BY url
This outputs something like:
url hits
---------- -----------
(total) 11000
abc 1030
def 1970
ghi 1027
jkl 2972
mno 2014
pqr 1987
Showing that we have the correct overall distribution. Now let's bring that into LINQ-to-SQL; I've added the two tables to a data-context (you will need to create something like the [index] table to do this) - my DBML:
<Table Name="dbo.[index]" Member="indexes">
<Type Name="index">
<Column Name="[key]" Member="key" Type="System.Int32" DbType="Int NOT NULL" CanBeNull="false" />
</Type>
</Table>
<Table Name="dbo.seo" Member="seos">
<Type Name="seo">
<Column Name="url" Type="System.String" DbType="VarChar(10) NOT NULL" CanBeNull="false" />
<Column Name="weight" Type="System.Int32" DbType="Int NOT NULL" CanBeNull="false" />
</Type>
</Table>
Now we'll consume this; in the partial class for the data-context, add a compiled-query (for performance) in addition to the Random method:
partial class MyDataContextDataContext
{
[Function(Name = "NEWID", IsComposable = true)]
public Guid Random()
{
throw new NotImplementedException();
}
public string GetRandomUrl()
{
return randomUrl(this);
}
static readonly Func<MyDataContextDataContext, string>
randomUrl = CompiledQuery.Compile(
(MyDataContextDataContext ctx) =>
(from s in ctx.seos
from i in ctx.indexes
where i.key <= s.weight
orderby ctx.Random()
select s.url).First());
}
This LINQ-to-SQL query is very similar to the key part of the TSQL we wrote; let's test it:
using (var ctx = CreateContext()) {
// show sample query
ctx.Log = Console.Out;
Console.WriteLine(ctx.GetRandomUrl());
ctx.Log = null;
// check distribution
var counts = new Dictionary<string, int>();
for (int i = 0; i < 11000; i++) // obviously a bit slower than inside db
{
if (i % 100 == 1) Console.WriteLine(i); // show progress
string s = ctx.GetRandomUrl();
int count;
if (counts.TryGetValue(s, out count)) {
counts[s] = count + 1;
} else {
counts[s] = 1;
}
}
Console.WriteLine("(total)\t{0}", counts.Sum(p => p.Value));
foreach (var pair in counts.OrderBy(p => p.Key)) {
Console.WriteLine("{0}\t{1}", pair.Key, pair.Value);
}
}
This runs the query once to show the TSQL is suitable, then (like before) 11k times to check the distribution; output (not including the progress updates):
SELECT TOP (1) [t0].[url]
FROM [dbo].[seo] AS [t0], [dbo].[index] AS [t1]
WHERE [t1].[key] <= [t0].[weight]
ORDER BY NEWID()
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.4926
which doesn't look too bad at all - it has both tables and the range condition, and the TOP 1, so it is doing something very similar; data:
(total) 11000
abc 939
def 1893
ghi 1003
jkl 3104
mno 2048
pqr 2013
So again, we've got the right distribution, all from LINQ-to-SQL. Sorted?
Your suggested solution, as it seems from the question, is bound to LINQ/LINQ to SQL.
If I understand correctly, your main goal is to fetch at most X records from the database, that have a weight of more than 0. If the database holds more than X records, you'd like to choose from them using the record's weight, and have a random result.
If all is correct so far, my solution is to clone each record by its weight: if a record's weight is 5, make sure you have it 5 times. This way the random choice takes into account the weight.
However, cloning the records creates, well, duplicates. So you can't just take X records; you should keep taking records until you have X distinct ones.
So far I described a general solution, not related to the implementation.
I think it's harder to implement my solution using only Linq2Sql. If the total records count in the DB is not huge, I suggest reading the entire table and do the cloning and random outside the SQL Server.
If the total count is huge, I suggest you take, say, 100,000 records (or less) chosen at random (via Linq2Sql), and apply the implementation as above. I believe it's random enough.
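A minimal in-memory sketch of the approach described above (the `Record` class, its `Id`/`Weight` properties, and the method name are illustrative assumptions, not the poster's code): each record is cloned according to its weight, and random draws continue until the requested number of distinct records has been collected:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Record
{
    public int Id { get; set; }
    public int Weight { get; set; }
}

public static class WeightedSample
{
    // Clone each record Weight times, then draw at random until 'take'
    // distinct records are collected (or the distinct pool is exhausted).
    public static List<Record> TakeDistinctWeighted(IEnumerable<Record> source, int take, Random rand)
    {
        var expanded = source.SelectMany(r => Enumerable.Repeat(r, r.Weight)).ToList();
        int distinctAvailable = expanded.Select(r => r.Id).Distinct().Count();
        var picked = new Dictionary<int, Record>();
        while (picked.Count < take && picked.Count < distinctAvailable)
        {
            var candidate = expanded[rand.Next(expanded.Count)];
            picked[candidate.Id] = candidate;  // dictionary keys keep the result distinct
        }
        return picked.Values.ToList();
    }
}
```

Heavier records are hit more often, so they tend to enter the distinct result first, which is exactly the weighting effect described.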
Try using the RAND() SQL function - it'll give you a float between 0 and 1.
The downside is that I'm not sure whether it would cause a full table scan on the SQL Server side, i.e. whether the resulting query execution would be optimised so that, once it has the top n records, it ignores the rest of the table.
var rand = new Random();
var ws = db.WorkTypes
.Where(e => e.HumanId != null && e.SeoPriority != 0)
.OrderByDescending(e => rand.Next() * e.SeoPriority)
.Select(e => new
{
DescriptionText = e.DescriptionText,
HumanId = e.HumanId
})
.Take(take).ToArray();
The reason the GUID (NEWID) function was being used in the SQL example you are looking at is simply that SQL Server's RAND function is only evaluated once per statement, so it is useless for randomising a select.
But as you're using LINQ, a quick and dirty solution is to create a Random object and replace your order-by statement.
Random rand = new Random(DateTime.Now.Millisecond);
var ws = db.WorkTypes
.Where(e => e.HumanId != null && e.SeoPriority != 0)
.OrderByDescending(e => rand.Next(10) * e.SeoPriority)
.Select(e => new{ DescriptionText = e.DescriptionText, HumanId = e.HumanId})
.Take(take).ToArray();
The rand.Next(10) assumes your SeoPriority scales from 0 to 10.
It's not 100% accurate, but it's close; adjusting the Next value can tweak it.
Related
I have a database table with records for each user/year combination.
How can I get data from the database using EF and a list of userId/year combinations?
Sample combinations:
UserId Year
1 2015
1 2016
1 2018
12 2016
12 2019
3 2015
91 1999
I only need the records defined in above combinations. Can't wrap my head around how to write this using EF/Linq?
List<UserYearCombination> userYears = GetApprovedYears();
var records = dbcontext.YearResults.Where(?????);
Classes
public class YearResult
{
public int UserId;
public int Year;
public DateTime CreatedOn;
public int StatusId;
public double Production;
public double Area;
public double Fte;
public double Revenue;
public double Diesel;
public double EmissionsCo2;
public double EmissionInTonsN;
public double EmissionInTonsP;
public double EmissionInTonsA;
....
}
public class UserYearCombination
{
public int UserId;
public int Year;
}
This is a notorious problem that I discussed before here. Krishna Muppalla's solution is among the solutions I came up with there. Its disadvantage is that it's not sargable, i.e. it can't benefit from any indexes on the involved database fields.
In the meantime I coined another solution that may be helpful in some circumstances. Basically it groups the input data by one of the fields and then finds and unions database data by grouping key and a Contains query of group elements:
IQueryable<YearResult> items = null;
foreach (var yearUserIds in userYears.GroupBy(t => t.Year, t => t.UserId))
{
var userIds = yearUserIds.ToList();
var grp = dbcontext.YearResults
.Where(x => x.Year == yearUserIds.Key
&& userIds.Contains(x.UserId));
items = items == null ? grp : items.Concat(grp);
}
I use Concat here because Union will waste time making results distinct and in EF6 Concat will generate SQL with chained UNION statements while Union generates nested UNION statements and the maximum nesting level may be hit.
This query may perform well enough when indexes are in place. In theory, the maximum number of UNIONs in a SQL statement is unlimited, but the number of items in an IN clause (which Contains translates to) should not exceed a couple of thousand. That means that the content of your data will determine which grouping field performs better, Year or UserId. The challenge is to minimize the number of UNIONs while keeping the number of items in all IN clauses below approx. 5000.
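To stay under that IN-clause limit with very large key lists, one option is to split the keys into fixed-size batches and issue one Contains query per batch; here is a hedged sketch of such a batching helper (the `InBatchesOf` name is my own, not part of EF or the answer above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchingExtensions
{
    // Split a key list into batches of at most 'size' elements, so each
    // batch can feed one Contains(...) call (one IN clause) per query.
    public static IEnumerable<List<T>> InBatchesOf<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);  // start a fresh batch; never reuse the yielded one
            }
        }
        if (batch.Count > 0) yield return batch;
    }
}
```

Each batch then becomes one `Where(x => batch.Contains(x.UserId))` query, concatenated with `Concat` exactly as in the loop above, keeping every generated IN clause below the chosen size.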
you can try this
//add the possible filters to LIST
var searchIds = new List<string> { "1-2015", "1-2016", "2-2018" };
//use the list to check in Where clause
var result = (from x in YearResults
where searchIds.Contains(x.UserId.ToString()+'-'+x.Year.ToString())
select new UserYearCombination
{
UserId = x.UserId,
Year = x.Year
}).ToList();
Method 2
var d = YearResults
.Where(x=>searchIds.Contains(x.UserId.ToString() + '-' + x.Year.ToString()))
.Select(x => new UserYearCombination
{
UserId = x.UserId,
Year = x.Year
}).ToList();
I'm running into a problem when updating some data via EF.
Let's say I have a table in my database:
Table T (ID int, Rank int, Name varchar)
I have a unique key constraint on Rank.
For example, I have this data in the table:
My C# object is something like this: Person (name, rank), so on the front end, a user wants to switch the rank of Joe and Mark.
When I make the update via EF, I get an error because of the unique key.
I suspect it is because dbContext.SaveChanges uses updates in this style:
UPDATE Table SET rank = 5 where Name = Joe
UPDATE Table SET rank = 1 where Name = Mark
With a SQL query I can perform this update by passing in a user-defined table (rank, name) from the C# side and then doing:
update T
set T.Rank = Updated.Rank
from Table T
inner join #UserDefinedTable Updated on T.Name = Updated.Name
and this does not trigger the unique key constraint
However, I want to use EF for this operation; what do I do?
I've thought of these other solutions so far:
Delete old records, add "new" records from updated objects via EF
Dropping the unique constraint on database and writing a C# function to do the job of the unique constraint
Just use a SQL query like the example above instead of EF
Note: the table structure and data I used above is just an example
Any ideas?
Idea - you could make it a two-step operation (wrapped in a single transaction):
1) set the values for all entities that have to be updated to their negatives (Joe, -1; Mark, -5)
2) set the correct values (Joe, 5; Mark, 1)
SQL Server's equivalent:
SELECT 1 AS ID, 1 AS [rank], 'Joe' AS name INTO t
UNION SELECT 2,2,'Ann'
UNION SELECT 3,5,'Mark'
UNION SELECT 4,7,'Sam';
CREATE UNIQUE INDEX uq ON t([rank]);
SELECT * FROM t;
/* Approach 1
UPDATE t SET [rank] = 5 where Name = 'Joe';
UPDATE t SET [rank] = 1 where Name = 'Mark';
Cannot insert duplicate key row in object 'dbo.t' with unique index 'uq'.
The duplicate key value is (5). Msg 2601 Level 14 State 1 Line 2
Cannot insert duplicate key row in object 'dbo.t' with unique index 'uq'.
The duplicate key value is (1).
*/
BEGIN TRAN
-- step 1
UPDATE t SET [rank] = -[rank] where Name = 'Joe';
UPDATE t SET [rank] = -[rank] where Name = 'Mark';
-- step 2
UPDATE t SET [rank] = 5 where Name = 'Joe'
UPDATE t SET [rank] = 1 where Name = 'Mark';
COMMIT;
db<>fiddle demo
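The same two-step idea can be sketched in C#. Since a real DbContext needs a database, the snippet below fakes the unique-index check inside `SaveChanges` (the `Person` and `FakeContext` types are stand-ins for illustration, not EF API; in real EF you would additionally wrap the two saves in one transaction):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public int Rank { get; set; }
}

// Stand-in for the real DbContext: SaveChanges rejects duplicate ranks
// the way SQL Server's unique index would.
public class FakeContext
{
    public List<Person> People { get; } = new List<Person>();

    public void SaveChanges()
    {
        if (People.GroupBy(p => p.Rank).Any(g => g.Count() > 1))
            throw new InvalidOperationException("unique index violation on Rank");
    }
}

public static class RankSwap
{
    // Step 1 parks both rows on negative ranks so the unique index never
    // sees a duplicate; step 2 assigns the final swapped values.
    public static void SwapRanks(FakeContext db, string name1, string name2)
    {
        var a = db.People.Single(p => p.Name == name1);
        var b = db.People.Single(p => p.Name == name2);

        a.Rank = -a.Rank;
        b.Rank = -b.Rank;
        db.SaveChanges();                       // step 1: no collision possible

        (a.Rank, b.Rank) = (-b.Rank, -a.Rank);  // step 2: swapped values
        db.SaveChanges();
    }
}
```

The negative parking values assume ranks are positive, so the intermediate state can never collide with a live row.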
You have focused a lot on the SQL side of this, but you can do the same thing in pure EF.
It will help next time to provide your EF code so we can give you a more specific answer.
NOTE: do not use this logic in EF in scenarios where large sets of data exist, as the ReOrder process loads all records into memory. It is, however, useful for managing ordinality in child or sub-lists that are scoped by an additional filter clause (so not for a whole table!).
The isolated ReOrder process is a good candidate on its own to move into the DB as a stored procedure if you need to do unique ranking logic across the entire table.
There are two main variations here (for Unique Values):
Rank must always be sequential/contiguous
This simplifies insert and replace logic, but you likely have to manage add, insert, swap and delete scenarios in the code.
Code to move items up and down in rank are very easy to implement
MUST manage deletes to re-compute the rank for all items
Rank can have gaps (not all values are contiguous)
This sounds like it should be easier, but to evaluate moving up and down in the list you have to take the gaps into account.
I won't post the code for this variation, but be aware that it is usually more complicated to maintain.
On the flip side, you don't need to worry about actively managing deletes.
I use the following routine when the ordinal needs to be managed.
NOTE: This routine does not save the changes, it simply loads all the records that might be affected into memory so that we can correctly process the new ranking.
public static void ReOrderTableRecords(Context db)
{
// By convention do not allow the DB to do the ordering. This type of query will load missing DB values into the current dbContext,
// but will not replace the objects that are already loaded.
// The following query would be ordered by the original DB values:
// db.Table.OrderBy(x => x.Rank).ToList()
// Instead we want to order by the current modified values in the db context. This is a very important distinction, which is why I have left this comment in place.
// So, load from the DB into memory and then order:
// db.Table[.Where(...optional filter by parentId...)].ToList().OrderBy(x => x.Rank)
// NOTE: in this implementation we must also ensure that we don't include the items that have been flagged for deletion.
var currentValues = db.Table.ToList()
.Where(x => db.Entry(x).State != EntityState.Deleted)
.OrderBy(x => x.Rank);
int rank = 1;
foreach (var item in currentValues)
item.Rank = rank++;
}
Let's say you can reduce your code to a function that inserts a new item with a specific Rank into the list, or that swaps the rank of two items in the list:
public static Table InsertItem(Context db, Table item, int? Rank = 1)
{
// Rank is optional, allows override of the item.Rank
if (Rank.HasValue)
item.Rank = Rank;
// Default to first item in the list as 1
if (item.Rank <= 0)
item.Rank = 1;
// re-order first, this will ensure no gaps.
// NOTE: the new item has not been added to the collection yet
ReOrderTableRecords(db);
var items = db.Table.ToList()
.Where(x => db.Entry(x).State != EntityState.Deleted)
.Where(x => x.Rank >= item.Rank);
if (items.Any())
{
foreach (var i in items)
i.Rank = i.Rank + 1;
}
else if (item.Rank > 1)
{
// special case
// either ReOrderTableRecords(db) again... after adding the item to the table
item.Rank = db.Table.ToList()
.Where(x => db.Entry(x).State != EntityState.Deleted)
.Max(x => x.Rank) + 1;
}
db.Table.Add(item);
db.SaveChanges();
return item;
}
/// <summary> call this when Rank value is changed on a single row </summary>
public static void UpdateRank(Context db, Table item)
{
var rank = item.Rank;
item.Rank = -1; // move this item out of the list so it doesn't affect the ranking on reOrder
ReOrderTableRecords(db); // ensure no gaps
// use insert logic
var items = db.Table.ToList()
.Where(x => db.Entry(x).State != EntityState.Deleted)
.Where(x => x.Rank >= rank);
if (items.Any())
{
foreach (var i in items)
i.Rank = i.Rank + 1;
}
item.Rank = rank;
db.SaveChanges();
}
public static void SwapItemsByIds(Context db, int item1Id, int item2Id)
{
var item1 = db.Table.Single(x => x.Id == item1Id);
var item2 = db.Table.Single(x => x.Id == item2Id);
var rank = item1.Rank;
item1.Rank = item2.Rank;
item2.Rank = rank;
db.SaveChanges();
}
public static void MoveUpById(Context db, int item1Id)
{
var item1 = db.Table.Single(x => x.Id == item1Id);
var rank = item1.Rank - 1;
if (rank > 0) // Rank 1 is the highest
{
var item2 = db.Table.Single(x => x.Rank == rank);
item2.Rank = item1.Rank;
item1.Rank = rank;
db.SaveChanges();
}
}
public static void MoveDownById(Context db, int item1Id)
{
var item1 = db.Table.Single(x => x.Id == item1Id);
var rank = item1.Rank + 1;
var item2 = db.Table.SingleOrDefault(x => x.Rank == rank);
if (item2 != null) // item 1 is already the lowest rank
{
item2.Rank = item1.Rank;
item1.Rank = rank;
db.SaveChanges();
}
}
To ensure that gaps are not introduced, you should call ReOrder after removing items from the table but before calling SaveChanges().
Alternatively, call ReOrder before each of Swap/MoveUp/MoveDown, similar to insert.
Keep in mind that it is far simpler to allow duplicate Rank values, especially for large lists of data, but your business requirements will determine whether this is a viable solution.
I am just learning LINQ and I have come across an issue I'm not sure how to solve in LINQ.
string outlets = "1,3,4,5";
string[] outletsInaStringArray = outlets.Split(',');
List<string> numbersAsAList = outletsInaStringArray.ToList();
I have a field in my database which holds a number. I only want to select the lines WHERE the number in the database is IN the line list of numbers "1,3,4,5" (these numbers are just examples).
Thanks in advance
I have looked at Tim's and James's answers and also at the link that James posted. I'm still a bit confused, sorry. Below is my actual code; it compiles but does not work.
string outlets = "1,3,4,5";
string[] outletsNeeded = outlets.Split(',');
List<string> outletsNeededList = outletsNeeded.ToList();
DashboardEntities1 db = new DashboardEntities1();
var deptSalesQuery = (
from d in db.DashboardFigures
where (d.TypeOfinformation == "DEPTSALES") && (outletsNeeded.ToString().Contains(d.OutletNo.ToString()))
select new DeptSales
{
Dn = (int)d.Number,
Dnm = "Mens",
On = d.OutletNo,
Qs = (double)d.Value_4,
Se = (double)d.Value_2,
Si = (double)d.Value_3
}
);
In the DASHBOARDFIGURES table in SQL I have 2 records where the outlet number = 1, so the query should have come up with two records.
Sorry if this is a simple thing; it's just new to me and it's frustrating.
You can use Contains as tagged:
var query = db.Table
.Where(x => outletsInaStringArray.Contains(x.Number) && x.information == "SALES");
that was method syntax, if you prefer query syntax:
var query = from figure in db.Figures
where outletsInaStringArray.Contains(figure.number)
&& figure.information == "SALES"
select figure;
But the column number is an int while the List<string> stores strings, and your LINQ provider may not support .Contains(figure.number.ToString()). In that case, convert the strings to int first:
List<int> outletsNeededList = outletsNeeded.Select(int.Parse).ToList();
The answer that Tim provided is one method; LINQ query syntax and lambda syntax are interchangeable. Have a look at the following posting as well: Link
var result = from x in db.Table.ToList()
where outletsInaStringArray.Contains(x.Number)
select x;
Also have a look the following as it offers a very similar solution to the one you are looking for:
Link
As far as I understand, you want to fetch data in a similar way to the IN clause in SQL:
SELECT <Field_List>
FROM Table
WHERE IntegerField IN (1,2,4,5)
But I'm wondering why you want to do it that way, when you can join the data and get only the matches. What's worse is that you're trying to mix different data types and pass comma-delimited text as a set of integers (I may be wrong):
SELECT <Field_List>
FROM Table
WHERE IntegerField IN ("1,2,4,5")
The above query won't execute, because the set of integers is "packed" into a comma-delimited string. To be able to execute that query, a conversion between data types must be done: the numbers in the string have to be converted to a set of integers (using a user-defined split function or a Common Table Expression):
;WITH CTE AS
(
--here convertion occurs
)
SELECT t2.<Field_List>
FROM CTE As t1 INNER JOIN TableName AS t2 ON t1.MyNumber = t2.IntegerField
LINQ + any programming language is more flexible. You can build a list of integers (List<int>) to build the query.
See simple example:
void Main()
{
List<MyData> data = new List<MyData>{
new MyData(1,10),
new MyData(2, 11),
new MyData(5, 12),
new MyData(8, 13),
new MyData(12, 14)
};
//you're using comma delimited string
//string searchedNumbers = "1,3,4,5";
//var qry = from n in data
// join s in searchedNumbers.Split(',').Select(x=>int.Parse(x)) on n.ID equals s
// select n;
//qry.Dump();
List<int> searchedNumbers = new List<int>{1,2,4,5};
var qry = from n in data
join s in searchedNumbers on n.ID equals s
select n;
qry.Dump();
}
// Define other methods and classes here
class MyData
{
private int id = 0;
private int weight = 0;
public MyData(int _id, int _weight)
{
id = _id;
weight = _weight;
}
public int ID
{
get{return id;}
set {id = value;}
}
public int Weight
{
get{return weight;}
set {weight = value;}
}
}
Result:
ID Weight
1 10
5 12
Cheers
Maciej
Thank you all, I've now got it to work using all your suggestions.
The final code that works is as follows:
DeptSales myDeptSales = new DeptSales(); // Single department
List<DeptSales> myDeptSalesList = new List<DeptSales>(); // List of Departments
DashboardEntities1 db = new DashboardEntities1();
var deptSalesQuery = from d in db.DashboardFigures
join s in outlets.Split(',').Select(x => int.Parse(x)) on d.OutletNo equals s
where (d.TypeOfinformation == "DEPTSALES")
select new DeptSales
{
Dn = (int)d.Number,
Dnm = "Mens",
On = d.OutletNo,
Qs = (double)d.Value_4,
Se = (double)d.Value_2,
Si = (double)d.Value_3
};
Thanks once again.
I need to write a query in C# to get the position of a specific id in a table ordered by a date.
My table structure
IdAirport bigint
IdUser int
AddedDate datetime
Data:
2 5126 2014-10-23 14:54:32.677
2 5127 2014-10-23 14:55:32.677
1 5128 2014-10-23 14:56:32.677
2 5129 2014-10-23 14:57:32.677
For example, I need to know in which position IdUser=5129 is, within IdAirport=2, ordered by AddedDate ascending. (The result in this case would be 3.)
Edit:
I'm using IQueryables like this:
AirPort airport = (from a in context.Airport select a).FirstOrDefault();
Thanks for your time!
Using LINQ: If you want to find the index of an element within an arbitrary order you can use OrderBy(), TakeWhile() and Count().
db.records.Where(x => x.IdAirport == airportId)
.OrderBy(x => x.AddedDate)
.TakeWhile(x => x.IdUser != userId)
.Count() + 1;
Here's a quick one :
public class test
{
public int IdAirport;
public int IdUser;
public DateTime AddedDate;
public test(int IdAirport, int IdUser, DateTime AddedDate)
{
this.IdAirport = IdAirport;
this.IdUser = IdUser;
this.AddedDate = AddedDate;
}
}
void Main()
{
List<test> tests = new List<test>()
{
new test(2, 5126, DateTime.Parse("2014-10-23 14:54:32.677")),
new test(2, 5127, DateTime.Parse("2014-10-23 14:55:32.677")),
new test(1 , 5128 , DateTime.Parse("2014-10-23 14:56:32.677")),
new test(2 , 5129 , DateTime.Parse("2014-10-23 14:57:32.677"))
};
var r = tests
.Where(t => t.IdAirport == 2)
.OrderBy(t => t.AddedDate)
.TakeWhile(t => t.IdUser != 5129)
.Count() + 1;
Console.WriteLine(r);
}
It keeps the exact order of your own list. You can modify Where/OrderBy as you wish; the interesting part is the TakeWhile/Count combination.
Should work fine, but probably not very efficient for long lists.
Edit: this seems to be the same as Ian Mercer's answer. But the "+ 1" in my sample is needed, since TakeWhile/Count returns the number of skipped items, not the position of the matching one. Or perhaps I misunderstood the issue.
This should do what you need:
dataTable.Rows.IndexOf(
dataTable.AsEnumerable().OrderBy(
x => x["AddedDateColumn"]).First(
x => (int)(x["IdUserColumn"]) == 5129));
I have an issue that I'll try to explain. My idea is to create a script in SSIS in C# and with it generate an ID for each unique combination of IDs in a table.
I have a SQL server table which consists of two columns. The columns are IDs (I can make them numeric but in raw format they are alphanumeric strings). I want to generate a new ID out of the set of IDs in column 2 that are connected to column 1.
Col1 Col2 Generated ID
1 1
1 2 => 1
1 3
-----------
2 1 => 2
2 3
-----------
3 3
3 1 => 1
3 2
I'm thinking of a hash function, maybe? But how do I get the same ID out of the sets for 1 and 3, independent of order? Do I need to sort them first?
I needed "10 reputation" to post an image so I hope my illustration explains the issue...
As further examples, to check my understanding of your problem: would you expect the following sets of values in Col2 to all return something like '123' as the "Generated ID" value, like so?
Col2 => Generated ID
1,2,3 => 123
1,3,2 => 123
2,1,3 => 123
2,3,1 => 123
3,1,2 => 123
3,2,1 => 123
etc
If so, then based on the above assumptions and to answer your questions:
Yes, a Hash function could do it
How you get the same "Generated ID" for sets 1 and 3 (in your example) will depend on your GetHashCode() override/implementation
Yes, you will probably need to sort, but again, that depends on your implementation.
Since you refer to using a C# script in SSIS, a possible C# implementation might be to implement a (very!) simple Hash class which given a set of Col2 values (for each data set), simply:
sorts the values for Col2 to get them in the 'right' order and
returns some integer representation of the sorted set of data as the hash (e.g., concatenate the ints as strings and then convert back to int)
The hash class could be instantiated in your (base?) class's GetHashCode() function, which is passed the Col2 values and performs steps (1) and (2) above, returning the hash code as needed.
Something like this might work for you (assuming you have access to Generics in the .NET version you're using):
namespace SimpleHashNamespace
{
public class SimpleHash
{
private readonly List<int> _data;
public SimpleHash(List<int> col2)
{
_data = col2;
}
public int GetMyHash()
{
_data.Sort();
string stringHash = string.Join("", _data);
return int.Parse(stringHash); // warning 1: assumes you always have a convertible value
}
}
public class MyDataSet
{
private readonly List<int> _dataSetValues;
public MyDataSet(List<int> dataSetValues)
{
_dataSetValues = dataSetValues;
}
public override int GetHashCode()
{
SimpleHash simpleHash = new SimpleHash(_dataSetValues);
return simpleHash.GetMyHash(); // warning 2: assumes the computed hash can fit into the int datatype given that GetHashCode() has to return int
}
}
public partial class Program
{
private static void Main(string[] args)
{
// how you split up Col2 to get your list-of-ints dataset is up to you
var myDataSet1 = new MyDataSet(new List<int>(new int[] { 1,2,3 }));
Console.WriteLine(myDataSet1.GetHashCode());
var myDataSet2 = new MyDataSet(new List<int>(new int[] { 2,1,3 }));
Console.WriteLine(myDataSet2.GetHashCode());
var myDataSet3 = new MyDataSet(new List<int>(new int[] { 3,2,1 }));
Console.WriteLine(myDataSet3.GetHashCode());
Console.ReadLine();
}
}
}
Obviously this is a trivial implementation however given the simplicity of the problem as it has been specified, perhaps this will suffice?
CREATE TABLE T (Col1 INT, Col2 INT);
GO
INSERT INTO [dbo].[T]([Col1],[Col2])
VALUES (1,1), (1,2), (1,3), (2,1), (2,3), (3,3), (3,1), (3,2), (2,3),(2,1);
GO
SELECT
T1.Col1,
(
SELECT Convert (VARCHAR,Col2) + ','
FROM T T2
WHERE T2.Col1 = T1.Col1
ORDER BY Col2
FOR XML PATH('')
) AS Col2_Cat
INTO X
FROM T T1
GROUP BY Col1 ;
SELECT T.Col1, T.Col2, Y.Col3
FROM T
INNER JOIN
(
SELECT X1.Col1, Min (X2.Col1) AS Col3 FROM X X1
----inner join X X2 on HASHBYTES ('SHA1',X1.Col2_Cat) = HASHBYTES('SHA1',X2.Col2_Cat)
inner join X X2 on X1.Col2_Cat = X2.Col2_Cat
GROUP BY X1.Col1
) AS Y
ON T.Col1 = Y.Col1;
DROP TABLE X
DROP TABLE T