I am a beginner in U-SQL/C# and I am stuck on a windowing/aggregation problem.
My data looks like this:
Name Date OrderNo Type Balance
one 2018-06-25T04:55:44.0020987Z 1 Drink 15
one 2018-06-25T04:57:44.0020987Z 1 Drink 70
one 2018-06-25T04:59:44.0020987Z 1 Drink 33
one 2018-06-25T04:59:49.0020987Z 1 Drink 25
two 2018-06-25T04:55:44.0020987Z 2 Drink 22
two 2018-06-25T04:57:44.0020987Z 2 Drink 81
two 2018-06-25T04:58:44.0020987Z 2 Drink 33
two 2018-06-25T04:59:44.0020987Z 2 Drink 45
In U-SQL I add a unique id based on the combination of name, orderno and type and, for sorting purposes, a second id that also includes the date.
@files =
    EXTRACT
        name string,
        date DateTime,
        type string,
        orderno int,
        balance int
    FROM
        @InputFile
    USING new JsonExtractor();
@files2 =
    SELECT *,
           DENSE_RANK() OVER(ORDER BY name,type,orderno,date) AS group_id,
           DENSE_RANK() OVER(ORDER BY name,type,orderno) AS id
    FROM @files;
My Data now looks like this:
Name Date OrderNo Type Balance group_id id
one 2018-06-25T04:55:44.0020987Z 1 Drink 15 1 1
one 2018-06-25T04:57:44.0020987Z 1 Drink 70 2 1
one 2018-06-25T04:59:44.0020987Z 1 Drink 33 3 1
one 2018-06-25T04:59:49.0020987Z 1 Drink 25 4 1
two 2018-06-25T04:55:44.0020987Z 2 Drink 22 5 2
two 2018-06-25T04:57:44.0020987Z 2 Drink 81 6 2
two 2018-06-25T04:58:44.0020987Z 2 Drink 33 7 2
two 2018-06-25T04:59:44.0020987Z 2 Drink 45 8 2
(I show only 4 records per group here, but the real groups contain many more.)
I am stuck at determining the difference between successive rows in the balance column in each group.
Expected Output for Part 1:
Name Date OrderNo Type Balance group_id id increase
one 2018-06-25T04:55:44.0020987Z 1 Drink 15 1 1 0
one 2018-06-25T04:57:44.0020987Z 1 Drink 70 2 1 55
one 2018-06-25T04:59:44.0020987Z 1 Drink 33 3 1 -37
one 2018-06-25T04:59:49.0020987Z 1 Drink 25 4 1 -8
two 2018-06-25T04:55:44.0020987Z 2 Drink 22 5 2 0
two 2018-06-25T04:57:44.0020987Z 2 Drink 81 6 2 59
two 2018-06-25T04:58:44.0020987Z 2 Drink 33 7 2 -48
two 2018-06-25T04:59:44.0020987Z 2 Drink 45 8 2 12
For every new group (defined by id) the increase should start from zero.
I searched Stack Overflow and found the LAG function from PostgreSQL, but I could not find a C# equivalent. Is it applicable in this case?
Any help is appreciated. Further clarification will be provided if required.
Update: When I use CASE WHEN, my output looks like this:
CURRENT OUTPUT            DESIRED OUTPUT
id  Balance  Increase     id  Balance  Increase
1   15       0            1   15       0
1   70       55           1   70       55
1   33       -37          1   33       -37
1   25       -8           1   25       -8
2   22       "-3"         2   22       "0"
2   81       59           2   81       59
2   33       -48          2   33       -48
2   45       12           2   45       12
Look at the row with the quoted values. The increase column must start at 0 for each id.
Update: I was able to solve the first part of my question. See my answer below.
The second part that I had posted earlier was incorrectly posted. I have removed that.
You can use the LAG window function in a subquery to get the previous Balance, then filter on the result in the WHERE clause.
SELECT * FROM (
    SELECT *,
           DENSE_RANK() OVER(ORDER BY name,type,orderno,date) AS group_id,
           DENSE_RANK() OVER(ORDER BY name,type,orderno) AS id,
           (CASE WHEN LAG(Balance) OVER(ORDER BY name,type,orderno) IS NULL THEN 0
                 ELSE Balance - LAG(Balance) OVER(ORDER BY name,type,orderno)
            END) as increase
    FROM @files
) t1
WHERE increase > 50
The query that finally worked for me was this:
@files =
    EXTRACT
        name string,
        date DateTime,
        type string,
        orderno int,
        balance int
    FROM
        @InputFile
    USING new JsonExtractor();
@files2 =
    SELECT *,
           DENSE_RANK() OVER(ORDER BY name,type,orderno) AS group_id
    FROM @files;
@files3 =
    SELECT *,
           DENSE_RANK() OVER(PARTITION BY group_id ORDER BY date) AS group_order
    FROM @files2;
@files4 =
    SELECT *,
           (CASE WHEN group_order == 1 THEN 0
                 ELSE balance - LAG(balance) OVER(ORDER BY name,type,orderno)
            END) AS increase
    FROM @files3;
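The idea behind the query above (previous balance within the same name/type/orderno group, with 0 for the first row of each group) can be sketched outside U-SQL as well. The Python below is only an illustration under stated assumptions: the function name add_increase and the dictionary layout are hypothetical, and the dates are shortened versions of the sample timestamps.

```python
from itertools import groupby
from operator import itemgetter


def add_increase(rows):
    """Add an 'increase' field: balance minus the previous balance in the same group.

    A group is defined by (name, type, orderno); rows are ordered by date
    inside each group, and the first row of every group gets 0.
    """
    group_key = itemgetter("name", "type", "orderno")
    # Sort by group first, then by date, so groupby sees contiguous groups.
    ordered = sorted(rows, key=lambda r: (group_key(r), r["date"]))
    out = []
    for _, group in groupby(ordered, key=group_key):
        previous = None  # plays the role of LAG(balance) returning NULL
        for row in group:
            increase = 0 if previous is None else row["balance"] - previous
            out.append({**row, "increase": increase})
            previous = row["balance"]
    return out


# Sample data taken from the question (timestamps shortened).
rows = [
    {"name": "one", "date": "2018-06-25T04:55:44", "orderno": 1, "type": "Drink", "balance": 15},
    {"name": "one", "date": "2018-06-25T04:57:44", "orderno": 1, "type": "Drink", "balance": 70},
    {"name": "one", "date": "2018-06-25T04:59:44", "orderno": 1, "type": "Drink", "balance": 33},
    {"name": "one", "date": "2018-06-25T04:59:49", "orderno": 1, "type": "Drink", "balance": 25},
    {"name": "two", "date": "2018-06-25T04:55:44", "orderno": 2, "type": "Drink", "balance": 22},
    {"name": "two", "date": "2018-06-25T04:57:44", "orderno": 2, "type": "Drink", "balance": 81},
    {"name": "two", "date": "2018-06-25T04:58:44", "orderno": 2, "type": "Drink", "balance": 33},
    {"name": "two", "date": "2018-06-25T04:59:44", "orderno": 2, "type": "Drink", "balance": 45},
]

result = add_increase(rows)
for row in result:
    print(row["name"], row["balance"], row["increase"])
```

Resetting `previous` at the start of each group is what makes the first row of every id start at 0 instead of carrying over the last balance of the preceding group.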
I have read many questions on Stack Overflow related to my problem, but I don't think they quite address it. Basically, I downloaded an XML dataset with lots of data and inserted that data into my MS Access database. What I want to do is convert the data so that some specific rows become columns.
I could probably do this manually in code before inserting the data into the database, but that would require lots of time and code changes, so I'm wondering if it's possible to do this with MS Access.
Here's how my table basically looks, and how I want to convert it.
The index is not relevant in my case.
[Table1]
[Index]  [Name]     [Data]  [NameID]
1        AA         14      1
2        BB(date)   42      1
3        CC         64      1
4        DD         61      1
5        AA         15      2
6        BB(date)   35      2
7        CC         67      2
8        DD         63      2
9        AA         9       3
10       CC         10      3
11       AA         19      2
12       BB(date)   20      2
13       CC         21      2
14       DD         12      2
15       BB(date)   83      4
16       CC         1       4
17       DD         87      4

=> [Table1_converted]
[NameID]  [AA]    [BB]    [CC]    [DD]
1         14      date1   64      61
2         15+19   date2   67+21   63+12
3         9               10
4                 date4   1       87
I forgot to mention that the values in the [Name] column are not really AA, BB, CC.
They are more complex than that. AA is actually something like "01 - NameAA", without the quotation marks.
I also forgot one important element of my question: if a [Name] (e.g. AA) appears more than once with the same [NameID], the [Data] values should be summed. I have edited the tables; in the converted table, entries such as 15+19 or 35+20 only illustrate which values are summed.
One more edit, hopefully the last: the [Name] BB has a Datetime value in [Data].
The NameID can be anything; it does not matter. So I need a query that makes an exception for [Name] BB when summing, so that it is not summed the way the [Data] of every other [Name] is. Where a date appears multiple times for the same [Name] and [NameID], it is always the same date.
To accomplish this in Access, all you need to do is
TRANSFORM Sum([Data]) AS SumOfData
SELECT [NameID]
FROM [Table1]
GROUP BY [NameID]
PIVOT [Name]
Edit (re: revised question):
To handle some [Name]s differently, we need to assemble the results (Sum()s, etc.) first and then crosstab those results.
For test data in [Table1]:
Index Name Data NameID
----- ---- ---------- ------
1 AA 14 1
2 BB 2013-12-01 1
3 CC 64 1
4 DD 61 1
5 AA 15 2
6 BB 2013-12-02 2
7 CC 67 2
8 DD 63 2
9 AA 9 3
10 CC 10 3
11 AA 19 2
12 BB 2013-12-02 2
13 CC 21 2
14 DD 12 2
15 BB 2013-12-04 4
16 CC 1 4
17 DD 87 4
the query
TRANSFORM First(columnData) AS whatever
SELECT [NameID]
FROM
(
SELECT [NameID], [Name], Sum([Data]) AS columnData
FROM [Table1]
WHERE [Name] <> 'BB'
GROUP BY [NameID], [Name]
UNION ALL
SELECT DISTINCT [NameID], [Name], [Data]
FROM [Table1]
WHERE [Name] = 'BB'
)
GROUP BY [NameID]
PIVOT [Name]
produces
NameID AA BB CC DD
------ -- ---------- -- --
1 14 2013-12-01 64 61
2 34 2013-12-02 88 75
3 9 10
4 2013-12-04 1 87
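The two-step logic of that query (sum [Data] per [NameID] and [Name], except for BB where the single date value is kept, then pivot [Name] into columns) can be sketched in Python. This is only an illustration, not Access code; the function name crosstab and the keep_first parameter are hypothetical, and the rows are the test data shown above.

```python
from collections import defaultdict


def crosstab(rows, keep_first=("BB",)):
    """Pivot (name_id, name, data) rows into {name_id: {name: value}}.

    Values are summed per (name_id, name), except for names listed in
    keep_first, where the first value seen is kept (the date column).
    """
    table = defaultdict(dict)
    for name_id, name, data in rows:
        cells = table[name_id]
        if name in keep_first:
            cells.setdefault(name, data)  # keep the date, never add it up
        else:
            cells[name] = cells.get(name, 0) + data
    return dict(table)


# Test data from the answer: (NameID, Name, Data)
rows = [
    (1, "AA", 14), (1, "BB", "2013-12-01"), (1, "CC", 64), (1, "DD", 61),
    (2, "AA", 15), (2, "BB", "2013-12-02"), (2, "CC", 67), (2, "DD", 63),
    (3, "AA", 9), (3, "CC", 10),
    (2, "AA", 19), (2, "BB", "2013-12-02"), (2, "CC", 21), (2, "DD", 12),
    (4, "BB", "2013-12-04"), (4, "CC", 1), (4, "DD", 87),
]

pivoted = crosstab(rows)
for name_id in sorted(pivoted):
    cells = pivoted[name_id]
    print(name_id, [cells.get(col, "") for col in ("AA", "BB", "CC", "DD")])
```

Keeping the first BB value relies on the statement in the question that repeated dates for the same [Name] and [NameID] are always identical.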
Try this SQL query; it may be the answer you need:
SELECT NameID , [AA] as AA,[BB] as BB,[CC] as CC,[DD] as DD
FROM
(
SELECT Name,Data,NameID FROM Table1
)PivotData
PIVOT
(
max(Data) for Name in ([AA],[BB],[CC],[DD])
) AS Pivoting
I think you need to do this:
1) Copy your Table1 as it is into SQL Server
2) Then run the following query
DECLARE @cols AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(Name)
                      from [Table1]
                      FOR XML PATH(''), TYPE
                     ).value('.', 'NVARCHAR(MAX)')
                     ,1,1,'')
set @query = 'SELECT NameID,' + @cols + '
              from
              (
                  select NameID, Name, Data
                  from Table1
              ) T
              pivot
              (
                  max(Data)
                  for Name in (' + @cols + ')
              ) p '
execute sp_executesql @query;
DECLARE @Table1 TABLE ([Index] INT,[Name] CHAR(2),[Data] INT,[NameID] INT)
INSERT INTO @Table1
VALUES
(1,'AA',14,1),
(2,'BB',42,1),
(3,'CC',64,1),
(4,'DD',61,1),
(5,'AA',15,2),
(6,'BB',35,2),
(7,'CC',67,2),
(8,'DD',63,2),
(9,'AA',9,3),
(10,'CC',10,3),
(11,'BB',83,4),
(12,'CC',1,4),
(13,'DD',87,4)
SELECT [NameID] , ISNULL([AA], '') AS [AA], ISNULL([BB], '') AS [BB]
     , ISNULL([CC], '') AS [CC], ISNULL([DD], '') AS [DD]
FROM
(
    SELECT NAME, DATA, NAMEID
    FROM @Table1
)q
PIVOT
(
    SUM(DATA)
    FOR NAME
    IN ([AA], [BB], [CC], [DD])
)P
Result Set
NameID AA BB CC DD
1 14 42 64 61
2 15 35 67 63
3 9 10
4 83 1 87
Possible Duplicate:
.Union() changes the order of the items?
Following up on that question (which has no scenario or example; please remove it, I can't), this is my problem:
I've noticed that if I do a Union and then an Intersect between Attachment collections, the order of the items can change.
This is my code:
GalleryDataClassesDataContext db = new GalleryDataClassesDataContext();
List<Attachment> Allegati = db.ExecuteQuery<Attachment>("EXEC SelectAttachmentsByKey @Key={0}, @IDCliente={1}", new object[] { "", "47" }).ToList();
List<Attachment> AllegatiPerCategorie = new List<Attachment>();
AllegatiPerCategorie = AllegatiPerCategorie.Union(db.AttachmentAttachmentCategories.Where(aac => aac.IDAttachmentCategory == 72).OrderBy(p => p.Ordine == null ? 1 : 0).ThenBy(p => p.Ordine).Select(aac => aac.Attachment)).ToList();
Allegati = Allegati.Intersect(AllegatiPerCategorie).ToList();

// Print the category list in its sorted order.
int count = 0;
foreach (Attachment a in AllegatiPerCategorie)
{
    Response.Write(count.ToString() + " - " + a.IDAttachment + "<br />");
    count++;
}

Response.Write("<br />### FILTERED ###<br /><br />");

// Print the intersected list.
count = 0;
foreach (Attachment a in Allegati)
{
    Response.Write(count.ToString() + " - " + a.IDAttachment + "<br />");
    count++;
}
And the output is :
0 - 6769
1 - 6792
2 - 6771
3 - 6699
4 - 6632
5 - 6774
6 - 6595
7 - 6602
8 - 6641
9 - 6643
10 - 6764
11 - 6634
12 - 6642
13 - 6660
14 - 6640
15 - 6665
16 - 6673
17 - 6767
18 - 6772
19 - 6766
20 - 6763
21 - 6768
22 - 6644
23 - 6635
24 - 6633
25 - 6793
26 - 6677
27 - 6608
28 - 6610
29 - 6558
30 - 6563
31 - 6631
32 - 6604
33 - 6606
34 - 6607
35 - 6596
36 - 6597
37 - 6598
38 - 6599
39 - 6600
40 - 6471
41 - 6470
42 - 6469
43 - 6601
44 - 6603
45 - 6663
46 - 6664
47 - 6645
48 - 6637
49 - 6638
50 - 6609
51 - 6611
52 - 6612
53 - 6613
54 - 6614
55 - 6615
56 - 6616
57 - 6617
58 - 6618
59 - 6619
60 - 6620
61 - 6622
62 - 6567
63 - 6568
64 - 6569
65 - 6570
66 - 6571
67 - 6572
68 - 6573
69 - 6575
70 - 6576
71 - 6577
72 - 6579
73 - 6580
74 - 6581
75 - 6582
76 - 6583
77 - 6584
78 - 6585
79 - 6586
80 - 6587
81 - 6588
82 - 6589
83 - 6590
84 - 6591
85 - 6592
86 - 6593
87 - 6594
88 - 6765
### FILTERED ###
0 - 6769
1 - 6792
2 - 6771
3 - 6699
4 - 6774
5 - 6595
6 - 6602
7 - 6634
8 - 6642
9 - 6640
10 - 6660
11 - 6665
12 - 6673
13 - 6772
14 - 6766
15 - 6768
16 - 6644
17 - 6635
18 - 6633
19 - 6793
20 - 6677
Notice, for example, the order of the values 6660 and 6640 in the AllegatiPerCategorie list: 6660 comes before 6640 (positions 13 and 14).
Now look at the order of the same values in Allegati: 6640 comes before 6660 (positions 9 and 10).
Why does this happen? How can I fix it? Thank you.
MSDN states:
When the object returned by this method is enumerated, Union enumerates first and second in that order and yields each element that has not already been yielded.
Here is a short example to demonstrate the behavior:
new int[] {1}.Union(new int[] {1, 2, 3}) // returns: 1,2,3
new int[] {2}.Union(new int[] {1, 2, 3}) // returns: 2,1,3
new int[] {3}.Union(new int[] {1, 2, 3}) // returns: 3,1,2
new int[] {1,3,5}.Union(new int[] {2, 4}) // returns: 1,3,5,2,4
Union:
Produces the set union of two sequences by using the default equality comparer.
A set by definition contains no duplicates and has no inherent sorting.
Also:
This method excludes duplicates from the return set. This is different behavior to the Concat<TSource> method, which returns all the elements in the input sequences including duplicates.
And:
When the object returned by this method is enumerated, Union enumerates first and second in that order and yields each element that has not already been yielded.
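The ordering rules quoted above can be modelled in a few lines. The Python below is a simplified sketch of the documented LINQ to Objects behavior, not the actual .NET implementation; the function names union and intersect are stand-ins for the C# extension methods. It also illustrates that Intersect, per its documentation, yields elements in the order of the first sequence, which is consistent with the filtered list in the question following the order of Allegati rather than that of AllegatiPerCategorie.

```python
def union(first, second):
    """Yield each distinct element, walking first and then second."""
    seen = set()
    for item in list(first) + list(second):
        if item not in seen:
            seen.add(item)
            yield item


def intersect(first, second):
    """Yield distinct elements of first that also occur in second.

    The result follows the order of `first`; `second` only acts as a filter.
    """
    wanted = set(second)
    for item in first:
        if item in wanted:
            wanted.remove(item)  # each common element is yielded once
            yield item


# Same examples as in the answer above.
print(list(union([3], [1, 2, 3])))        # 3, then the unseen 1 and 2
print(list(union([1, 3, 5], [2, 4])))

# Hypothetical miniature of the question: the stored-procedure order
# differs from the category order.
stored_procedure_order = [6640, 6660]
category_order = [6660, 6640]
print(list(intersect(stored_procedure_order, category_order)))
print(list(intersect(category_order, stored_procedure_order)))
```

Under this model, swapping the operands (calling Intersect on the sorted category list and passing the other list as the argument) keeps the category order.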
I have a situation that I don't understand. The case is very simple: I use a generic repository (http://efgenericrepository.codeplex.com/) to work with my DB. Everything works very well, but with just one view I get a problem. I think EF returns wrong data when I execute a query.
This is the result of my SQL query in SQL Manager:
Select *
from Vw_HoursMOPJustificated
where IdUser = 20
and ActionDate = '2012-08-22' and Hour < 24
IdMopTime | IdJustification | IdJustificationType
44 30 8
44 40 11
44 43 13
45 31 8
45 41 12
46 32 8
And this is my result inside C# when I execute this simple code.
MyIGFEntities entity = new MyIGFEntities();
var table = new Repository<MyIGF.Models.Vw_HoursMOPJustificated>(new MyIGFEntities())
.Find(x => x.ActionDate == ActionDate && x.IdUser == IdUser && x.Hour < 24);
IdMopTime | IdJustification | IdJustificationType
44 | 30 | 8
44 | 30 | 8
44 | 30 | 8
45 | 31 | 8
45 | 31 | 8
46 | 32 | 8
Can anybody help me?
You must correct your edmx (I am quite sure you have one).
On your Vw_HoursMOPJustificated entity, set Primary Key to true for IdMopTime, IdJustification and IdJustificationType (at least).
To check that everything is correct, query the data through your edmx and see whether you get the correct distinct rows.
Sometimes (mainly with views, which don't have a "real" primary key in the database), the primary keys (the properties that make each row distinct) are inferred incorrectly, and you get this kind of confusing result.
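Why an incomplete key produces repeated rows can be shown with a toy model. The Python below is a deliberately simplified simulation of key-based identity resolution, not Entity Framework's real code; the function materialize is hypothetical, and the assumption that the inferred key was IdMopTime alone is a guess that happens to reproduce the output in the question.

```python
def materialize(rows, key_columns):
    """Return one object per row, reusing the first object seen for each key.

    This mimics a context that treats rows with the same key as the same
    entity and hands back the already tracked instance.
    """
    cache = {}
    result = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        result.append(cache.setdefault(key, row))
    return result


COLUMNS = ("IdMopTime", "IdJustification", "IdJustificationType")

# Rows as returned by the SQL query in the question.
db_rows = [dict(zip(COLUMNS, values)) for values in [
    (44, 30, 8), (44, 40, 11), (44, 43, 13),
    (45, 31, 8), (45, 41, 12), (46, 32, 8),
]]


def as_tuples(entities):
    return [tuple(entity[column] for column in COLUMNS) for entity in entities]


# Assumed wrong key: only IdMopTime.
wrong = as_tuples(materialize(db_rows, ("IdMopTime",)))
# Key covering all three columns, as suggested in the answer.
right = as_tuples(materialize(db_rows, COLUMNS))

print(wrong)
print(right)
```

With the single-column key, every row sharing an IdMopTime collapses onto the first row seen for that value; with all three columns in the key, each row stays distinct.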
Suggestions in either C# or VB.NET are welcome.
Table relationship:
Student 1:N TimeSheet (FK StudentId)
TimeSheet 1:N TimeRecord (FK TimeSheetId)
Dim query = From s In db.Students _
Let pair = (From ts In db.TimeSheets _
Join tr In db.TimeRecords On tr.TimeSheetId Equals ts.TimeSheetId _
Where ts.IsArchive = False And ts.IsCompleted = False _
Group By key = New With {ts.TimeSheetId, ts.StudentId} Into TotalHour = Sum(tr.BonusHour)) _
From part In pair _
Where part.key.StudentId = s.StudentId _
Select New With {.StudentId = s.StudentId, .AssignedId = s.AssignedId,.TotalTime = part.TotalHour}
Here's the result of the query:
734 -- 159 : 9 hrs 35 mm 28 sec
2655 -- 160 : 93 hrs 33 mm 50 sec
1566 -- 161 : 37 hrs 23 mm 53 sec
3114 -- 162 : 25 hrs 0 mm 21 sec
Wanted result of query:
733 -- 158 : 0 hr 0mm 0 sec
734 -- 159 : 9 hrs 35 mm 28 sec
736 -- 169 : 0 hrs 0mm 0sec
2655 -- 160 : 93 hrs 33 mm 50 sec
1566 -- 161 : 37 hrs 23 mm 53 sec
3114 -- 162 : 25 hrs 0 mm 21 sec
2165 -- 189 : 0 hr 0 mm 21 sec
Some TimeSheets have no TimeRecord, and I need to select those as well. How can I select all of them to get the wanted result above? I'm thinking of adding a condition to the query that checks whether a TimeSheet has no TimeRecord and, if so, skips Sum(tr.BonusHour) and just assigns zero to TotalHour. I don't know whether that is the right way to go.
Any suggestion is welcome.
You can try doing something like this with the Sum (C#):
Sum(tr.BonusHour ?? 0)
which would be the same as
Sum(tr.BonusHour != null ? tr.BonusHour : 0)
I am not sure what type BonusHour has, so use the zero-equivalent value of that type instead of the 0 in the sample.
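The null-coalescing Sum above covers records whose BonusHour is null. A TimeSheet with no TimeRecord at all is a separate case: an inner join produces no row for it, so there is nothing to sum. One possible way to think about it is a left outer join with a default of zero, sketched here in Python. The function name totals_per_timesheet, the field names and the sample values are all hypothetical and only illustrate the idea.

```python
def totals_per_timesheet(timesheets, timerecords):
    """Sum bonus hours per open timesheet, defaulting to 0 when none exist.

    timesheets: dicts with id, student_id, is_archive, is_completed
    timerecords: (timesheet_id, bonus_hour) pairs; bonus_hour may be None
    """
    sums = {}
    for timesheet_id, bonus_hour in timerecords:
        # `bonus_hour or 0` mirrors `tr.BonusHour ?? 0` from the answer.
        sums[timesheet_id] = sums.get(timesheet_id, 0) + (bonus_hour or 0)
    return [
        # sums.get(..., 0) is the left-join part: no records means 0.
        (ts["student_id"], ts["id"], sums.get(ts["id"], 0))
        for ts in timesheets
        if not ts["is_archive"] and not ts["is_completed"]
    ]


# Hypothetical sample data (bonus hours expressed in seconds).
timesheets = [
    {"id": 1, "student_id": 10, "is_archive": False, "is_completed": False},
    {"id": 2, "student_id": 11, "is_archive": False, "is_completed": False},
    {"id": 3, "student_id": 12, "is_archive": True, "is_completed": False},
]
timerecords = [(2, 3600), (2, None), (2, 1800), (3, 500)]

totals = totals_per_timesheet(timesheets, timerecords)
print(totals)
```

Timesheet 1 has no records and still appears with a total of 0, which is the behavior the wanted result asks for; the archived timesheet 3 is filtered out like in the original Where clause.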