I am a beginner to coding in U-SQL/C#. I am stuck in a place during windowing/aggregation.
My Data looks like
Name Date OrderNo Type Balance
one 2018-06-25T04:55:44.0020987Z 1 Drink 15
one 2018-06-25T04:57:44.0020987Z 1 Drink 70
one 2018-06-25T04:59:44.0020987Z 1 Drink 33
one 2018-06-25T04:59:49.0020987Z 1 Drink 25
two 2018-06-25T04:55:44.0020987Z 2 Drink 22
two 2018-06-25T04:57:44.0020987Z 2 Drink 81
two 2018-06-25T04:58:44.0020987Z 2 Drink 33
two 2018-06-25T04:59:44.0020987Z 2 Drink 45
In U-SQL I am adding a unique id based on combinations of name, orderno and type and for the purpose of sorting, I am adding another one including the date.
#files =
EXTRACT
name string,
date DateTime,
type string,
orderno int,
balance int
FROM
#InputFile
USING new JsonExtractor();
#files2 =
SELECT *,
DENSE_RANK() OVER(ORDER BY name,type,orderno,date) AS group_id,
DENSE_RANK() OVER(ORDER BY name,type,orderno) AS id
FROM #files;
My Data now looks like this:
Name Date OrderNo Type Balance group_id id
one 2018-06-25T04:55:44.0020987Z 1 Drink 15 1 1
one 2018-06-25T04:57:44.0020987Z 1 Drink 70 2 1
one 2018-06-25T04:59:44.0020987Z 1 Drink 33 3 1
one 2018-06-25T04:59:49.0020987Z 1 Drink 25 4 1
two 2018-06-25T04:55:44.0020987Z 2 Drink 22 5 2
two 2018-06-25T04:57:44.0020987Z 2 Drink 81 6 2
two 2018-06-25T04:58:44.0020987Z 2 Drink 33 7 2
two 2018-06-25T04:59:44.0020987Z 2 Drink 45 8 2
(I have added only 4 records per group but there are multiple per group)
I am stuck at determining the difference between successive rows in the balance column in each group.
Expected Output for Part 1:
Name Date OrderNo Type Balance group_id id increase
one 2018-06-25T04:55:44.0020987Z 1 Drink 15 1 1 0
one 2018-06-25T04:57:44.0020987Z 1 Drink 70 2 1 55
one 2018-06-25T04:59:44.0020987Z 1 Drink 33 3 1 -37
one 2018-06-25T04:59:49.0020987Z 1 Drink 25 4 1 -8
two 2018-06-25T04:55:44.0020987Z 2 Drink 22 5 2 0
two 2018-06-25T04:57:44.0020987Z 2 Drink 81 6 2 59
two 2018-06-25T04:58:44.0020987Z 2 Drink 33 7 2 -48
two 2018-06-25T04:59:44.0020987Z 2 Drink 45 8 2 8
For every new group (defined by id) the increase should start from zero.
I went through stack overflow and saw the lag function from transgresql. I could not find a C# equivalent. Is that applicable in this case?
Any help is appreciated. Further clarification will be provided if required.
Update: When I use CASE WHEN my solution looks like this
CURRENT OUTPUT DESIRED OUTPUT
id Balance Increase id Balance Increase
1 15 0 1 15 0
1 70 55 1 70 55
1 33 -37 1 33 -37
1 25 -8 1 25 -8
2 22 "-3" 2 22 "0"
2 81 59 2 81 59
2 33 -48 2 33 -48
2 45 12 2 45 12
Look at the highlighted row. The increase column must start at 0 for each id.
Update: I was able to solve the first part of my question. See my answer below.
The second part that I had posted earlier was incorrectly posted. I have removed that.
You can try to use LAG window function get previous Balance in a subquery, then use where write the condition.
SELECT * FROM (
SELECT *,
DENSE_RANK() OVER(ORDER BY name,type,orderno,date) AS group_id,
DENSE_RANK() OVER(ORDER BY name,type,orderno) AS id,
(CASE WHEN LAG(Balance) OVER(ORDER BY name,type,orderno) IS NULL THEN 0
ELSE Balance - LAG(Balance) OVER(ORDER BY name,type,orderno)
END) as increase
FROM #files
) t1
WHERE increase > 50
The query that finally worked for me was this..
#files =
EXTRACT
name string,
date DateTime,
type string,
orderno int,
balance int
FROM
#InputFile
USING new JsonExtractor();
#files2 =
SELECT *,
DENSE_RANK() OVER(ORDER BY name,type,orderno) AS group_id
FROM #files;
#files3 =
SELECT *,
DENSE_RANK() OVER(PARTITION BY group_id ORDER BY date) AS group_order
FROM #files2;
#files4 =
SELECT *,
(CASE WHEN group_order == 1 THEN 0
ELSE balance - LAG(balance) OVER(ORDER BY name,type,orderno)
END) AS increase
FROM #files3;
If a user selects 07-07-2016, I want to count how many times billdays > 30 for that billdate month, for the past 12 months.
Query is this:
select COUNT(*) as 'BillsOver30' from pssuite_web.pujhaccd
where billdays>30 and DATEDIFF(month,'07-07-2016', GETDATE()) <= 13
Group By Month(billdate)
Result is this:
1784 (July)
1509 (June)
2986 (May)
2196 (etc)
5853
3994
1753
1954
869
1932
629
1673
LINQ query is:
DateStart = '07-07-2016' (from a textbox on view)
DateTime earliestDate = objdate1.DateStart.Value.AddMonths(-13);
var custQuery9 = (from m in DataContext.pujhaccds
where m.billdays > 30 &&
m.billdate >= earliestDate &&
m.billdate <= objdate1.DateStart
group m by m.billdate.Value.Month into p
select new Date1()
{
theMonth = p.Key,
theCount = p.Count()
});
Results are:
Month Count
1 1029
2 1018
3 1972
4 1519
5 2657
6 2019
7 1206
8 1023
9 761
10 1620
11 354
12 931
You can see that the LINQ query is way off.
Doesn't matter what date I put in, the result always stays the same. Rather than 12 months from starting date.
I must be writing it wrong, thanks.
I need to make a query that select all rows belong to a ID from the datatable where Lesson from that ID has two specific values. For example ID 2 and 3 have lessons D and E that I want to take all the rows of ID 2 and ID 3.
My data is in MS Access.
MY Programming language is C#.
I already try to write some a query
"SELECT * FROM MainData WHERE ID IN ( SELECT ID FROM MainData GROUP BY ID HAVING (Lesson ='" + txtbox_startsubject.Text + "' and '" + TargetPoint + "'))";
The format of data:
ID Lesson Time Score
1 C 165 4
1 E 190 3
1 H 195 3
1 I 200 4
2 A 100 2
2 B 150 5
2 D 210 2
2 R 10 3
2 E 110 4
3 D 130 5
3 E 190 5
3 H 210 4
3 I 160 4
3 J 110 4
4 E 120 3
4 H 150 4
4 J 170 4
I have read many question on Stack Overflow related to my problem, but I don't think they quite address my problem. Basically I download a XML dataset with lots of data, and inserted that data into my MS Access database. What I want to do is convert the data so that some specific rows become columns.
Now I can probably do this manually in code before inserting the data to database, but that would require lots of time and change in code, so I'm wondering if its possible to do this with MS Access.
Here's how my table basically looks, and how I want to convert it.
The index is not so relevant in my case
[Table1] => [Table1_converted]
[Index] [Name] [Data] [NameID] [NameID] [AA] [BB] [CC] [DD]
1 AA 14 1 1 14 date1 64 61
2 BB(date) 42 1 2 15+19 date2 67+21 63+12
3 CC 64 1 3 9 10
4 DD 61 1 4 date4 1 87
5 AA 15 2
6 BB(date) 35 2
7 CC 67 2
8 DD 63 2
9 AA 9 3
10 CC 10 3
11 AA 19 2
12 BB(date) 20 2
13 CC 21 2
14 DD 12 2
15 BB(date) 83 4
16 CC 1 4
17 DD 87 4
Forgot to mention that, the Values under the column [Name] are not really AA BB CC.
They are more complex then that. AA is actually like "01 - NameAA", without the quotation mark.
Forgot to mention one important element in my question, if the [Name] ex. AA with same [NameID] exists in table, then the [Data] should SUM up those two values. I have edited the tables, on the converted table i have written ex. 15+19 or 35+20 which only illustrates which values are summed up.
One more edit, hopefully the last. One of the [Name] BB has a Datetime type in [Data].
The NameID can be whichever, does not matter. So i need a query which does an exception on [Name] BB when its summing up, so that it does not sum it up like it does to every other [Name]s [Data]. Places where date is written multiple times for same [Name] and [NameID], it is always the same.
To accomplish this in Access, all you need to do is
TRANSFORM Sum([Data]) AS SumOfData
SELECT [NameID]
FROM [Table1]
GROUP BY [NameID]
PIVOT [Name]
edit re: revised question
To handle some [Name]s differently we would need to assemble the results (Sum()s, etc.) first, and then crosstab the results
For test data in [Table1]:
Index Name Data NameID
----- ---- ---------- ------
1 AA 14 1
2 BB 2013-12-01 1
3 CC 64 1
4 DD 61 1
5 AA 15 2
6 BB 2013-12-02 2
7 CC 67 2
8 DD 63 2
9 AA 9 3
10 CC 10 3
11 AA 19 2
12 BB 2013-12-02 2
13 CC 21 2
14 DD 12 2
15 BB 2013-12-04 4
16 CC 1 4
17 DD 87 4
the query
TRANSFORM First(columnData) AS whatever
SELECT [NameID]
FROM
(
SELECT [NameID], [Name], Sum([Data]) AS columnData
FROM [Table1]
WHERE [Name] <> 'BB'
GROUP BY [NameID], [Name]
UNION ALL
SELECT DISTINCT [NameID], [Name], [Data]
FROM [Table1]
WHERE [Name] = 'BB'
)
GROUP BY [NameID]
PIVOT [Name]
produces
NameID AA BB CC DD
------ -- ---------- -- --
1 14 2013-12-01 64 61
2 34 2013-12-02 88 75
3 9 10
4 2013-12-04 1 87
Try this...in sql query may be it is your answer
SELECT NameID , [AA] as AA,[BB] as BB,[CC] as CC,[DD] as DD
FROM
(
SELECT Name,Data,NameID FROM Table1
)PivotData
PIVOT
(
max(Data) for Name in ([AA],[BB],[CC],[DD])
) AS Pivoting
I think you need to this
1) Take all your Table1 as it is in SQL Server
2) Then run following query
DECLARE #cols AS NVARCHAR(MAX),
#query AS NVARCHAR(MAX)
select #cols = STUFF((SELECT distinct ',' + QUOTENAME(Name)
from [Table1]
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set #query = 'SELECT countryid,' + #cols + '
from
(
select NameID, Name
from Table1 cc
) T
pivot
(
max (Name)
for languagename in (' + #cols + ')
) p '
execute sp_executesql #query;
DECLARE #Table1 TABLE ([Index] INT,[Name] CHAR(2),[Data] INT,[NameID] INT)
INSERT INTO #Table1
VALUES
(1,'AA',14,1),
(2,'BB',42,1),
(3,'CC',64,1),
(4,'DD',61,1),
(5,'AA',15,2),
(6,'BB',35,2),
(7,'CC',67,2),
(8,'DD',63,2),
(9,'AA',9,3),
(10,'CC',10,3),
(11,'BB',83,4),
(12,'CC',1,4),
(13,'DD',87,4)
SELECT [NameID] , ISNULL([AA], '') AS [AA], ISNULL([BB], '') AS [BB]
, ISNULL([CC], '') AS [CC], ISNULL([DD], '') AS [DD]
FROM
(
SELECT NAME, DATA, NAMEID
FROM #Table1
)q
PIVOT
(
SUM(DATA)
FOR NAME
IN ([AA], [BB], [CC], [DD])
)P
Result Set
NameID AA BB CC DD
1 14 42 64 61
2 15 35 67 63
3 9 10
4 83 1 87
I need to produce some SQL that will show me the trend (up or down tick) in some transacitons.
Consider this table with a PlayerId and a Score
PlayerId, Score, Date
1,10,3/13
1,11,3/14
1,12,3/15
If I pull data from 3/15 I have a score of 12 with an upward trend compared to the historical data.
I did something similar in Oracle 8i about 10 years ago using some of the analytical functions like rank, however it was 10 years ago....
The results would look similar to
PlayerId, Score, Date, Trend
1,12,3/15,UP
How can I do something similar with sql azure?
This SQL:
with data as (
select * from ( values
(1,11,cast('2013/03/12' as smalldatetime)),
(1,15,cast('2013/03/13' as smalldatetime)),
(1,11,cast('2013/03/14' as smalldatetime)),
(1,12,cast('2013/03/15' as smalldatetime))
) data(PlayerId,Score,[Date])
)
select
this.*,
Prev = isnull(prev.Score,0),
tick = case when this.Score > isnull(prev.Score,0) then 'Up' else 'Down' end
from data this
left join data prev
on prev.PlayerId = this.PlayerId
and prev.[Date] = this.[Date] - 1
returns this output:
PlayerId Score Date Prev tick
----------- ----------- ----------------------- ----------- ----
1 11 2013-03-12 00:00:00 0 Up
1 15 2013-03-13 00:00:00 11 Up
1 11 2013-03-14 00:00:00 15 Down
1 12 2013-03-15 00:00:00 11 Up