convert rows to columns in Access - c#

I have read many questions on Stack Overflow related to my problem, but I don't think they quite address it. Basically, I download an XML dataset with lots of data and insert that data into my MS Access database. What I want to do is convert the data so that some specific rows become columns.
I could probably do this manually in code before inserting the data into the database, but that would require a lot of time and code changes, so I'm wondering if it's possible to do this within MS Access.
Here's roughly how my table looks, and how I want to convert it.
The index is not really relevant in my case.
[Table1]
Index Name     Data NameID
----- -------- ---- ------
1     AA       14   1
2     BB(date) 42   1
3     CC       64   1
4     DD       61   1
5     AA       15   2
6     BB(date) 35   2
7     CC       67   2
8     DD       63   2
9     AA       9    3
10    CC       10   3
11    AA       19   2
12    BB(date) 20   2
13    CC       21   2
14    DD       12   2
15    BB(date) 83   4
16    CC       1    4
17    DD       87   4
=> [Table1_converted]
NameID AA    BB    CC    DD
------ ----- ----- ----- -----
1      14    date1 64    61
2      15+19 date2 67+21 63+12
3      9           10
4            date4 1     87
I forgot to mention that the values under the column [Name] are not really AA, BB, CC.
They are more complex than that. AA is actually something like "01 - NameAA", without the quotation marks.
I also forgot to mention one important element in my question: if the same [Name] (e.g. AA) exists more than once in the table with the same [NameID], then the [Data] values should be summed. I have edited the tables; in the converted table I have written e.g. 15+19 or 35+20, which only illustrates which values are summed.
One more edit, hopefully the last. One of the [Name]s, BB, has a Datetime type in [Data].
The NameID can be anything; it does not matter. So I need a query which makes an exception for [Name] BB when summing, so that it is not summed like the [Data] of every other [Name]. Where a date is written multiple times for the same [Name] and [NameID], it is always the same date.

To accomplish this in Access, all you need to do is
TRANSFORM Sum([Data]) AS SumOfData
SELECT [NameID]
FROM [Table1]
GROUP BY [NameID]
PIVOT [Name]
edit re: revised question
To handle some [Name]s differently we would need to assemble the results (Sum()s, etc.) first, and then crosstab the results
For test data in [Table1]:
Index Name Data NameID
----- ---- ---------- ------
1 AA 14 1
2 BB 2013-12-01 1
3 CC 64 1
4 DD 61 1
5 AA 15 2
6 BB 2013-12-02 2
7 CC 67 2
8 DD 63 2
9 AA 9 3
10 CC 10 3
11 AA 19 2
12 BB 2013-12-02 2
13 CC 21 2
14 DD 12 2
15 BB 2013-12-04 4
16 CC 1 4
17 DD 87 4
the query
TRANSFORM First(columnData) AS whatever
SELECT [NameID]
FROM
(
SELECT [NameID], [Name], Sum([Data]) AS columnData
FROM [Table1]
WHERE [Name] <> 'BB'
GROUP BY [NameID], [Name]
UNION ALL
SELECT DISTINCT [NameID], [Name], [Data]
FROM [Table1]
WHERE [Name] = 'BB'
)
GROUP BY [NameID]
PIVOT [Name]
produces
NameID AA BB CC DD
------ -- ---------- -- --
1 14 2013-12-01 64 61
2 34 2013-12-02 88 75
3 9 10
4 2013-12-04 1 87
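For anyone who wants to sanity-check the logic outside Access, the same two-step idea (aggregate first, then pivot) can be sketched in Python. This is only an illustration, assuming the rows arrive as (NameID, Name, Data) tuples; the function name `crosstab` and its `keep_first` parameter are hypothetical and not part of the Access query above.

```python
from collections import defaultdict

def crosstab(rows, keep_first=("BB",)):
    """Pivot (name_id, name, data) rows into {name_id: {name: value}}.

    Names listed in keep_first keep their first value (similar to the
    SELECT DISTINCT branch of the UNION ALL); every other name is summed.
    """
    result = defaultdict(dict)
    for name_id, name, data in rows:
        cell = result[name_id]
        if name in keep_first:
            cell.setdefault(name, data)  # duplicates carry the same date
        else:
            cell[name] = cell.get(name, 0) + data
    return dict(result)

# Same test data as the [Table1] listing above (Index column dropped).
rows = [
    (1, "AA", 14), (1, "BB", "2013-12-01"), (1, "CC", 64), (1, "DD", 61),
    (2, "AA", 15), (2, "BB", "2013-12-02"), (2, "CC", 67), (2, "DD", 63),
    (3, "AA", 9), (3, "CC", 10),
    (2, "AA", 19), (2, "BB", "2013-12-02"), (2, "CC", 21), (2, "DD", 12),
    (4, "BB", "2013-12-04"), (4, "CC", 1), (4, "DD", 87),
]
table = crosstab(rows)
```

Names that never occur for a NameID are simply absent from the inner dictionary, which corresponds to the empty cells in the crosstab output.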

Try this SQL query; it may be your answer (note that this is SQL Server PIVOT syntax):
SELECT NameID , [AA] as AA,[BB] as BB,[CC] as CC,[DD] as DD
FROM
(
SELECT Name,Data,NameID FROM Table1
)PivotData
PIVOT
(
max(Data) for Name in ([AA],[BB],[CC],[DD])
) AS Pivoting

I think you need to do this:
1) Load your Table1 as it is into SQL Server
2) Then run the following query
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(Name)
from [Table1]
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT NameID,' + @cols + '
from
(
select NameID, Name, Data
from Table1
) T
pivot
(
max (Data)
for Name in (' + @cols + ')
) p '
execute sp_executesql @query;

DECLARE @Table1 TABLE ([Index] INT,[Name] CHAR(2),[Data] INT,[NameID] INT)
INSERT INTO @Table1
VALUES
(1,'AA',14,1),
(2,'BB',42,1),
(3,'CC',64,1),
(4,'DD',61,1),
(5,'AA',15,2),
(6,'BB',35,2),
(7,'CC',67,2),
(8,'DD',63,2),
(9,'AA',9,3),
(10,'CC',10,3),
(11,'BB',83,4),
(12,'CC',1,4),
(13,'DD',87,4)
SELECT [NameID] , ISNULL([AA], '') AS [AA], ISNULL([BB], '') AS [BB]
, ISNULL([CC], '') AS [CC], ISNULL([DD], '') AS [DD]
FROM
(
SELECT NAME, DATA, NAMEID
FROM @Table1
)q
PIVOT
(
SUM(DATA)
FOR NAME
IN ([AA], [BB], [CC], [DD])
)P
Result Set
NameID AA BB CC DD
1 14 42 64 61
2 15 35 67 63
3 9 10
4 83 1 87

Related

Calculating change in column over groups and extracting based on criteria

I am a beginner to coding in U-SQL/C#. I am stuck in a place during windowing/aggregation.
My Data looks like
Name Date OrderNo Type Balance
one 2018-06-25T04:55:44.0020987Z 1 Drink 15
one 2018-06-25T04:57:44.0020987Z 1 Drink 70
one 2018-06-25T04:59:44.0020987Z 1 Drink 33
one 2018-06-25T04:59:49.0020987Z 1 Drink 25
two 2018-06-25T04:55:44.0020987Z 2 Drink 22
two 2018-06-25T04:57:44.0020987Z 2 Drink 81
two 2018-06-25T04:58:44.0020987Z 2 Drink 33
two 2018-06-25T04:59:44.0020987Z 2 Drink 45
In U-SQL I am adding a unique id based on combinations of name, orderno and type and for the purpose of sorting, I am adding another one including the date.
@files =
EXTRACT
name string,
date DateTime,
type string,
orderno int,
balance int
FROM
@InputFile
USING new JsonExtractor();
@files2 =
SELECT *,
DENSE_RANK() OVER(ORDER BY name,type,orderno,date) AS group_id,
DENSE_RANK() OVER(ORDER BY name,type,orderno) AS id
FROM @files;
My Data now looks like this:
Name Date OrderNo Type Balance group_id id
one 2018-06-25T04:55:44.0020987Z 1 Drink 15 1 1
one 2018-06-25T04:57:44.0020987Z 1 Drink 70 2 1
one 2018-06-25T04:59:44.0020987Z 1 Drink 33 3 1
one 2018-06-25T04:59:49.0020987Z 1 Drink 25 4 1
two 2018-06-25T04:55:44.0020987Z 2 Drink 22 5 2
two 2018-06-25T04:57:44.0020987Z 2 Drink 81 6 2
two 2018-06-25T04:58:44.0020987Z 2 Drink 33 7 2
two 2018-06-25T04:59:44.0020987Z 2 Drink 45 8 2
(I have added only 4 records per group but there are multiple per group)
I am stuck at determining the difference between successive rows in the balance column in each group.
Expected Output for Part 1:
Name Date OrderNo Type Balance group_id id increase
one 2018-06-25T04:55:44.0020987Z 1 Drink 15 1 1 0
one 2018-06-25T04:57:44.0020987Z 1 Drink 70 2 1 55
one 2018-06-25T04:59:44.0020987Z 1 Drink 33 3 1 -37
one 2018-06-25T04:59:49.0020987Z 1 Drink 25 4 1 -8
two 2018-06-25T04:55:44.0020987Z 2 Drink 22 5 2 0
two 2018-06-25T04:57:44.0020987Z 2 Drink 81 6 2 59
two 2018-06-25T04:58:44.0020987Z 2 Drink 33 7 2 -48
two 2018-06-25T04:59:44.0020987Z 2 Drink 45 8 2 12
For every new group (defined by id) the increase should start from zero.
I went through stack overflow and saw the lag function from transgresql. I could not find a C# equivalent. Is that applicable in this case?
Any help is appreciated. Further clarification will be provided if required.
Update: When I use CASE WHEN my solution looks like this
CURRENT OUTPUT DESIRED OUTPUT
id Balance Increase id Balance Increase
1 15 0 1 15 0
1 70 55 1 70 55
1 33 -37 1 33 -37
1 25 -8 1 25 -8
2 22 "-3" 2 22 "0"
2 81 59 2 81 59
2 33 -48 2 33 -48
2 45 12 2 45 12
Look at the highlighted row. The increase column must start at 0 for each id.
Update: I was able to solve the first part of my question. See my answer below.
The second part that I had posted earlier was incorrectly posted. I have removed that.
You can try using the LAG window function to get the previous Balance in a subquery, then write the condition in the WHERE clause.
SELECT * FROM (
SELECT *,
DENSE_RANK() OVER(ORDER BY name,type,orderno,date) AS group_id,
DENSE_RANK() OVER(ORDER BY name,type,orderno) AS id,
(CASE WHEN LAG(Balance) OVER(ORDER BY name,type,orderno) IS NULL THEN 0
ELSE Balance - LAG(Balance) OVER(ORDER BY name,type,orderno)
END) as increase
FROM @files
) t1
WHERE increase > 50
The query that finally worked for me was this:
@files =
EXTRACT
name string,
date DateTime,
type string,
orderno int,
balance int
FROM
@InputFile
USING new JsonExtractor();
@files2 =
SELECT *,
DENSE_RANK() OVER(ORDER BY name,type,orderno) AS group_id
FROM @files;
@files3 =
SELECT *,
DENSE_RANK() OVER(PARTITION BY group_id ORDER BY date) AS group_order
FROM @files2;
@files4 =
SELECT *,
(CASE WHEN group_order == 1 THEN 0
ELSE balance - LAG(balance) OVER(ORDER BY name,type,orderno)
END) AS increase
FROM @files3;
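To make the intended result easier to verify, here is a small Python sketch of the same idea: sort within each (name, type, orderno) group by date, then subtract the previous balance, starting at 0 for the first row of every group. It is not U-SQL and not a replacement for the query above; the function name `add_increase` and the dictionary keys are assumptions chosen for illustration.

```python
from itertools import groupby
from operator import itemgetter

def add_increase(records):
    """Return the records ordered by group and date, each with an 'increase' field.

    The first record of every (name, type, orderno) group gets increase 0;
    later records get balance minus the previous balance in the same group.
    """
    group_key = itemgetter("name", "type", "orderno")
    ordered = sorted(records, key=lambda r: (group_key(r), r["date"]))
    out = []
    for _, group in groupby(ordered, key=group_key):
        previous = None
        for rec in group:
            increase = 0 if previous is None else rec["balance"] - previous
            out.append({**rec, "increase": increase})
            previous = rec["balance"]
    return out

# Sample data from the question (ISO timestamps sort correctly as strings).
records = [
    {"name": "one", "date": "2018-06-25T04:55:44", "orderno": 1, "type": "Drink", "balance": 15},
    {"name": "one", "date": "2018-06-25T04:57:44", "orderno": 1, "type": "Drink", "balance": 70},
    {"name": "one", "date": "2018-06-25T04:59:44", "orderno": 1, "type": "Drink", "balance": 33},
    {"name": "one", "date": "2018-06-25T04:59:49", "orderno": 1, "type": "Drink", "balance": 25},
    {"name": "two", "date": "2018-06-25T04:55:44", "orderno": 2, "type": "Drink", "balance": 22},
    {"name": "two", "date": "2018-06-25T04:57:44", "orderno": 2, "type": "Drink", "balance": 81},
    {"name": "two", "date": "2018-06-25T04:58:44", "orderno": 2, "type": "Drink", "balance": 33},
    {"name": "two", "date": "2018-06-25T04:59:44", "orderno": 2, "type": "Drink", "balance": 45},
]
increases = [r["increase"] for r in add_increase(records)]
```

Because the previous balance is reset for each group, the first row of group "two" gets 0 instead of the -3 that appears when the window is not restricted to the group.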

Update all records using a function sql

I am looking to update a calculated sum in sql
Basically I have a table:
ImportID SerialNumber Day Hour value Difference Complete
1 123 1 1 6 NULL 0
2 123 1 2 8 NULL 0
3 123 1 5 21 NULL 0
4 123 1 6 28 NULL 0
5 222 2 1 12 NULL 0
6 222 2 5 18 NULL 0
7 222 2 4 16 NULL 0
8 222 1 12 8 NULL 0
For each serial number there will be a day (1-365) and an hour (1-12). All I want to do is calculate the Difference field from the record before.
So take ImportID 6: I need to get the record which is on the same day and the hour before (ImportID 7), then update the Difference using the value field, which is 18 - 16 = 2.
N.B. There may be gaps in the sequence, and if there is no previous record then the difference should stay NULL. Once the differences have been calculated, the rows need to be inserted into a new table, but only when the difference is not NULL and the row doesn't already exist in that table; on a successful insert they get marked as complete. The record before can also be on the previous day: (day 1, hour 12) is the record before (day 2, hour 1).
Currently I am using a loop to select the null values, get the previous record, update the record, if its OK insert into other table, update the Completed field.
My issue is that this is working on a million records and it is taking a long while to Select the applicable records (completed = 0) into a temp table and loop through each.
Is there any quicker way to mass process these as an update statement? Or separate statements?
The result should be
ImportID SerialNumber Day Hour value Difference Complete
1 123 1 1 6 NULL 0
2 123 1 2 8 2 1
3 123 1 5 21 NULL 0
4 123 1 6 28 7 1
5 222 2 1 12 4 1
6 222 2 5 18 2 1
7 222 2 4 16 NULL 0
8 222 1 12 8 NULL 0
Thanks in advance
I think this is basically it isn't it?
DECLARE @TABLE TABLE
(
ImportId INT,
SerialNumber INT,
Day INT,
Hour INT,
Value INT,
Difference INT,
Complete INT
)
INSERT INTO @TABLE VALUES
(1,123,1,1,6,NULL,0),
(2,123,1,2,8,NULL,0),
(3,123,1,5,21,NULL,0),
(4,123,1,6,28,NULL,0),
(5,222,2,1,12,NULL,0),
(6,222,2,5,18,NULL,0),
(7,222,2,4,16,NULL,0),
(8,222,1,12,8,NULL,0)
SELECT * FROM @TABLE
UPDATE T
SET T.Difference = T.Value - TT.Value,
Complete = 1
FROM
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY SerialNumber ORDER BY Day ASC, Hour ASC) AS RowCounter
FROM @TABLE
WHERE Complete = 0 --Ignore completed ones
)AS T
INNER JOIN
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY SerialNumber ORDER BY Day ASC, Hour ASC) AS RowCounter
FROM @TABLE
)AS TT
ON T.SerialNumber = TT.SerialNumber
WHERE
(
T.RowCounter = TT.RowCounter + 1
AND
T.Day = TT.Day
AND
T.Hour = TT.Hour + 1
)
OR
(
T.Day = TT.Day + 1
AND
T.Hour = 1
AND
TT.Hour = 12
)
SELECT * FROM @TABLE
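The "previous record" rule from the question (same day and hour - 1, or day - 1 and hour 12 when the hour is 1) can also be checked with a short Python sketch. It only illustrates the matching rule on the sample rows and is not meant as the set-based update itself; the name `compute_differences` and the dictionary keys are hypothetical.

```python
def compute_differences(rows):
    """Return {import_id: difference} for rows that have a directly preceding record.

    The preceding slot is (day, hour - 1), or (day - 1, 12) when hour is 1.
    Rows without a record in that slot are left out (difference stays NULL).
    """
    by_slot = {(r["serial"], r["day"], r["hour"]): r["value"] for r in rows}
    diffs = {}
    for r in rows:
        if r["hour"] > 1:
            prev = (r["serial"], r["day"], r["hour"] - 1)
        else:
            prev = (r["serial"], r["day"] - 1, 12)
        if prev in by_slot:
            diffs[r["import_id"]] = r["value"] - by_slot[prev]
    return diffs

# Same sample rows as the INSERT statement above.
sample = [
    (1, 123, 1, 1, 6), (2, 123, 1, 2, 8), (3, 123, 1, 5, 21), (4, 123, 1, 6, 28),
    (5, 222, 2, 1, 12), (6, 222, 2, 5, 18), (7, 222, 2, 4, 16), (8, 222, 1, 12, 8),
]
rows = [
    {"import_id": i, "serial": s, "day": d, "hour": h, "value": v}
    for i, s, d, h, v in sample
]
diffs = compute_differences(rows)
```

The dictionary lookup plays the role of the self-join: each row is matched against at most one earlier slot, so no loop over candidate rows is needed.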

Oracle timeframe conversion

I have the following configuration: a table called source (bid, valid_from, valid_to, qty) and destination (bid, jan, feb, mar, apr, ...., dec).
I would like to insert the data from source into destination (splitting the qty equally across the months from valid_from to valid_to, and adding the remainder to the last month in the timeframe), like this: if the record in source is
00001 01.02.2001 30.06.2001 132
this would be translated into destination :
00001 0 26 26 26 26 28 0 0 0 0 0 0
How could I do this?
Thank you!
I'm assuming valid_from/valid_to never spans multiple years; i.e., TO_CHAR(valid_from,'YYYY') always = TO_CHAR(valid_to,'YYYY').
SQL> CREATE TABLE source (
2 bid VARCHAR2(5)
3 , valid_from DATE
4 , valid_to DATE
5 , qty NUMBER
6 );
Table created.
SQL> INSERT INTO source VALUES ('00001',TO_DATE('20010201','YYYYMMDD'),TO_DATE('20010630','YYYYMMDD'),132);
1 row created.
SQL> INSERT INTO source VALUES ('00002',TO_DATE('20020301','YYYYMMDD'),TO_DATE('20021231','YYYYMMDD'),59);
1 row created.
SQL> CREATE TABLE destination (
2 bid VARCHAR2(5)
3 , jan NUMBER
4 , feb NUMBER
5 , mar NUMBER
6 , apr NUMBER
7 , may NUMBER
8 , jun NUMBER
9 , jul NUMBER
10 , aug NUMBER
11 , sep NUMBER
12 , oct NUMBER
13 , nov NUMBER
14 , dec NUMBER
15 );
Table created.
SQL> COLUMN jan FORMAT 999
SQL> COLUMN feb FORMAT 999
SQL> COLUMN mar FORMAT 999
SQL> COLUMN apr FORMAT 999
SQL> COLUMN may FORMAT 999
SQL> COLUMN jun FORMAT 999
SQL> COLUMN jul FORMAT 999
SQL> COLUMN aug FORMAT 999
SQL> COLUMN sep FORMAT 999
SQL> COLUMN oct FORMAT 999
SQL> COLUMN nov FORMAT 999
SQL> COLUMN dec FORMAT 999
SQL> INSERT INTO destination
2 SELECT bid
3 , NVL(MAX(DECODE(r,01,split_qty)),0) jan
4 , NVL(MAX(DECODE(r,02,split_qty)),0) feb
5 , NVL(MAX(DECODE(r,03,split_qty)),0) mar
6 , NVL(MAX(DECODE(r,04,split_qty)),0) apr
7 , NVL(MAX(DECODE(r,05,split_qty)),0) may
8 , NVL(MAX(DECODE(r,06,split_qty)),0) jun
9 , NVL(MAX(DECODE(r,07,split_qty)),0) jul
10 , NVL(MAX(DECODE(r,08,split_qty)),0) aug
11 , NVL(MAX(DECODE(r,09,split_qty)),0) sep
12 , NVL(MAX(DECODE(r,10,split_qty)),0) oct
13 , NVL(MAX(DECODE(r,11,split_qty)),0) nov
14 , NVL(MAX(DECODE(r,12,split_qty)),0) dec
15 FROM
16 (
17 SELECT x.bid
18 , x.month_abbr
19 , x.r
20 , x.rn
21 , x.total_months
22 , x.qty
23 , FLOOR(x.qty / x.total_months)
24 + DECODE(x.rn
25 , x.total_months, MOD(x.qty, x.total_months)
26 , 0) split_qty
27 FROM (SELECT s.bid
28 , months.r
29 , ROW_NUMBER()
30 OVER (PARTITION BY s.bid
31 ORDER BY months.r) rn
32 , COUNT(*)
33 OVER (PARTITION BY s.bid) total_months
34 , s.qty
35 FROM (SELECT ROWNUM r
36 FROM DUAL
37 CONNECT BY LEVEL <= 12) months
38 , source s
39 WHERE TO_CHAR(s.valid_from,'YYYY') = TO_CHAR(s.valid_to,'YYYY')
40 AND months.r BETWEEN TO_NUMBER(TO_CHAR(s.valid_from,'MM'))
41 AND TO_NUMBER(TO_CHAR(s.valid_to,'MM'))) x
42 )
43 GROUP BY bid
44 ;
2 rows created.
SQL> SELECT *
2 FROM destination
3 ;
BID JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
----- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
00001 0 26 26 26 26 28 0 0 0 0 0 0
00002 0 0 5 5 5 5 5 5 5 5 5 14
SQL>
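The arithmetic behind split_qty (an equal integer share per month, with the remainder added to the last month) can be sketched in a few lines of Python. This is just a check of the numbers, assuming an integer qty and a range within one year; the function name `split_by_month` is made up for the example.

```python
def split_by_month(from_month, to_month, qty):
    """Return 12 monthly quantities (index 0 = January).

    Each month in from_month..to_month gets qty // months; the last month
    of the range also receives the remainder, mirroring FLOOR and MOD above.
    """
    months = to_month - from_month + 1
    share, remainder = divmod(qty, months)
    result = [0] * 12
    for month in range(from_month, to_month + 1):
        result[month - 1] = share
    result[to_month - 1] += remainder
    return result

row_00001 = split_by_month(2, 6, 132)   # Feb..Jun
row_00002 = split_by_month(3, 12, 59)   # Mar..Dec
```

Both results match the rows returned by the final SELECT on destination.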

Selecting the (number of) columns dynamically using linq to sql

I have a table like this
Student Exam p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12
-----------------------------------------------------------------------------------
100 unit1 89 56 59 28 48 38 0 0 0 0 0 0
100 unit2 89 56 59 0 0 0 0 0 0 0 0 0
100 unit3 89 56 59 28 48 38 0 0 0 0 0 0
100 unit4 89 56 59 28 48 0 0 0 0 0 0 0
another table
Exam Num_subjects
----------------------
unit1 6
unit2 3
unit3 6
unit4 5
Now I need to select only the first 8 columns of the marks table for unit1, as the number of subjects for unit1 is 6. How can I do this dynamically?
Exam is a foreign key to the marks table in LINQ to SQL. Any ideas?
If you have a column-based design, then since L2S doesn't let you manually materialize (i.e. new MyTable { Foo = row.Foo /* omit some */ }), you are a bit scuppered.
If you just want the data, you could use something like "dapper" which won't have this issue, but you'll need to write the TSQL yourself, i.e.
var rows = conn.Query<MyTable>("select p1, p2, p3, p4, p5 from MyTable where Exam=@exam",
new { exam }).ToList();
But ultimately, I think I'd prefer a different db schema here...
there would be no need for a dynamic query, if those tables were normalized. (check your design)
if you really want to do this dynamically, you'll need an expression tree that handles the select part of your query ... here you can find some more details about a dynamic query lib that can handle that expression tree generation for you (you can provide a string like "new(p1,p2,p3)" and that gets translated to an expression tree)
Here you don't need to use LINQ; you can do it with plain logic.
First get Num_Subjects, e.g. 6 for unit1.
DataTable dt = [whole_table];
int counter = Num_Subjects + 1; //7
string colName = "P" + counter.ToString(); //P7
while(dt.Columns.Contains(colName))
{
dt.Columns.Remove(colName);
colName = "P" + (++counter).ToString();
}
In the end you get a table with columns up to P6; the rest of the columns are deleted.
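As a rough illustration of the same trimming idea without a DataTable, the sketch below keeps Student, Exam and only the first N mark columns, where N comes from the second table. It is written in Python purely to show the logic; `trim_marks` and the `num_subjects` mapping are hypothetical names, and each row is assumed to be a plain dictionary.

```python
def trim_marks(row, num_subjects):
    """Keep Student, Exam and p1..pN, where N is the subject count of the row's exam."""
    count = num_subjects[row["Exam"]]
    wanted = ["Student", "Exam"] + [f"p{i}" for i in range(1, count + 1)]
    return {column: row[column] for column in wanted}

# Subject counts from the second table in the question.
num_subjects = {"unit1": 6, "unit2": 3, "unit3": 6, "unit4": 5}

# The unit2 row from the marks table: three real marks, then zero padding.
marks = [89, 56, 59, 0, 0, 0, 0, 0, 0, 0, 0, 0]
row = {"Student": 100, "Exam": "unit2"}
row.update({f"p{i}": mark for i, mark in enumerate(marks, start=1)})

trimmed = trim_marks(row, num_subjects)
```

Building the list of wanted columns first, instead of removing unwanted ones, leaves the original row untouched, which helps when different exams need different widths.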

LINQ to SQL sum null value

I have the following query, and I'd like to sum the NULL values as well. Some TimeSheets don't have records in TimeRecord, and some tr.TimeIn and tr.TimeOut values are NULL.
The query selects only TimeSheets that have records in TimeRecord. How can I have it select everything and sum up the NULL values as well, so that the SUM of NULL is just zero?
Table relationship:
Student 1:N TimeSheet (FK StudentId)
TimeSheet 1:N TimeRecord (FK TimeSheetId)
TimeIn and TimeOut are DateTime type and nullable.
Query 1: Monthly Report:
Dim query = From ts In db.TimeSheets _
Join tr In db.TimeRecords On tr.TimeSheetId Equals ts.TimeSheetId _
Where ts.IsArchive = False And ts.IsCompleted = False And tr.TimeOut IsNot Nothing _
Group By key = New With {ts.Student, .MonthYear = (tr.TimeOut.Value.Month & "/" & tr.TimeOut.Value.Year)} Into TotalHour = Sum(DateDiffSecond(tr.TimeIn, tr.TimeOut)) _
Select key.Student.StudentId, key.Student.AssignedId, key.MonthYear, TotalHour
Query 2: Total TimeRecord for Student with Active TimeSheet:
Dim query = From ts In db.TimeSheets _
Join tr In db.TimeRecords On tr.TimeSheetId Equals ts.TimeSheetId _
Where ts.IsArchive = False And ts.IsCompleted = False _
Group By ts.StudentId, tr.TimeSheetId Into TotalTime = Sum(DateDiffSecond(tr.TimeIn, tr.TimeOut)) _
Select StudentId, TimeSheetId, TotalTime
Here's the result of the query 2:
734 -- 159 : 9 hrs 35 mm 28 sec
2655 -- 160 : 93 hrs 33 mm 50 sec
1566 -- 161 : 37 hrs 23 mm 53 sec
3114 -- 162 : 25 hrs 0 mm 21 sec
Wanted result of Query 2:
733 -- 158 : 0 hr 0mm 0 sec
734 -- 159 : 9 hrs 35 mm 28 sec
736 -- 169 : 0 hrs 0mm 0sec
2655 -- 160 : 93 hrs 33 mm 50 sec
1566 -- 161 : 37 hrs 23 mm 53 sec
3114 -- 162 : 25 hrs 0 mm 21 sec
Same for Query 1 but it makes monthly report.
I apologise because I translated your query to C# before tweaking it, and I don’t really know the VB syntax well enough to translate it back, but I hope that you will be able to. I tried the following query and it does what you asked for:
var query = from st in Students
select new
{
st.StudentId,
st.AssignedId,
TotalHour = (
from ts in TimeSheets
where ts.StudentId == st.StudentId
join tr in TimeRecords on ts.TimeSheetId equals tr.TimeSheetId
where !ts.IsArchive && !ts.IsCompleted && tr.TimeOut != null
select (tr.TimeOut.Value - tr.TimeIn).TotalHours
).Sum()
};
I had to remove the MonthYear thing because I didn’t really understand how that fit in with your grouping, but since it’s not in the output, I suspected that maybe you don’t need it.
I had to make a few assumptions:
I am assuming that TimeOut is a DateTime? (nullable) while TimeIn is DateTime (non-nullable). I think that makes sense.
I am assuming that TimeSheets have a StudentId that links them to students.
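The underlying idea is to start from the parent rows and sum whatever children exist, so that a parent without children ends up with 0 instead of disappearing. A minimal Python sketch of that idea follows; it is not LINQ, it totals per time sheet rather than per student, and the names `total_seconds_per_sheet`, `time_sheets` and `time_records` are assumptions made for the example.

```python
from datetime import datetime

def total_seconds_per_sheet(time_sheets, time_records):
    """Return {sheet_id: total seconds}, with 0.0 for sheets that have no usable records.

    time_records holds (sheet_id, time_in, time_out) tuples; a record with
    a missing time_in or time_out contributes nothing to the total.
    """
    totals = {sheet_id: 0.0 for sheet_id in time_sheets}
    for sheet_id, time_in, time_out in time_records:
        if sheet_id in totals and time_in is not None and time_out is not None:
            totals[sheet_id] += (time_out - time_in).total_seconds()
    return totals

# Made-up sample: sheet 158 has no records, sheet 159 has one complete and one open record.
time_sheets = [158, 159]
time_records = [
    (159, datetime(2020, 1, 1, 8, 0), datetime(2020, 1, 1, 9, 0)),
    (159, datetime(2020, 1, 2, 8, 0), None),
]
totals = total_seconds_per_sheet(time_sheets, time_records)
```

Seeding the totals from the list of sheets is what makes sheet 158 show up with zero, which is the behaviour an inner join loses.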
