DynamoDB Data Modeling - Hierarchical Data Structures as Items - C#

The access pattern I'm interested in is getting the last item for a given exchange and account name.
+-------------------------------+-------------------------------------------+---------+--------------------------------------------+------------+----------+------------+---------+-----------+----------------+--------------------------------------------------------------------+---------------------------+----------------------------------+--------------+------------------------------+
| PK | SK | Account | Address | AddressTag | Exchange | Instrument | Network | Quantity | TransactionFee | TransactionId | TransferDate | TransferId | TransferType | UpdatedAt |
+-------------------------------+-------------------------------------------+---------+--------------------------------------------+------------+----------+------------+---------+-----------+----------------+--------------------------------------------------------------------+---------------------------+----------------------------------+--------------+------------------------------+
| Exchange#Binance#Account#main | TransferDate#12/17/2022 4:59:12 PM +02:00 | main | 0xF76d3f20bF155681b0b983bFC3ea5fe43A2A6E3c | null | Binance | USDT | ETH | 97.500139 | 3.2 | 0x46d28f7d0e1e5b1d074a65dcfbb9d90b3bcdc7e6fca6b1f1f7abb5ab219feb24 | 2022-12-17T16:59:12+02:00 | 1b56485f6a3446c3b883f4f485039260 | 0 | 2023-01-28T20:19:59.9181573Z |
| Exchange#Binance#Account#main | TransferDate#12/17/2022 5:38:23 PM +02:00 | main | 0xF76d3f20bF155681b0b983bFC3ea5fe43A2A6E3c | null | Binance | USDT | ETH | 3107.4889 | 3.2 | 0xbb2b92030b988a0184ba02e2e754b7a7f0f963c496c4e3473509c6fe6b54a41d | 2022-12-17T17:38:23+02:00 | 4747f6ecc74f4dd8a4b565e0f15bcf79 | 0 | 2023-01-28T20:20:00.4536839Z |
| Exchange#FTX#Account#main | TransferDate#12/17/2021 5:38:23 PM +02:00 | main | 0x476d3f20bF155681b0b983bFC3ea5fe43A2A6E3c | null | FTX | USDT | ETH | 20 | 3.2 | 0xaa2b92030b988a0184ba02e2e754b7a7f0f963c496c4e3473509c6fe6b54a41d | 2021-12-17T17:38:23+02:00 | 4747f6ecc74f4dd8a4b565e0f15bcf79 | 0 | 2023-01-28T20:20:00.5723855Z |
| Exchange#FTX#Account#main | TransferDate#12/19/2022 4:59:12 PM +02:00 | main | 0xc46d3f20bF155681b0b983bFC3ea5fe43A2A6E3c | null | FTX | USDT | ETH | 15 | 3.2 | 0xddd28f7d0e1e5b1d074a65dcfbb9d90b3bcdc7e6fca6b1f1f7abb5ab219feb24 | 2022-12-19T16:59:12+02:00 | 1b56485f6a3446c3b883f4f485039260 | 0 | 2023-01-28T20:20:00.5207119Z |
+-------------------------------+-------------------------------------------+---------+--------------------------------------------+------------+----------+------------+---------+-----------+----------------+--------------------------------------------------------------------+---------------------------+----------------------------------+--------------+------------------------------+
First of all, it seems to work as expected, but as I'm still learning, I'm not sure whether the partition key and sort key I chose are good enough. This matters because "Uneven distribution of data due to the wrong choice of partition key" can cause reads and writes to exceed the throughput limit.
There was a similar example in the documentation, and this is what it says about using TransactionId as a partition key:
In most cases you won’t use TransactionID for any query purposes, so you lose the ability to use the partition key to perform a fast lookup of data. To expand this reasoning, consider the traditional order history view on an e-commerce site. Normally orders are retrieved by customer ID or Order ID, not a UID such as a transaction ID that was synthetically generated during checkout. It’s better to choose a natural partition key than generate a synthetic one that won’t be used for querying.
Another interesting part of the documentation is about composite sort keys:
Composite sort keys let you define hierarchical (one-to-many) relationships in your data that you can query at any level of the hierarchy
[country]#[region]#[state]#[county]#[city]#[neighborhood]
This would let you make efficient range queries for a list of locations at any one of these levels of aggregation, from country, to a neighborhood, and everything in between.
I'm also interested in the "Get all user transfers by date range" access pattern but I'm not sure how I could achieve it. So here we are.
C# implementation
public async Task<UserTransferDto?> GetLastAsync(string exchange, string account)
{
    var queryRequest = new QueryRequest
    {
        TableName = TableName,
        KeyConditionExpression = "#pk = :pk",
        ExpressionAttributeNames = new Dictionary<string, string>
        {
            { "#pk", "PK" }
        },
        ExpressionAttributeValues = new Dictionary<string, AttributeValue>
        {
            { ":pk", new AttributeValue { S = $"Exchange#{exchange}#Account#{account}" } }
        },
        ScanIndexForward = false,
        Limit = 1
    };
    var response = await _dynamoDb.QueryAsync(queryRequest);
    if (response.Items.Count == 0)
    {
        return null;
    }
    var itemAsDocument = Document.FromAttributeMap(response.Items[0]);
    return JsonSerializer.Deserialize<UserTransferDto>(itemAsDocument.ToJson());
}
public class UserTransferDto
{
    [JsonPropertyName("PK")]
    public string Pk => $"Exchange#{Exchange}#Account#{Account}";
    [JsonPropertyName("SK")]
    public string Sk => $"TransferDate#{TransferDate}";
    public required string Exchange { get; init; }
    public required string Account { get; init; }
    public required DateTimeOffset TransferDate { get; init; }
    public required string TransferId { get; init; }
    public required TransferType TransferType { get; init; }
    public required string Instrument { get; init; }
    public required string Network { get; init; }
    public required decimal Quantity { get; init; }
    public required string Address { get; init; }
    public string? AddressTag { get; init; }
    public decimal? TransactionFee { get; init; }
    public string? TransactionId { get; init; }
    public DateTime UpdatedAt { get; set; }
}
public enum TransferType
{
    Withdraw = 0,
    Deposit = 1
}
Sources:
https://youtu.be/HaEPXoXVf2k?t=720
https://youtu.be/HaEPXoXVf2k?t=798
Hierarchical Data Structures as Items https://youtu.be/HaEPXoXVf2k?t=2775
Access Patterns https://youtu.be/HaEPXoXVf2k?t=2903
https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-sort-keys.html

Your base table design works well for getting the latest item for a given exchange and account (a Query on that PK that reads the SK in descending order and takes the first item), except that your sort keys use human-readable timestamps such as 12/17/2022 4:59:12 PM +02:00, which do not sort chronologically as strings. Use a fixed-width format such as 2023-01-28 12:56:08, in a single time zone (for example UTC), so that the times sort correctly as strings.
For the other query, finding the latest item across all exchanges and accounts, you can create a GSI that has a single constant PK value and the timestamp as the SK. Just beware that writes to any one PK value are limited: above 1,000 write units per second you'll need to shard the key, query each shard for its latest item, and then pick the latest overall.
This is a pattern described in https://youtu.be/0iGR8GnIItQ
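As a rough illustration of the sortable-timestamp advice, here is a minimal sketch of key builders. The TransferKeys class and its method names are hypothetical (they are not part of the question's code), and normalizing to UTC is an assumption about how you want to compare transfers made with different offsets.

```csharp
using System;
using System.Globalization;

public static class TransferKeys
{
    // Partition key layout taken from the question.
    public static string PartitionKey(string exchange, string account) =>
        $"Exchange#{exchange}#Account#{account}";

    // Hypothetical helper: formats the sort key as a fixed-width UTC
    // ISO 8601 timestamp, so ordinal string order matches time order.
    public static string SortKey(DateTimeOffset transferDate) =>
        "TransferDate#" + transferDate.UtcDateTime.ToString(
            "yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
}
```

With keys in this shape, a date-range query for one exchange and account could use a KeyConditionExpression such as "#pk = :pk AND #sk BETWEEN :from AND :to", where :from and :to are built with the same SortKey helper.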

Related

Populate object from dataset when having list in list C#

I have this data available from the database:
|----------------|--------------|----------------|----------------|----------------|----------------|
| entity_name | provider_id | provider_name | product_id | product_name | country_name |
|----------------|--------------|----------------|----------------|----------------|----------------|
| test | 123 | Provider1 | 1 | Product1 | Russia |
|----------------|--------------|----------------|----------------|----------------|----------------|
| test | 123 | Provider1 | 2 | Product2 | Spain |
|----------------|--------------|----------------|----------------|----------------|----------------|
| test | 123 | Provider1 | 3 | Product3 | France |
|----------------|--------------|----------------|----------------|----------------|----------------|
| test | 456 | Provider2 | 3 | Product3 | France |
|----------------|--------------|----------------|----------------|----------------|----------------|
| test | 123 | Provider1 | 4 | Product4 | France |
And I have to map it to this model in C#:
public class EntityModel
{
public string EntityName { get; set; }
public List<ProviderModel> Providers { get; set; }
}
public class ProviderModel
{
public int ProviderID { get; set; }
public string ProviderName { get; set; }
public List<ProductModel> Products { get; set; }
public ProviderModel() { }
}
public class ProductModel
{
public int ProductID { get; set; }
public string ProductName { get; set; }
public string CountryName { get; set; }
public ProductModel() { }
}
So basically I have to group by providers, for every provider I have to show the products:
{
"entityName": "string",
"providers": [
{
"ProviderID": 0,
"ProviderName": "string",
"products": [
{
"productId": 0,
"productName": "string",
"countryName": "string"
}
]
}
]
}
The sp that returns the data is like this:
SELECT DISTINCT
e.entity_name
,pro.provider_id
,pro.provider_name
,p.product_id
,p.product_name
,pc.country_name
FROM provider_product_table ppt
INNER JOIN product p ON ppt.product_id = p.product_id
INNER JOIN product_parent pp on pp.product_parent_id=p.product_parent_id
INNER JOIN provider pro ON pro.provider_id = ppt.provider_id
INNER JOIN product_country pc on pc.product_id=p.product_id
INNER JOIN entity e on e.product_parent_id=pp.product_parent_id
WHERE p.product_parent_id = @product_parent_id
ORDER BY p.product_id ASC
I tried a lot of groupBy versions but I get stuck at mapping the second list, the products one.
How can I achieve this? Thank you !
Let's start by saying that your database is not well structured, or at least was not designed for this need. It would be better to separate your data into three tables (entity, provider, product), with a relationship between entity and provider and another between provider and product.
However, maybe you are developing a new feature that was not planned for originally, and you want to make your code or even your database more extensible.
In that case I need to see the GroupBy query that you have already written.
Meanwhile, I can imagine what you need, and there are several solutions.
The first is what you are trying to do now: write several queries that fill your classes. I expect this will return redundant rows, because some IDs are duplicated due to your database structure, so you will need DISTINCT or INTERSECT (even if INTERSECT is not what you need here).
The second is to use temporary tables with SELECT ... INTO #tmpTable if you are using SQL Server: create temporary tables (the tables I proposed above) and run your queries against them.
Please edit your question to include the queries you describe.
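If the flat result set is kept as it is, the two-level mapping the question asks about can also be done in memory with a nested GroupBy. The following is only a sketch: FlatRow and EntityMapper are hypothetical names for a row type that mirrors the stored procedure's columns, and the model classes are repeated from the question so that the example compiles on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type with one property per column of the result set.
public record FlatRow(string EntityName, int ProviderId, string ProviderName,
                      int ProductId, string ProductName, string CountryName);

public class ProductModel
{
    public int ProductID { get; set; }
    public string ProductName { get; set; } = "";
    public string CountryName { get; set; } = "";
}

public class ProviderModel
{
    public int ProviderID { get; set; }
    public string ProviderName { get; set; } = "";
    public List<ProductModel> Products { get; set; } = new();
}

public class EntityModel
{
    public string EntityName { get; set; } = "";
    public List<ProviderModel> Providers { get; set; } = new();
}

public static class EntityMapper
{
    // Outer GroupBy: one EntityModel per entity name.
    // Inner GroupBy: one ProviderModel per provider within that entity.
    public static List<EntityModel> Map(IEnumerable<FlatRow> rows) =>
        rows.GroupBy(r => r.EntityName)
            .Select(entity => new EntityModel
            {
                EntityName = entity.Key,
                Providers = entity
                    .GroupBy(r => (r.ProviderId, r.ProviderName))
                    .Select(provider => new ProviderModel
                    {
                        ProviderID = provider.Key.ProviderId,
                        ProviderName = provider.Key.ProviderName,
                        Products = provider.Select(r => new ProductModel
                        {
                            ProductID = r.ProductId,
                            ProductName = r.ProductName,
                            CountryName = r.CountryName
                        }).ToList()
                    }).ToList()
            }).ToList();
}
```

The inner key is a value tuple, so rows that share both the provider ID and the provider name fall into the same group, and each group's rows become that provider's product list.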

LINQ GroupBy for multiple Column

I am stuck in logic to group by. I have a model that has all information but I have to group information according to CustomerBuildingMapping
public class TicketsDataModel
{
public string BuildingID { get; set; }
public string Ticket { get; set; }
public string Amount { get; set; }
public string CustomerID { get; set; }
public string BuildingName { get; set; }
}
Current Data
BuildingID | Ticket | Amount |CustomerID | BuildingName
10 | 001 | 50 | 1 | JP Building
11 | 002 | 45 | 1 | Tiskon
52 | 452 | 35 | 2 | Lalit
65 | 568 | 78 | 2 | Tulip
41 | 121 | 12 | 1 | BK Trp
-
public class CustomerBuildingMapping
{
public long LeadID { get; set; }
public string CustomerID { get; set; }
public List<BuildingInfo> BuildingInfo{ get; set; }
}
public class BuildingInfo
{
public string BuildingID { get; set; }
public string TicketNumber { get; set; }
public long Amount { get; set; }
public string BuildingName { get; set; }
}
Expected Data after group by
LeadID 1001
CustomerID 1
BuildingInfo
BuildingID | Ticket | Amount | BuildingName
10 | 001 | 50 | JP Building
11 | 002 | 45 | Tiskon
41 | 121 | 12 | BK Trp
LeadID 1002
CustomerID 2
BuildingInfo
BuildingID | Ticket | Amount | BuildingName
52 | 452 | 35 | Lalit
65 | 568 | 78 | Tulip
I have written this code but not able to group by for multiple columns.
List<CustomerBuildingMapping> objCustomerBuildingMappingResult = objTicketsForTheDayInfo.TicketsForTheDay.GroupBy(l => l.CustomerID).Select(grp => new CustomerBuildingMapping
{
CustomerID = grp.Key,
//BuildingInfo = grp.Select(l => l.BuildingID).ToList(),
}).ToList();
You do not need to group by multiple columns. Based on sample data you are only grouping by one field, CustomerID.
var objCustomerBuildingMappingResult = objTicketsForTheDayInfo.TicketsForTheDay
    .GroupBy(l => l.CustomerID)
    .Select(grp => new CustomerBuildingMapping
    {
        CustomerID = grp.Key,
        // LeadID is derived from the customer ID, matching the expected output (1 -> 1001).
        LeadID = long.Parse(grp.Key) + 1000,
        BuildingInfo = grp.Select(l => new BuildingInfo
        {
            BuildingID = l.BuildingID,
            TicketNumber = l.Ticket,
            // Amount is a string in TicketsDataModel but a long in BuildingInfo.
            Amount = long.Parse(l.Amount),
            BuildingName = l.BuildingName
        }).ToList(),
    }).ToList();
As Nkosi has pointed out, in OP's example, because LeadId is an automatically generated surrogate key, there's no need to GroupBy on anything other than a single key field (CustomerID), and make a function to generate the surrogate LeadId.
However, in the more general case, if a composite key needs to be constructed for a GroupBy, then both kinds of tuple (System.Tuple in older versions of C#, and System.ValueTuple in C# 7 and later) make good transient grouping keys for .GroupBy and .ToDictionary. This is because tuples implement value-based equality and build the instance's hash code by combining the hash codes of the contained values.
For the older System.Tuple, you need to deal with the ugly ItemN properties:
var objCustomerBuildingMappingResult = objTicketsForTheDayInfo
.TicketsForTheDay
.GroupBy(l => Tuple.Create(l.CustomerID, l.BuildingID))
.Select(grp => new CustomerBuildingMapping
{
CustomerID = grp.Key.Item1,
BuildingId = grp.Key.Item2,
// ...
})
.ToList();
But this is more readable (and performant) with System.ValueTuple:
var objCustomerBuildingMappingResult = objTicketsForTheDayInfo
    .TicketsForTheDay
    .GroupBy(l => (CustomerId: l.CustomerID, BuildingId: l.BuildingID))
    .Select(grp => new CustomerBuildingMapping
    {
        CustomerID = grp.Key.CustomerId,
        BuildingId = grp.Key.BuildingId,
        // ...
    })
    .ToList();

Get data in user-defined type list from SQL Server

I have a SQL Server 'mapping' table, MappingACVP, that links four tables:
Author_ID | CoAuthor_ID | Venue_ID | Paper_ID | Year
----------------------------------------------------
677 | 42700 | 64309 | 812229 | 2005
677 | 42700 | 64309 | 812486 | 2005
677 | 42700 | 64309 | 818273 | 2005
677 | 42700 | 65182 | 812229 | 2005
... | ... | ... | ... | ...
... | ... | ... | ... | ...
Using Entity Framework, I generated code for all four tables included in the Entity Framework model, i.e. Authors.cs, CoAuthors_New.cs, Venues_New.cs and Papers_New.cs. Every table other than Authors has Author_ID as a foreign key.
The auto-generated code for class Authors.cs is as:
public partial class Authors
{
public Authors()
{
this.CoAuthors_New = new HashSet<CoAuthors_New>();
this.Papers_New = new HashSet<Papers_New>();
this.Venues_New = new HashSet<Venues_New>();
}
public int id { get; set; }
public int Author_ID { get; set; }
public string Author_Name { get; set; }
public virtual ICollection<CoAuthors_New> CoAuthors_New { get; set; }
public virtual ICollection<Papers_New> Papers_New { get; set; }
public virtual ICollection<Venues_New> Venues_New { get; set; }
}
Now suppose I declare a list as:
List<Authors> _eAuthors = new List<Authors>();
and want to fill it with values from the database (using the MappingACVP table), even though the Authors class has no Year property.
How can I fill the _eAuthors list from the database?

Entity / Code First / SQL Database - Same parameter but differents values

Context
I'm a beginner with Entity Framework Code First and SQL databases.
I would like to build a table in which several columns hold the same parameter:
-----------------------------------------
| ID | F1 | F2 | F3 | F4 | F5 | F6 | F7 |
-----------------------------------------
| 1 |1000| 500| 250| 100| 50 | | |
| 2 | 500| 250| 125| | | | |
| 3 | 250| 125| 100| | | | |
| 4 | 200| 100| | | | | |
| 5 | 100| 50 | | | | | |
-----------------------------------------
F1 to F7 are a list of the same parameter with different values.
Questions
What's the correct way to describe it with Code First in mind?
I suppose this way isn't correct...
public class FsTable
{
public int Id { get; set; }
public double F1 { get; set; }
public double F2 { get; set; }
public double F3 { get; set; }
public double F4 { get; set; }
public double F5 { get; set; }
public double F6 { get; set; }
public double F7 { get; set; }
}
What's the correct way to fill this table with the values above?
First of all, think about the columns without a value. Should they be nullable or not? You can seed your database from the Seed method.
If you are using a console application, you need to invoke some operation on the database so that it gets created.
P.S. Check out the Repository pattern.
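To make the nullability point concrete, here is a small sketch that assumes the empty cells in the table mean "no value". FsRow and FsSeedData are hypothetical names; with Code First, a double? property is normally mapped to a nullable column, while a plain double is mapped to a non-nullable one.

```csharp
using System.Collections.Generic;

// Hypothetical entity: each F column is nullable so that an empty
// cell can be stored as NULL instead of a placeholder such as 0.
public class FsRow
{
    public int Id { get; set; }
    public double? F1 { get; set; }
    public double? F2 { get; set; }
    public double? F3 { get; set; }
    public double? F4 { get; set; }
    public double? F5 { get; set; }
    public double? F6 { get; set; }
    public double? F7 { get; set; }
}

public static class FsSeedData
{
    // The rows from the question's table; properties that are not
    // set keep their default value, which is null for double?.
    public static List<FsRow> Rows() => new List<FsRow>
    {
        new FsRow { Id = 1, F1 = 1000, F2 = 500, F3 = 250, F4 = 100, F5 = 50 },
        new FsRow { Id = 2, F1 = 500, F2 = 250, F3 = 125 },
        new FsRow { Id = 3, F1 = 250, F2 = 125, F3 = 100 },
        new FsRow { Id = 4, F1 = 200, F2 = 100 },
        new FsRow { Id = 5, F1 = 100, F2 = 50 },
    };
}
```

A list like this could be handed to the context from the Seed method; how the Id values are generated depends on your key configuration.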

Linking txt file info with other file

I'm creating my first app, a student grade book, and now I have a problem. I have a txt file full of student info (one student per line) and I need to link each student with their marks (they are in another file). How would you suggest doing it? Maybe I need to make a structure with the student name and a 2D array?
P.S. Sorry for my bad English.
Here's what I mean:
Student name lastname
// ---------------------------------------------------------
// |Science |1 |2 |3 |4 |5 |6 |7 |8 |9 |10|11|12|13|14|15|
// ---------------------------------------------------------
// |Maths | | | | | | | | | | | | | | | |
// |English | | | | | | | | | | | | | | | |
1, 2, 3, 4, etc. are the student grade indexes.
A suggestion:
public class Student
{
public Student()
{
Marks = new List<StudentMark>();
}
public string Name { get; set; }
public List<StudentMark> Marks { get; set; }
public void Load(string line)
{
string[] parts = line.Split(' ');
Name = parts[0];
//Other properties, if any:
//LastName = parts[1];
}
}
public class StudentMark
{
public float Mark { get; set; }
public string Lesson { get; set; }
}
I am unsure how complex your program is, but would it be an option to create a Student object in which you save all of the student's information, including name and average grade? This would also allow you to save other relevant information. Otherwise, yes, it would be an option to look into the Dictionary class of the C# language. http://msdn.microsoft.com/en-us/library/xfhwa508.aspx
Do you have control over how the separate student and score files are generated? If so, can you assign a unique ID to each student and store it in both files? This would provide the link between the two. You could add the ID as a member of @mahmoodvcs's solution, then add logic to parse the data by ID to get the relevant scores for each student.
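The ID-based link could look roughly like this. It is only a sketch: the two line formats are assumptions, since the original post does not show the real file layouts, and the GradeBook class and the Id property are hypothetical additions to the Student class suggested above.

```csharp
using System.Collections.Generic;
using System.Globalization;

public class StudentMark
{
    public string Lesson { get; set; } = "";
    public float Mark { get; set; }
}

public class Student
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public List<StudentMark> Marks { get; } = new List<StudentMark>();
}

public static class GradeBook
{
    // Assumed line formats (not from the original post):
    //   students file: "<id> <name> <lastname>"
    //   marks file:    "<id>;<lesson>;<mark>"
    public static Dictionary<int, Student> Load(
        IEnumerable<string> studentLines, IEnumerable<string> markLines)
    {
        var students = new Dictionary<int, Student>();
        foreach (var line in studentLines)
        {
            // Split into the id and the rest of the line (the full name).
            var parts = line.Split(' ', 2);
            var id = int.Parse(parts[0], CultureInfo.InvariantCulture);
            students[id] = new Student { Id = id, Name = parts[1] };
        }
        foreach (var line in markLines)
        {
            var parts = line.Split(';');
            var id = int.Parse(parts[0], CultureInfo.InvariantCulture);
            // Marks whose id matches no student are skipped.
            if (students.TryGetValue(id, out var student))
            {
                student.Marks.Add(new StudentMark
                {
                    Lesson = parts[1],
                    Mark = float.Parse(parts[2], CultureInfo.InvariantCulture)
                });
            }
        }
        return students;
    }
}
```

With real files, the two arguments could come from File.ReadLines("students.txt") and File.ReadLines("marks.txt"), where both file names are placeholders.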
