Algorithm to get short path between given currency exchange rates - c#

I'm working on an integration process that has to convert a list of values, each in its own currency, to one given target currency.
The process uses two files: one containing the exchange rates and the other containing the prices with their origin currency.
The exchange rates file looks like this:
Text:USDtoEUR;Origin:USD;Destination:EUR;Value:0.7
Text:EURtoCAD;Origin:EUR;Destination:CAD;Value:0.5
The file containing the prices with the origin currency (and also the target currency) looks like this:
Index:0;TargetCurrency:CAD
Index:1;Description:Product1;Value:150;Currency:EUR
Index:2;Description:Product2;Value:3;Currency:USD
For this specific case there is no direct way to convert from USD to CAD, so I need to first convert it to another currency present in the file that has CAD exchange rate (EUR) and then convert it to CAD.
This is a very basic scenario, but I'm guessing those files can contain more complex ones, where maybe it's required to convert 2 or 3 times before reaching the target currency.
What I'm planning to do is insert the content of the exchange rates file into a SQL Server table and then start a very manual process of looking up records containing the target currency... but I've never faced this scenario and don't know if this would be an acceptable approach in terms of speed/performance. That's why I'm wondering if there is a standard algorithm or data structure best suited for this process.
I will appreciate your help

If you need to take the conversion rates into account to find an optimal conversion path, you would use the Bellman-Ford algorithm. Because rates multiply along a path while shortest-path algorithms add edge weights, use -log(rate) as the weight of each edge.
But if what matters is the number of conversions (visiting the fewest nodes, regardless of the conversion cost), use BFS (breadth-first search), which finds the path with the fewest hops between two nodes (two currencies) in the graph. DFS will also find a path if one exists, but not necessarily the shortest one.
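A minimal sketch of the BFS option, assuming the rates file is read line by line in the format shown in the question. The class and method names (CurrencyGraph, Build, FindRate) are made up for illustration, and only the directions listed in the file are used; adding the inverse rate (1/rate) for each line would be a further assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CurrencyGraph
{
    // Builds an adjacency map from lines such as
    // "Text:USDtoEUR;Origin:USD;Destination:EUR;Value:0.7".
    public static Dictionary<string, Dictionary<string, decimal>> Build(IEnumerable<string> lines)
    {
        var graph = new Dictionary<string, Dictionary<string, decimal>>();
        foreach (string line in lines)
        {
            var fields = new Dictionary<string, string>();
            foreach (string part in line.Split(';'))
            {
                string[] kv = part.Split(new[] { ':' }, 2);
                fields[kv[0]] = kv[1];
            }
            decimal rate = decimal.Parse(fields["Value"], CultureInfo.InvariantCulture);
            if (!graph.TryGetValue(fields["Origin"], out var edges))
                graph[fields["Origin"]] = edges = new Dictionary<string, decimal>();
            edges[fields["Destination"]] = rate;
        }
        return graph;
    }

    // Breadth-first search: returns the combined multiplier along a path with
    // the fewest conversions, or null when the target cannot be reached.
    public static decimal? FindRate(
        Dictionary<string, Dictionary<string, decimal>> graph, string from, string to)
    {
        if (from == to) return 1m;
        var visited = new HashSet<string> { from };
        var queue = new Queue<(string Currency, decimal Rate)>();
        queue.Enqueue((from, 1m));
        while (queue.Count > 0)
        {
            var (currency, rate) = queue.Dequeue();
            if (!graph.TryGetValue(currency, out var edges)) continue;
            foreach (var edge in edges)
            {
                if (!visited.Add(edge.Key)) continue; // already reached in fewer hops
                decimal combined = rate * edge.Value;
                if (edge.Key == to) return combined;
                queue.Enqueue((edge.Key, combined));
            }
        }
        return null;
    }
}
```

With the two sample lines from the question, FindRate(graph, "USD", "CAD") gives 0.7 * 0.5 = 0.35, so Product2 (3 USD) becomes 1.05 CAD. A graph of a few hundred currencies fits in memory easily, so no database table is needed for the search itself.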

Related

OutputBuffer not working for large c# list

I'm currently using SSIS to make an improvement to a project. I need to insert single documents into a MongoDB collection of type Time Series. At some point I want to retrieve rows of data after going through a C# transformation script. I did this:
foreach (BsonDocument bson in listBson)
{
    OutputBuffer.AddRow();
    OutputBuffer.DatalineX = (string) bson.GetValue("data");
}
But this piece of code, which works great with a small file, does not work with a 6-million-line file. That is, there are no rows in the output. The following tasks validate but behave as if they had received nothing as input.
Where could the problem come from?
Your OutputBuffer has DatalineX defined as a string, either DT_STR or DT_WSTR, with a specific length. When you exceed that length, things go bad. In normal strings, you'd have a maximum length of 8k or 4k characters respectively.
Neither of which is useful for your use case of at least 6M characters. To handle that, you'll need to change your data type to DT_TEXT/DT_NTEXT. Those data types do not require a length as they are "max" types. There are lots of things to be aware of when using the LOB types:
Performance can suck depending on whether SSIS can keep the data in memory (good) or has to write intermediate values to disk (bad)
You can't readily manipulate them in a data flow
You'll use a different syntax in a Script Component to work with them
e.g.
// TODO: convert to bytes
Output0Buffer.DatalineX.AddBlobData(bytes);
A longer example, of questionable accuracy with regard to encoding the bytes (which you get to solve), is at https://stackoverflow.com/a/74902194/181965
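For the "convert to bytes" step, here is a small sketch under the assumption that the column is DT_NTEXT, which holds UTF-16 little-endian text (what Encoding.Unicode produces). The helper name BlobEncoding.ForNText is made up for illustration; a DT_TEXT column would need the column's code page instead.

```csharp
using System.Text;

public static class BlobEncoding
{
    // Assumption: the output column is DT_NTEXT, so the blob is UTF-16 LE text.
    public static byte[] ForNText(string value)
    {
        return Encoding.Unicode.GetBytes(value ?? string.Empty);
    }
}

// Inside the Script Component, using the buffer and column names from the answer:
//   Output0Buffer.AddRow();
//   Output0Buffer.DatalineX.AddBlobData(
//       BlobEncoding.ForNText((string)bson.GetValue("data")));
```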

Conversion to data format for crfsharp ...

I have a review data set of about 250000 hotel reviews. I'm planning to extract aspects from it using the crfsharp dll; however, the data that I have is in normal text paragraph form and I need to convert it into the crfsharp format so I can train and test on the data to extract aspects. Can someone tell me the best way to do that? I was thinking of writing a small program for the data format conversion.
Another thing I was wondering is whether CRFSharp can do aspect extraction using the CRF models it has. I'm using C#.
What features and tags will you use in your task?
Here is a simple example. Take the sentence "! Tokyo and New York are major financial centers." If you want to extract location names from it and your only feature is the token string, you can generate the training corpus as below:
! NOR
Tokyo LOCATION
and NOR
New LOCATION
York LOCATION
are NOR
major NOR
financial NOR
centers NOR
. NOR
The first column is the term from the sentence and the second column is the corresponding tag. NOR means a normal term, LOCATION means a location name. You can generate the training corpus in the above format and use CRFSharp to train a model.
For more complex examples, such as more features, templates, or adding the word position to the tags, you can refer to another example on the CRFSharp home page (http://crfsharp.codeplex.com).
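The "small program" mentioned in the question could start from something like the sketch below. It assumes one token per line, a tab between token and tag, and a blank line after each sentence (the usual CRF++-style layout; check the CRFSharp documentation for the exact separator). The set of location tokens is only a stand-in for real manual annotations, and CrfCorpusWriter is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class CrfCorpusWriter
{
    // Splits a sentence into words and single punctuation marks.
    public static List<string> Tokenize(string sentence)
    {
        var tokens = new List<string>();
        foreach (Match m in Regex.Matches(sentence, @"\w+|[^\w\s]"))
            tokens.Add(m.Value);
        return tokens;
    }

    // Emits "token<TAB>tag" lines followed by a blank line that ends the sentence.
    // locationTokens stands in for labels that would come from annotated data.
    public static string ToCorpus(string sentence, ISet<string> locationTokens)
    {
        var sb = new StringBuilder();
        foreach (string token in Tokenize(sentence))
        {
            string tag = locationTokens.Contains(token) ? "LOCATION" : "NOR";
            sb.Append(token).Append('\t').Append(tag).Append('\n');
        }
        sb.Append('\n');
        return sb.ToString();
    }
}
```

For aspect extraction the tag set would be different (for example an aspect tag instead of LOCATION), but the conversion step has the same shape.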

Convert UTM to decimal degrees with Esri's ArcGis with C#

I need to convert a user's UTM input (WGS 1984) into Decimal Degrees, preferably using ESRI's ArcGis. I've already got the code to retrieve the zone (formatted like 14N, 22S, etc.) and the easting and northing factors. What do I do from here?
Edit: we expect the input as a string like: 14N 423113mE 4192417mN. I can easily extract the numbers (and a character) 14, N, 423113, and 4192417 from the string above. I just need to somehow translate that to Decimal Degrees.
There is no specific information about input data.
Here is some general info to start from:
The easiest way is to use the Geoprocessing engine to reproject the whole feature class. Use the C# class for the Project tool from the Data Management toolbox.
Another way is to use the Project method of IGeometry if you want to project only several features.
EDIT: for your input data use solution 2.
One more, even easier, way is to use Proj4Net, a .NET port of the open-source library Proj.4. For such a simple task it is much easier to use than the ArcObjects classes.
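If neither ArcObjects nor Proj4Net is available, the conversion can also be done directly. The sketch below is not the ArcGIS or Proj4Net API; it swaps in the standard inverse transverse Mercator series (as given by Snyder) for the WGS 84 ellipsoid. It assumes the letter in "14N" / "22S" means the hemisphere, as in the question, and not an MGRS latitude band. UtmConverter and its method names are made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UtmConverter
{
    const double A = 6378137.0;            // WGS 84 semi-major axis in metres
    const double F = 1.0 / 298.257223563;  // WGS 84 flattening
    const double K0 = 0.9996;              // UTM scale factor on the central meridian

    // Parses strings such as "14N 423113mE 4192417mN" (N/S taken as hemisphere).
    public static (double Lat, double Lon) Parse(string input)
    {
        Match m = Regex.Match(input,
            @"^\s*(\d{1,2})([NS])\s+(\d+(?:\.\d+)?)mE\s+(\d+(?:\.\d+)?)mN\s*$");
        if (!m.Success) throw new FormatException("Unexpected UTM string: " + input);
        return ToLatLon(
            int.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture),
            m.Groups[2].Value == "N",
            double.Parse(m.Groups[3].Value, CultureInfo.InvariantCulture),
            double.Parse(m.Groups[4].Value, CultureInfo.InvariantCulture));
    }

    // Returns latitude and longitude in decimal degrees.
    public static (double Lat, double Lon) ToLatLon(int zone, bool north, double easting, double northing)
    {
        double e2 = F * (2 - F);           // first eccentricity squared
        double ep2 = e2 / (1 - e2);        // second eccentricity squared
        double x = easting - 500000.0;     // remove false easting
        double y = north ? northing : northing - 10000000.0; // remove false northing

        // Footpoint latitude from the meridian arc length.
        double mu = y / K0 / (A * (1 - e2 / 4 - 3 * e2 * e2 / 64 - 5 * e2 * e2 * e2 / 256));
        double e1 = (1 - Math.Sqrt(1 - e2)) / (1 + Math.Sqrt(1 - e2));
        double phi1 = mu
            + (3 * e1 / 2 - 27 * Math.Pow(e1, 3) / 32) * Math.Sin(2 * mu)
            + (21 * e1 * e1 / 16 - 55 * Math.Pow(e1, 4) / 32) * Math.Sin(4 * mu)
            + (151 * Math.Pow(e1, 3) / 96) * Math.Sin(6 * mu)
            + (1097 * Math.Pow(e1, 4) / 512) * Math.Sin(8 * mu);

        double sin = Math.Sin(phi1), cos = Math.Cos(phi1), tan = Math.Tan(phi1);
        double n1 = A / Math.Sqrt(1 - e2 * sin * sin);
        double r1 = A * (1 - e2) / Math.Pow(1 - e2 * sin * sin, 1.5);
        double t1 = tan * tan;
        double c1 = ep2 * cos * cos;
        double d = x / (n1 * K0);

        double lat = phi1 - (n1 * tan / r1) * (d * d / 2
            - (5 + 3 * t1 + 10 * c1 - 4 * c1 * c1 - 9 * ep2) * Math.Pow(d, 4) / 24
            + (61 + 90 * t1 + 298 * c1 + 45 * t1 * t1 - 252 * ep2 - 3 * c1 * c1) * Math.Pow(d, 6) / 720);
        double dLon = (d
            - (1 + 2 * t1 + c1) * Math.Pow(d, 3) / 6
            + (5 - 2 * c1 + 28 * t1 - 3 * c1 * c1 + 8 * ep2 + 24 * t1 * t1) * Math.Pow(d, 5) / 120) / cos;

        double centralMeridian = (zone - 1) * 6 - 180 + 3; // degrees
        return (lat * 180 / Math.PI, centralMeridian + dLon * 180 / Math.PI);
    }
}
```

For the string in the question, UtmConverter.Parse("14N 423113mE 4192417mN") lands a little west of the zone 14 central meridian (99°W) at a latitude just under 38°N. Compare the result against ArcGIS or Proj4Net output before relying on it.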

Compress a short but repeating string

I'm working on a web app that needs to take a list of files on a query string (specifically a GET and not a POST), something like:
http://site.com/app?things=/stuff/things/item123,/stuff/things/item456,/stuff/things/item789
I want to shorten that string:
http://site.com/app?things=somekindofencoding
The string isn't terribly long, varies from 20-150 chars. Something that short isn't really suitable for GZip, but it does have an awful lot of repetition so compression should be possible.
I don't want a DB or Dictionary of strings - the URL will be built by a different application to the one that consumes it. I want a reversible compression that shortens this URL. It doesn't need to be secure.
Is there an existing way to do this? I'm working in C#/.Net but would be happy to adapt an algorithm from some other language/stack.
If you can express the data in BNF, you could construct a parser for the data. Instead of sending the data you could send the AST, where each node is identified by one character (or several if you have a lot of different nodes). In your example
we could have
files : file files
      |
file  : path id
path  : /items/thing
      | /files/item
      | /stuff/things/item
You could then represent a list of files as path[id1,id2,...,idn], using 0, 1 and 2 for the paths. With the input being:
/stuff/things/item123,/stuff/things/item456,/stuff/things/item789
/files/item1,/files/item46,/files/item7
you'd then end up with ?things=2[123,456,789]1[1,46,7]
where /stuff/things/item is represented with 2 and /files/item is represented with 1. Each number within [...] is an id, so 2[123] would expand to /stuff/things/item123.
EDIT: The approach does not have to be static. If you have to discover the repeated items dynamically, you can use the same approach and pass the map between identifier and token. In that case the above example would be
?things=2[123,456,789]1[1,46,7]&tokens=2=/stuff/things/item,1=/files/item
which, if the grammar is this simple, would of course do better with
?things=/stuff/things/item[123,456,789]/files/item[1,46,7]
Compressing the repeated part to less than the unique value with such a short string is possible, but it will most likely have to be based on constraining the possible values, or it risks actually increasing the size when "compressing".
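A minimal sketch of the static variant, assuming both applications share the same fixed prefix table and that ids never contain ',', '[' or ']'. PathListCodec and its table are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PathListCodec
{
    // Hypothetical table known to both the producing and the consuming app.
    static readonly string[] Prefixes = { "/items/thing", "/files/item", "/stuff/things/item" };

    // Turns a list of paths into e.g. "2[123,456,789]1[1,46,7]".
    public static string Encode(IEnumerable<string> paths)
    {
        var sb = new StringBuilder();
        int current = -1;
        foreach (string path in paths)
        {
            int index = Array.FindIndex(Prefixes, p => path.StartsWith(p, StringComparison.Ordinal));
            if (index < 0) throw new ArgumentException("No known prefix for " + path);
            string id = path.Substring(Prefixes[index].Length);
            if (index == current)
            {
                sb.Append(',').Append(id);   // same prefix: extend the open group
            }
            else
            {
                if (current >= 0) sb.Append(']');
                sb.Append(index).Append('[').Append(id);
                current = index;
            }
        }
        if (current >= 0) sb.Append(']');
        return sb.ToString();
    }

    // Expands the encoded form back into the full paths.
    public static List<string> Decode(string encoded)
    {
        var result = new List<string>();
        int pos = 0;
        while (pos < encoded.Length)
        {
            int open = encoded.IndexOf('[', pos);
            int close = encoded.IndexOf(']', open);
            int index = int.Parse(encoded.Substring(pos, open - pos));
            foreach (string id in encoded.Substring(open + 1, close - open - 1).Split(','))
                result.Add(Prefixes[index] + id);
            pos = close + 1;
        }
        return result;
    }
}
```

Encoding the six example paths yields 2[123,456,789]1[1,46,7]. The square brackets may need percent-encoding in a real query string, which eats into the saving, so different delimiters could be a better choice.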
You can try zlib using raw deflate (no zlib or gzip headers and trailers). It will generally provide some compression even on short strings that are composed of printable characters, and it does look for and take advantage of repeated strings. I haven't tried it, but you could also see if smaz works for your data.
I would recommend obtaining a large set of real-life example URLs to use for benchmark testing of possible compression approaches.
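In .NET, System.IO.Compression.DeflateStream writes raw deflate data, so a benchmark can start from a sketch like the one below. The UrlDeflate name and the URL-safe Base64 step are assumptions added for illustration; the binary output has to be made URL-safe somehow, and Base64 adds roughly a third back on top of the compressed size.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class UrlDeflate
{
    // Raw deflate, then URL-safe Base64 without padding.
    public static string Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using var output = new MemoryStream();
        using (var deflate = new DeflateStream(output, CompressionLevel.Optimal, leaveOpen: true))
        {
            deflate.Write(raw, 0, raw.Length);
        } // disposing the DeflateStream flushes the final block
        return Convert.ToBase64String(output.ToArray())
            .TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }

    // Reverses Compress: restore Base64 padding, then inflate.
    public static string Decompress(string token)
    {
        string b64 = token.Replace('-', '+').Replace('_', '/');
        b64 = b64.PadRight(b64.Length + (4 - b64.Length % 4) % 4, '=');
        using var input = new MemoryStream(Convert.FromBase64String(b64));
        using var deflate = new DeflateStream(input, CompressionMode.Decompress);
        using var reader = new StreamReader(deflate, Encoding.UTF8);
        return reader.ReadToEnd();
    }
}
```

Whether the result is actually shorter than the original for 20-150 character inputs is exactly what the benchmark on real URLs should show.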

Best Way to Load a File, Manipulate the Data, and Write a New File

I have an issue where I need to load a fixed-length file, process some of the fields, generate a few others, and finally output a new file. The difficult part is that the file is of part numbers and some of the products are superceded by other products (which can also be superceded). What I need to do is follow the superceded trail to get the information I need to replace some of the fields in the row I am looking at. So how can I best handle about 200000 lines from a file and the need to move up and down within the given products? I thought about using a collection or a dataset to hold the data, but I just don't think this is the right way. Here is an example of what I am trying to do:
Before
Part Number   List Price   Description       Superceding Part Number
0913982                                      3852943
3852943       0006710      CARRIER,BEARING
After
Part Number   List Price   Description       Superceding Part Number
0913982       0006710      CARRIER,BEARING   3852943
3852943       0006710      CARRIER,BEARING
As usual any help would be appreciated, thanks.
Wade
Create a structure for the given fields.
Read the file and put the structures in a collection. You may use the part number as the key of a hashtable to get the fastest lookups.
Scan the collection and fix the data.
200,000 objects built from the given lines will fit easily in memory.
For example, if your structure size is 50 bytes, you will need only about 10 MB of memory. That is nothing for a modern PC.
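The three steps above can be sketched as follows, assuming the fixed-length lines have already been split into fields (the column widths are not given in the question). Part and PartResolver are made-up names, and the HashSet guards against a supersession chain that loops back on itself.

```csharp
using System;
using System.Collections.Generic;

public sealed class Part
{
    public string PartNumber = "";
    public string ListPrice = "";
    public string Description = "";
    public string SupersedingPartNumber = "";
}

public static class PartResolver
{
    // Copies price and description from the last part in each supersession chain.
    public static void FillFromSuperseding(IEnumerable<Part> parts)
    {
        // Step 2: index the parts by part number for fast lookups.
        var byNumber = new Dictionary<string, Part>();
        foreach (Part p in parts) byNumber[p.PartNumber] = p;

        // Step 3: scan the collection and fix the data.
        foreach (Part p in byNumber.Values)
        {
            Part last = FollowChain(p, byNumber);
            if (!ReferenceEquals(last, p))
            {
                p.ListPrice = last.ListPrice;
                p.Description = last.Description;
            }
        }
    }

    // Walks the superseding links until the chain ends, a part is missing,
    // or a part number repeats (cycle).
    static Part FollowChain(Part start, Dictionary<string, Part> byNumber)
    {
        var seen = new HashSet<string> { start.PartNumber };
        Part current = start;
        while (current.SupersedingPartNumber.Length > 0
               && byNumber.TryGetValue(current.SupersedingPartNumber, out Part next)
               && seen.Add(next.PartNumber))
        {
            current = next;
        }
        return current;
    }
}
```

With the two rows from the question, part 0913982 ends up with list price 0006710 and description CARRIER,BEARING while keeping 3852943 as its superseding part number.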
