Facebook fixed the /likes endpoint in the Graph API. /likes now returns the complete list of users that liked a particular object in the graph (photos, albums, etc.). Before, it returned only 3 to 5 users.
My question is, how do you count the total number of "likes" without parsing the entire JSON and getting the element count? I'm only interested in the "likes" count; I'm not interested in the users who gave the likes.
It seems a little expensive to get the entire JSON dataset just to count.
E.g.: https://graph.facebook.com/161820597180936/likes
This photo has 1,000+ likes.
Seeing as the string is JSON, why not convert it into a standard .NET object and use .Count on the array that it creates? Then cache this information for 15 or more minutes (depending on how stale you can let your info get).
The string-counting method shown further down is quite heavy-handed, as you are essentially going to search a string an unknown number of times, return an index, compare it to an int, add to another index, and so on.
Use something like this instead:
public static T FromJson<T>(this string s)
{
    var ser = new System.Web.Script.Serialization.JavaScriptSerializer();
    return ser.Deserialize<T>(s);
}
This is an extension method that takes a properly formatted JSON string and converts it to an object of type T, e.g.:
var result = // call Facebook here and get your response string
List<FacebookLikes> likes = result.FromJson<List<FacebookLikes>>();
Response.Write(likes.Count.ToString());
// now cache the likes somewhere, and get from cache next time.
I'm not sure about the performance of this and haven't done any testing, but to me it looks a lot tidier and more readable. And seeing as you are caching the data, I'd go with the readable version over the previous method.
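One caveat worth noting: the Graph API wraps the list of likers in a "data" array, so deserializing straight into a List<T> may not line up with the actual response. A minimal sketch, assuming the usual { "data": [...] } shape; FacebookLikes and FacebookLikesResponse are illustrative type names, not Facebook types:
using System.Collections.Generic;

// Sketch only: matches the JSON property names so JavaScriptSerializer can bind them.
public class FacebookLikes
{
    public string id { get; set; }
    public string name { get; set; }
}

public class FacebookLikesResponse
{
    public List<FacebookLikes> data { get; set; }
}

// usage:
// var response = likesJson.FromJson<FacebookLikesResponse>();
// int likeCount = response.data.Count;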
Why is it expensive to parse the entire dataset? This should take milliseconds:
public static int CountLikes(string dataSet)
{
    int count = 0;
    int i = 0;
    // count occurrences of "id": in the JSON string
    while ((i = dataSet.IndexOf("\"id\":", i)) != -1)
    {
        i += 5; // move past the match ("id": is 5 characters)
        count++;
    }
    return count;
}
You can also append the parameter limit=# such as:
https://graph.facebook.com/161820597180936/likes?limit=1000
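If you go that route, fetching and counting is only a few lines. A sketch, assuming WebClient and either the CountLikes helper or the FromJson extension from the earlier answers:
using System.Net;

// Sketch only: fetch up to 1000 likes in one request and count them locally.
static int GetLikeCount()
{
    var url = "https://graph.facebook.com/161820597180936/likes?limit=1000";
    string json;
    using (var client = new WebClient())
    {
        json = client.DownloadString(url);
    }
    return CountLikes(json); // or deserialize and use .Count
}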
I am calling an API to get a list of contacts (there might be hundreds or thousands of them), and it only lists 100 at a time. It gives me a pagination option with an object at the end of the list called 'nextpage', containing the URL to the next 100, and so on.
So in my C# code I am getting the first 100, looping through them (to do something), looking for the 'nextpage' object, getting the URL and re-calling the API, etc. It looks like this next-page chain goes on depending on how many contacts we have.
Can you please let me know if there is a way for me to loop through the same code and still be able to use the new URL from the 'nextpage' object, running the logic for every 100 I get?
Pseudo-code, as we have no concrete examples to work with, but...
Most APIs with pagination will have a total count of items. You can set a max items per iteration and track it like that, or check for the null next_object, depending on how the API handles it.
List<ApiObject> GetObjects() {
    const int ITERATION_COUNT = 100;
    int objectsCount = GetAPICount();
    var apiObjects = new List<ApiObject>();
    for (int i = 0; i < objectsCount; i += ITERATION_COUNT) {
        // get the next 100: pass the current offset, request the max per call
        var batch = callToAPI(i, ITERATION_COUNT);
        apiObjects.AddRange(batch);
    } // this loop stops once you've reached objectsCount, so you should have all
    return apiObjects;
}
// alternatively:
List<ApiObject> GetObjects() {
    var apiObjects = new List<ApiObject>();
    // get the first batch (no paging token yet)
    var callResponse = callToAPI(null);
    apiObjects.AddRange(callResponse.objects);
    var nextObject = callResponse.nextObject;
    // and continue to loop until there's none left
    while (nextObject != null) {
        callResponse = callToAPI(nextObject); // pass the 'nextpage' token/URL back in
        apiObjects.AddRange(callResponse.objects);
        nextObject = callResponse.nextObject;
    }
    return apiObjects;
}
That's the basic idea anyway, per the two usual web service approaches (with lots of detail left out, as this is not working code but only meant to demonstrate the general approach).
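To make the second approach concrete for the 'nextpage' case in the question, here is a minimal sketch. It assumes Json.NET (JObject) for parsing, and the property names ("contacts", "nextpage") and starting URL are guesses taken from the question, not a real API:
using System.Net;
using Newtonsoft.Json.Linq; // assuming Json.NET is available

// Sketch only: keep following the 'nextpage' URL until the API stops returning one.
static void ProcessAllContacts()
{
    var url = "https://api.example.com/contacts"; // hypothetical starting URL
    using (var client = new WebClient())
    {
        while (!string.IsNullOrEmpty(url))
        {
            var page = JObject.Parse(client.DownloadString(url));
            foreach (var contact in page["contacts"])
            {
                // do something with each contact
            }
            url = (string)page["nextpage"]; // null or missing on the last page ends the loop
        }
    }
}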
I want to know how return values work for strings in C#. In one of my functions, I generate HTML and the string is really huge; I then return it from the function and insert it into the page. Should I pass the huge string back as a return value, or just insert it into the page from within the same function?
When C# returns a string, does it create a new string from the old one and return that?
Thanks.
Strings (or any other reference type) are not copied when returning from a function, only value types are.
System.String is a reference type (class) and so passing as parameter and returning only involve the copying of a reference (32 or 64 bits).
The size of the string is not relevant.
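A quick way to convince yourself of this is to return a string stored in a field and check reference equality. A small illustrative sketch (the class and field names are made up for the demo):
using System;

class StringReturnDemo
{
    static readonly string Cached = new string('x', 1000000); // a large string built once

    // Returning a string copies only the reference, never the characters.
    static string GetCached()
    {
        return Cached;
    }

    static void Main()
    {
        // Prints True: the caller receives the exact same string object.
        Console.WriteLine(object.ReferenceEquals(Cached, GetCached()));
    }
}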
Returning a string is a cheap operation - as mentioned it's purely a matter of returning 32 or 64 bits (4 or 8 bytes).
However, as Sten Petrov points out, string + operations involve the creation of a new string and can be a little expensive. If you want to save performance and memory, I'd suggest doing something like this:
static int i = 0;

static void Main(string[] args)
{
    while (Console.ReadLine() == "")
    {
        var pageSB = new StringBuilder();

        foreach (var section in new[] { AddHeader(), AddContent(), AddFooter() })
            for (int i = 0; i < section.Length; i++)
                pageSB.Append(section[i]);

        Console.Write(pageSB.ToString());
    }
}

static StringBuilder AddHeader()
{
    return new StringBuilder().Append("Hi ").AppendLine("World");
}

static StringBuilder AddContent()
{
    return new StringBuilder()
        .AppendFormat("This page has been viewed: {0} times\n", ++i);
}

static StringBuilder AddFooter()
{
    return new StringBuilder().Append("Bye ").AppendLine("World");
}
Here we use the StringBuilders to hold references to all the strings we want to concatenate, and wait until the very end before joining them together. This saves many unnecessary intermediate strings (which are memory- and CPU-heavy in comparison).
Of course, I doubt you'll actually see any need for this in practice - and if you do, I'd spend some time learning about pooling etc. to help reduce the garbage created by all the string builders - and maybe consider creating a custom 'string holder' that suits your purposes better.
Basically I use Entity Framework to query a huge database. I want to return a string list then log it to a text file.
List<string> logFilePathFileName = new List<string>();
var query = from c in DBContext.MyTable where condition == something select c;
foreach (var result in query)
{
    filePath = result.FilePath;
    fileName = result.FileName;
    string temp = filePath + "." + fileName;
    logFilePathFileName.Add(temp);
    if (logFilePathFileName.Count % 1000 == 0)
        Console.WriteLine(temp + "." + logFilePathFileName.Count);
}
However, I get an exception when logFilePathFileName.Count reaches 397,000.
The exception is:
Exception of type 'System.OutOfMemoryException' was thrown.
A first chance exception of type 'System.OutOfMemoryException'
occurred in System.Data.Entity.dll
UPDATE:
What I want is to use a different query, say select top 1000, and add those to the list, but I don't know what to do after the first 1000.
Most probably it's not about RAM as such, so increasing your RAM or even compiling and running your code on a 64-bit machine will not have a positive effect in this case.
I think it's related to the fact that a single .NET object (such as the array backing a collection) cannot exceed 2 GB, regardless of whether you run as 32-bit or 64-bit.
To resolve this, split your list into much smaller chunks and most probably your problem will be gone.
Just one possible solution:
foreach (var result in query)
{
    ....
    if (logFilePathFileName.Count % 1000 == 0)
    {
        Console.WriteLine(temp + "." + logFilePathFileName.Count);
        // WRITE SOMEWHERE YOU NEED
        logFilePathFileName = new List<string>(); // RESET LIST!
    }
}
EDIT
If you want to fragment a query, you can use Skip(...) and Take(...).
Just an explanatory example:
var first1000 = query.Skip(0).Take(1000);
var second1000 = query.Skip(1000).Take(1000);
...
and so on..
Naturally, put it in your iteration and parameterize it based on the bounds of the data you know or need.
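Putting that together with the file writing, here is a rough sketch. It reuses the query, FilePath and FileName names from the question; the file name and ordering column are assumptions, and LINQ to Entities requires an OrderBy before Skip/Take:
using System.IO;
using System.Linq;

// Sketch only: page through the query in chunks of 1000 and append to the log file,
// so the full result set is never held in memory at once.
const int pageSize = 1000;
int skip = 0;
using (var writer = new StreamWriter("log.txt", append: true))
{
    while (true)
    {
        // LINQ to Entities needs a stable ordering before Skip/Take.
        var page = query.OrderBy(c => c.FileName)
                        .Skip(skip)
                        .Take(pageSize)
                        .ToList();
        if (page.Count == 0) break;

        foreach (var result in page)
            writer.WriteLine(result.FilePath + "." + result.FileName);

        skip += pageSize;
    }
}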
Why are you collecting the data in a List<string> if all you need to do is write it to a text file?
You might as well just:
Open the text file;
Iterate over the records, appending each string to the text file (without storing the strings in memory);
Flush and close the text file.
You will need far less memory than now, because you won't be keeping all those strings unnecessarily in memory.
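In code, that streaming approach could look roughly like this - a sketch reusing the query and property names from the question, with a hypothetical file name:
using System.IO;

// Sketch only: write each record straight to the file instead of buffering a List<string>.
using (var writer = new StreamWriter("log.txt"))
{
    foreach (var result in query)
    {
        writer.WriteLine(result.FilePath + "." + result.FileName);
    }
} // Dispose flushes and closes the file.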
You probably need to set some vmargs for memory!
Also... look into writing it straight to your file and not holding it in a List
What Roy Dictus says sounds like the best way.
You can also try adding a limit to your query, so your database result won't be so large.
For more info, see: Limiting query size with Entity Framework.
You shouldn't read all the records from the database into a list; that requires a lot of memory. You can combine reading records and writing them to the file: for example, read 1,000 records from the database into a list, save (append) them to the text file, clear the used memory (list.Clear()) and continue with the next records.
From several other topics on Stack Overflow I read that Entity Framework is not designed to handle bulk data like that. EF will cache/track all data in the context, which causes the exception for huge bulks of data. The options are to use SQL directly or to split your records into smaller sets.
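If the tracking overhead is the issue, one mitigation (a sketch, assuming EF 5/6 where AsNoTracking is available, and reusing DBContext.MyTable from the question) is to read the rows without change tracking and stream them out:
using System.Data.Entity; // EF 6: AsNoTracking extension lives here

// Sketch only: AsNoTracking stops the context from caching/tracking every entity it
// materializes, which reduces memory pressure when iterating a large result set.
var untrackedQuery = DBContext.MyTable.AsNoTracking(); // apply the original where-condition here

foreach (var result in untrackedQuery)
{
    // stream each FilePath/FileName pair straight to the log file instead of a List<string>
}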
I used to use the gc ArrayList in Visual C++, similar to the gc List that you used. It works fine with small and intermediate data sets, but when using big data the same 'System.OutOfMemoryException' was thrown.
As the size of these collections cannot exceed 2 GB, they become inefficient with big data, so I built my own linked list, which gives the same functionality: dynamic growth and access by index. Basically, it is a normal linked-list class with a dynamic array inside to provide access by index. It duplicates the space, but you may delete the linked list after updating the array if you do not need it, keeping only the dynamic array; this would solve the problem. See the code:
struct LinkedNode
{
    long data;
    LinkedNode* next;
};

class LinkedList
{
public:
    LinkedList();
    ~LinkedList();
    LinkedNode* head;
    long Count;
    long* Data;
    void add(long data);
    void update();
    //long get(long index);
};

LinkedList::LinkedList()
{
    this->Count = 0;
    this->head = NULL;
    this->Data = NULL; // must be initialized so the destructor's check is valid
}

LinkedList::~LinkedList()
{
    LinkedNode* temp;
    while (head)
    {
        temp = this->head;
        head = head->next;
        delete temp;
    }
    if (Data)
    {
        delete[] Data;
        Data = NULL;
    }
}

void LinkedList::add(long data)
{
    LinkedNode* node = new LinkedNode();
    node->data = data;
    node->next = this->head;
    this->head = node;
    this->Count++;
}

void LinkedList::update()
{
    // copy the linked list into a flat array so elements can be accessed by index
    this->Data = new long[this->Count];
    long i = 0;
    LinkedNode* node = this->head;
    while (node)
    {
        this->Data[i] = node->data;
        node = node->next;
        i++;
    }
}
If you use this, please refer to my work https://www.liebertpub.com/doi/10.1089/big.2018.0064
I have a CSV file with 30,000 lines. I have to select many values based on many conditions, so instead of many loops and "if"s I decided to use LINQ. I have written a class to read the CSV; it implements IEnumerable so it can be used with LINQ. This is my enumerator:
class CSVEnumerator : IEnumerator
{
    private CSVReader _csv;
    private int _index;

    public CSVEnumerator(CSVReader csv)
    {
        _csv = csv;
        _index = -1;
    }

    public void Reset() { _index = -1; }

    public object Current
    {
        get
        {
            return new CSVRow(_index, _csv);
        }
    }

    public bool MoveNext()
    {
        return ++_index < _csv.TotalRows;
    }
}
It's working, but it's slow. Let's say I want to select the max value in column A in the row range 100–150:
max = (from CSVRow r in csv where r.ID > 100 && r.ID < 150 select r).Max(y => y["A"]);
This will work, but LINQ searches for the max value across all 30,000 rows instead of 48.
As I said, I could use a loop, but only in this example case; the real conditions are "brutal" :)
Is there any way to override the LINQ collection search? Something like: look at the query used on my enumerator and, if any LINQ condition in the "where" contains a "row ID filter", return different data based on that.
I don't want to copy part of the data to another array/collection, and the problem is not in my CSV reader. Accessing any single row by ID is fast; the only problem is accessing all 30,000 of them.
Any help appreciated :-)
If you wanted to be able to use LINQ for this efficiently, you would need to use expression trees, in a similar (but much simpler) way to what various LINQ providers for SQL databases do. While doable, I think it would be quite a lot of code for such a simple task.
Because of that, I think a better solution would be to use a separate method to select the rows you want (and then possibly use LINQ to work with the result).
Also, many operations that return collections (including your original code and my modification) can be simplified by using iterator methods.
So, your code could look something like this:
public static IEnumerable<CSVRow> GetRows(
    this CSVReader reader, int idGreaterThan, int idLessThan)
{
    for (int i = idGreaterThan + 1; i < idLessThan; i++)
    {
        yield return new CSVRow(i, reader);
    }
}
Here, it's an extension method for CSVReader, but another solution (e.g. actual method on that class) might be more appropriate for you.
Your example would then look something like:
max = csvReader.GetRows(100, 150).Max(y => y["A"]);
(Also, I find it weird that when you have limits 100 and 150, you actually want rows between 101 and 149. But I'm assuming you have a reason for that, so I did the same.)
As far as LINQ is concerned, r.ID is simply a value that is being filtered and so all 30k lines are considered for use in the Max operation. If this is a row index, which seems to be the case here, you can use Skip and Take to avoid comparing all 30k rows.
max = csv.Skip(100).Take(50).Max(y => y["A"]);
#DougM is right about the order of evaluation, but in this case what I would do is take a one-time hit on initialization and generate lookups for any "index" fields: basically, pre-calculate a map (dictionary) of row index to row. That said, this would only be useful if you have many repeated queries for a given index field.
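A rough sketch of that idea, assuming the CSVReader/CSVRow types and the csv instance from the question:
using System.Collections.Generic;
using System.Linq;

// Sketch only: pay the cost of one full pass up front, then answer index-based
// queries from the dictionary instead of re-enumerating all 30,000 rows.
var rowsById = new Dictionary<int, CSVRow>();
for (int i = 0; i < csv.TotalRows; i++)
{
    rowsById[i] = new CSVRow(i, csv);
}

// later, e.g. max of column "A" for IDs 101..149:
var max = Enumerable.Range(101, 49)
                    .Select(id => rowsById[id])
                    .Max(r => r["A"]);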
I'm trying to figure out the best way to represent some data. It basically follows the form Manufacturer.Product.Attribute = Value. Something like:
Acme.*.MinimumPrice = 100
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
So the minimum price across all Acme products is 100 except in the case of products A and B. I want to store this data in C# and have some function where GetValue("Acme.ProductC.MinimumPrice") returns 100 but GetValue("Acme.ProductA.MinimumPrice") returns 50.
I'm not sure how to best represent the data. Is there a clean way to code this in C#?
Edit: I may not have been clear. This is configuration data that needs to be stored in a text file then parsed and stored in memory in some way so that it can be retrieved like the examples I gave.
Write the text file exactly like this:
Acme.*.MinimumPrice = 100
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
Parse it into a path/value pair sequence:
foreach (var pair in File.ReadAllLines(configFileName)
    .Select(l => l.Split('='))
    .Select(a => new { Path = a[0].Trim(), Value = a[1].Trim() })) // trim the spaces around '='
{
    // do something with each pair.Path and pair.Value
}
Now, there are two possible interpretations of what you want to do. The string Acme.*.MinimumPrice could mean that for any lookup where there is no specific override, such as Acme.Toadstool.MinimumPrice, we return 100 - even though there is nothing referring to Toadstool anywhere in the file. Or it could mean that it should only return 100 if there are other specific mentions of Toadstool in the file.
If it's the former, you could store the whole lot in a flat dictionary, and at look up time keep trying different variants of the key until you find something that matches.
If it's the latter, you need to build a data structure of all the names that actually occur in the path structure, to avoid returning values for ones that don't actually exist. This seems more reliable to me.
So going with the latter option, Acme.*.MinimumPrice is really saying "add this MinimumPrice value to any product that doesn't have its own specifically defined value". This means that you can basically process the pairs at parse time to eliminate all the asterisks, expanding it out into the equivalent of a completed version of the config file:
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
Acme.ProductC.MinimumPrice = 100
The nice thing about this is that you only need a flat dictionary as the final representation and you can just use TryGetValue or [] to look things up. The result may be a lot bigger, but it all depends how big your config file is.
You could store the information more minimally, but I'd go with something simple that works to start with, and give it a very simple API so that you can re-implement it later if it really turns out to be necessary. You may find (depending on the application) that making the look-up process more complicated is worse over all.
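As a rough sketch of that parse-time expansion (assuming the Manufacturer.Product.Attribute layout shown above and the configFileName variable from the earlier snippet; everything else is illustrative):
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Sketch only: expand "Manufacturer.*.Attribute" defaults into a flat dictionary
// keyed by "Manufacturer.Product.Attribute".
var pairs = File.ReadAllLines(configFileName)
    .Select(l => l.Split('='))
    .Select(a => new { Path = a[0].Trim().Split('.'), Value = a[1].Trim() })
    .ToList();

// products that actually occur in the file, per manufacturer
var products = pairs.Where(p => p.Path[1] != "*")
                    .GroupBy(p => p.Path[0])
                    .ToDictionary(g => g.Key,
                                  g => g.Select(p => p.Path[1]).Distinct().ToList());

var config = new Dictionary<string, string>();

// specific entries win, so add them first
foreach (var p in pairs.Where(p => p.Path[1] != "*"))
    config[string.Join(".", p.Path)] = p.Value;

// then fill in wildcard defaults for every known product that lacks the attribute
foreach (var p in pairs.Where(p => p.Path[1] == "*"))
    foreach (var product in products[p.Path[0]])
    {
        var key = p.Path[0] + "." + product + "." + p.Path[2];
        if (!config.ContainsKey(key))
            config[key] = p.Value;
    }

// config["Acme.ProductC.MinimumPrice"] is now "100"; config["Acme.ProductA.MinimumPrice"] is "50".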
I'm not entirely sure what you're asking, but it sounds like you're saying either:
I need a function that will return a fixed value, 100, for every product ID except for two cases: ProductA and ProductB
In that case you don't even need a data structure. A simple comparison function will do
int GetValue(string key) {
    if (key == "Acme.ProductA.MinimumPrice") { return 50; }
    else if (key == "Acme.ProductB.MinimumPrice") { return 60; }
    else { return 100; }
}
Or you could have been asking
I need a function that will return a value if already defined or 100 if it's not
In that case I would use a Dictionary<string,int>. For example
class DataBucket {
    private Dictionary<string,int> _priceMap = new Dictionary<string,int>();

    public DataBucket() {
        _priceMap["Acme.ProductA.MinimumPrice"] = 50;
        _priceMap["Acme.ProductB.MinimumPrice"] = 60;
    }

    public int GetValue(string key) {
        int price = 0;
        if (!_priceMap.TryGetValue(key, out price)) {
            price = 100;
        }
        return price;
    }
}
One way is to create a nested dictionary: Dictionary<string, Dictionary<string, Dictionary<string, object>>>. In your code you would split "Acme.ProductA.MinimumPrice" on the dots and get or set a value in the dictionary corresponding to the split segments; see the sketch below.
Another way is using LINQ to XML: you can create an XDocument with Acme as the root node, products as children of the root, and the attributes stored either as attributes on the products or as child nodes. I prefer the second solution, but it would be slower if you have thousands of products.
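A minimal sketch of the nested-dictionary idea (class and method names are illustrative, and the "*" fallback mirrors the wildcard semantics from the question):
using System.Collections.Generic;

// Sketch only: store values under manufacturer -> product -> attribute,
// splitting keys such as "Acme.ProductA.MinimumPrice" on dots.
class NestedConfig
{
    private readonly Dictionary<string, Dictionary<string, Dictionary<string, object>>> _store =
        new Dictionary<string, Dictionary<string, Dictionary<string, object>>>();

    public void SetValue(string path, object value)
    {
        var parts = path.Split('.'); // [manufacturer, product, attribute]
        Dictionary<string, Dictionary<string, object>> products;
        if (!_store.TryGetValue(parts[0], out products))
            _store[parts[0]] = products = new Dictionary<string, Dictionary<string, object>>();

        Dictionary<string, object> attributes;
        if (!products.TryGetValue(parts[1], out attributes))
            products[parts[1]] = attributes = new Dictionary<string, object>();

        attributes[parts[2]] = value;
    }

    public object GetValue(string path)
    {
        var parts = path.Split('.');
        Dictionary<string, Dictionary<string, object>> products;
        if (!_store.TryGetValue(parts[0], out products)) return null;

        Dictionary<string, object> attributes;
        object value;
        // specific product first, then fall back to the "*" wildcard entry
        if (products.TryGetValue(parts[1], out attributes) &&
            attributes.TryGetValue(parts[2], out value))
            return value;
        if (products.TryGetValue("*", out attributes) &&
            attributes.TryGetValue(parts[2], out value))
            return value;

        return null;
    }
}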
I would take an OOP approach to this. The way you explain it, all your products are represented by objects, which is good. This seems like a good use of polymorphism.
I would have all products derive from a ProductBase which has a virtual property with the default:
public virtual int MinimumPrice { get { return 100; } }
And then your specific products, such as ProductA, override that:
public override int MinimumPrice { get { return 50; } }
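Put together, a minimal sketch of that hierarchy (the class names are illustrative):
// Sketch only: the base class supplies the catch-all default; derived products override it.
public class ProductBase
{
    public virtual int MinimumPrice { get { return 100; } }
}

public class ProductA : ProductBase
{
    public override int MinimumPrice { get { return 50; } }
}

public class ProductB : ProductBase
{
    public override int MinimumPrice { get { return 60; } }
}

// usage:
// ProductBase a = new ProductA();        // a.MinimumPrice == 50
// ProductBase b = new ProductB();        // b.MinimumPrice == 60
// ProductBase other = new ProductBase(); // any product without an override -> 100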