My Web API, AAA, calls another API, BBB, to retrieve a large JSON array (~500-1000 KB, with each object around 10 KB). AAA needs to parse the array, apply some logic to it, and forward the result to API CCC.
As an optimization, I'd like AAA not to have to store the whole HTTP response containing the large JSON array, so that the array never ends up on the LOH (Large Object Heap).
Instead of waiting for the full JSON array to be downloaded, is it possible to parse the elements of the response as they arrive, so I can apply my logic to each one and forward the content to API CCC?
That way my Web API never holds the large JSON array in memory. Each object is small enough to be allocated in Gen 0 and collected quickly by the GC.
What I tried so far:
My API BBB looks like this (simplified):
[HttpGet("{id}")]
public IActionResult Get(int id)
{
    var text = System.IO.File.ReadAllText("C:\\Users\\John\\generated1000objects.json");
    var deserialized = JsonConvert.DeserializeObject<object[]>(text);
    return Ok(deserialized);
}
My code to query API BBB:
var httpClient = new HttpClient();
using (var request = new HttpRequestMessage(HttpMethod.Get, "https://localhost:44328/api/values/4"))
using (var response = await httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
using (Stream stream = await response.Content.ReadAsStreamAsync())
using (StreamReader sr = new StreamReader(stream))
using (JsonReader reader = new JsonTextReader(sr))
{
    reader.SupportMultipleContent = true;
    while (true)
    {
        if (!reader.Read())
        {
            break;
        }
        JsonSerializer serializer = new JsonSerializer();
        var deserialize = serializer.Deserialize<object>(reader);
        Console.WriteLine(deserialize); // HERE it prints the whole JSON Array. I was expecting to deal with one object of the array
        Console.WriteLine("#################");
    }
}
My constraints:
I can't modify API BBB, which sends the large JSON array.
My API CCC cannot call API BBB directly to retrieve the large JSON array.
I'm on .NET Core with ASP.NET Core 2.2.
Looking at your solution, unless you expect the response to grow substantially in size, I believe you are attempting a micro-optimization that will make your process more fragile than simply processing the response in the regular manner.
You mention a record size of 10k, a response size of 500-1000k. This translates to a total of 50-100 records.
I believe that you will run into more difficulty trying to parse the response in chunks than you will from any impact of having an object on the Large Object Heap. From what I can find in the documentation, the high-level built-in calls (such as JsonConvert.DeserializeObject) parse the whole document. Any chunking has to be managed by yourself, by driving a token-level reader such as JsonTextReader.
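Managing the chunking yourself is possible with Json.NET's token-level JsonTextReader. What follows is a minimal sketch, not a tested drop-in: it assumes the response body is a single top-level JSON array of objects, and the names JsonArrayStreaming and ReadArrayItems are hypothetical. The idea is to skip the StartArray token and deserialize only when the reader sits on a StartObject, so each call consumes exactly one element.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper (not part of the original post).
public static class JsonArrayStreaming
{
    // Lazily yields the elements of a top-level JSON array one at a time.
    // Assumption: every element of the array is a JSON object.
    public static IEnumerable<T> ReadArrayItems<T>(TextReader textReader)
    {
        var serializer = new JsonSerializer();
        // Disposing the JsonTextReader also closes textReader (CloseInput defaults to true).
        using (var reader = new JsonTextReader(textReader))
        {
            while (reader.Read())
            {
                // StartArray and EndArray are skipped; StartObject marks one element.
                if (reader.TokenType == JsonToken.StartObject)
                {
                    // Deserialize reads up to the matching EndObject and stops there.
                    yield return serializer.Deserialize<T>(reader);
                }
            }
        }
    }
}
```

With the stream obtained from response.Content.ReadAsStreamAsync(), this could be consumed as foreach (var item in JsonArrayStreaming.ReadArrayItems<JObject>(new StreamReader(stream))), so only one element is materialized at a time.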
Related
I want my WebAPI action to return large data in chunks from the database without loading it all in memory and use a built-in JSON MediaTypeFormatter to serialize my list as a JSON object:
public IHttpActionResult GetLogs(CancellationToken cancellationToken)
{
    var db = new DatabaseContext();
    var logs = db.Logs.Take(9999999).AsNoTracking();
    return Ok(logs);
}
Even though I am returning an IQueryable and not loading all the data to the memory, it seems like the JSON MediaTypeFormatter loads all the data into memory instead of streaming it straight to the response body in chunks, which causes large memory consumption. How can I stream large data from my action using MediaTypeFormatters without loading all the data in memory?
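One possible direction, sketched here rather than taken from the post, is to bypass the formatter's buffering and write the array to the output stream one element at a time with Json.NET's JsonTextWriter. JsonArrayWriter and WriteArray are hypothetical names. In Web API 2 such a method could presumably be called from a PushStreamContent callback with the response stream, but that wiring is an assumption and is not shown or tested here.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using Newtonsoft.Json;

// Hypothetical helper (not part of the original post).
public static class JsonArrayWriter
{
    // Serializes items as a JSON array, flushing after each element so that
    // only one element needs to be held in memory at a time.
    public static void WriteArray<T>(IEnumerable<T> items, Stream output)
    {
        var serializer = new JsonSerializer();
        // leaveOpen: true so the caller keeps ownership of the output stream.
        using (var textWriter = new StreamWriter(output, new UTF8Encoding(false), 4096, leaveOpen: true))
        using (var jsonWriter = new JsonTextWriter(textWriter))
        {
            jsonWriter.WriteStartArray();
            foreach (var item in items)
            {
                serializer.Serialize(jsonWriter, item);
                jsonWriter.Flush(); // push this element to the underlying stream
            }
            jsonWriter.WriteEndArray();
        }
    }
}
```

Because the items are pulled through foreach, a lazily evaluated query is enumerated as it is written instead of being materialized into a list first.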
I have created a basic ASP.NET web application and I am trying to use the OpenWeatherMap API with it (this is my first time dealing with APIs).
The information I have about the Web API is:
You can search weather forecast for 5 days with data every 3 hours by city name. All weather data can be obtained in JSON and XML formats.
There is a possibility to receive a central district of the city/town with its own parameters (geographic coordinates/id/name) in API response. Example
API call:
api.openweathermap.org/data/2.5/forecast?q={city name},{country code}
Parameters:
q city name and country code divided by comma, use ISO 3166 country codes
Examples of API calls:
api.openweathermap.org/data/2.5/forecast?q=London,us&mode=xml
Currently I have this working when I use the API call that returns a JSON object:
api.openweathermap.org/data/2.5/weather?q=London&units=metric
However, if I simply change the URL to the first one (which returns XML), my application no longer retrieves the data from the API.
I have tried changing mode=xml to mode=json, but to no avail.
How can I use the first web API?
Many thanks
--Edit:
In my model class I have the following method:
string url = "api.openweathermap.org/data/2.5/…;
var client = new WebClient();
var content = client.DownloadString(url);
var serializer = new JavaScriptSerializer();
var jsonContent = serializer.Deserialize<Object>(content);
return jsonContent;
(I have taken out the key.) I then call this method from my view. However, I cannot use the API call that has mode=xml at the end.
Your problem is that when the result is returned as XML, you are still using a JavaScriptSerializer to deserialize it.
XML is not JSON, so the deserialization fails.
What you need is an XmlSerializer to deserialize the result.
Below is some code to get you started:
// Requires: using System.IO; using System.Net; using System.Xml.Serialization;
string url = @"http://samples.openweathermap.org/data/2.5/forecast?q=London&appid=b1b15e88fa797225412429c1c50c122a1&mode=xml";
var client = new WebClient();
var content = client.DownloadString(url);
// weatherdata is the strongly typed class that models the XML response
XmlSerializer serializer = new XmlSerializer(typeof(weatherdata));
weatherdata result = null;
using (TextReader reader = new StringReader(content))
{
    result = (weatherdata)serializer.Deserialize(reader);
}
Notice the typeof(weatherdata): there is no point in deserializing into a non-strongly-typed object. If you deserialize into object, there is nothing you can do with the result.
If you don't want to hand-code the strongly typed model, copy the XML result to the clipboard and then use the Visual Studio menu (I'm not sure about other versions, but it is there in 2017, for example):
Edit -> Paste Special -> Paste XML As Classes to generate the strongly typed class for you.
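For a sense of what a hand-written model might look like, here is a small sketch. The class names and the XML shape (a weatherdata root containing a location element with name and country children) are assumptions made for illustration and cover only a fragment of a response; the classes generated by Paste XML As Classes from a real response will be more complete.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical, partial model: element names are assumptions about the XML shape.
[XmlRoot("weatherdata")]
public class WeatherData
{
    [XmlElement("location")]
    public Location Location { get; set; }
}

public class Location
{
    [XmlElement("name")]
    public string Name { get; set; }

    [XmlElement("country")]
    public string Country { get; set; }
}

public static class WeatherXml
{
    // Deserializes an XML string into the strongly typed model.
    // Elements that the model does not declare are ignored by XmlSerializer.
    public static WeatherData Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(WeatherData));
        using (var reader = new StringReader(xml))
        {
            return (WeatherData)serializer.Deserialize(reader);
        }
    }
}
```

With a typed result, the view can read values such as result.Location.Name directly instead of working with an untyped object.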
I am trying to export some data to Excel and store the Excel file in AWS S3. Our current architecture is as follows:
We get data from the database and manipulate it as per our needs. This is done in one API call.
We need to pass that data as a stream to another API (specifically designed to interact with AWS S3).
That API stores the stream as an Excel file in AWS S3.
What I have achieved so far:
I am able to get the data from the database and convert it to a memory stream. I have written another API to receive this stream, but I couldn't manage to pass the memory stream from one API to the other.
1st API :
public async Task<ICollection<UserDTO>> ExportUsers(Guid groupId, HttpRequestMessage request)
{
    var ms = // get's memory stream out of data received from database.
    var client = new HttpClient
    {
        BaseAddress = new Uri("http://localhost:58025/")
    };
    client.DefaultRequestHeaders.Accept.Clear();
    client.DefaultRequestHeaders.Accept.Add(
        new MediaTypeWithQualityHeaderValue("application/bson"));
    MediaTypeFormatter bsonFormatter = new BsonMediaTypeFormatter();
    //Not sure about BsonMediaTypeFormmater. Juz gave it a try
    var response = await client.PostAsync("http://localhost:58025/Resource/Memory", ms, bsonFormatter);
}
2nd API :
[HttpPost]
[Route("Resource/Memory", Name = "UploadMemory")]
public async Task<IHttpActionResult> UploadMemoryFile(MemoryStream memory)
{
    // Not reaching until here
}
Any help is highly appreciated!!
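One approach that may help, offered as a sketch rather than a verified fix for this setup: send the stream as the raw request body with StreamContent instead of running it through a media type formatter. StreamUpload and BuildUploadRequest are hypothetical names, and the application/octet-stream content type is an assumption.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper (not part of the original post).
public static class StreamUpload
{
    // Wraps a stream as the raw body of a POST request.
    public static HttpRequestMessage BuildUploadRequest(Stream payload, Uri target)
    {
        if (payload.CanSeek)
        {
            // A MemoryStream that was just written to sits at its end; rewind it.
            payload.Position = 0;
        }
        var content = new StreamContent(payload);
        // Assumed content type for opaque binary data.
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        return new HttpRequestMessage(HttpMethod.Post, target) { Content = content };
    }
}
```

The request would then be sent with client.SendAsync(request). On the receiving side, the action would likely need to drop the MemoryStream parameter and read the body itself, for example with await Request.Content.ReadAsStreamAsync(), since model binding has no formatter that produces a MemoryStream.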
Our client-side code works directly with Elasticsearch responses, but I want to put NEST in the middle to do some security and filtering. What is the easiest way to build a query with NEST (or Elasticsearch.Net) and then pass the raw JSON response back out to my client with the least amount of processing? I'm using ServiceStack as well, by the way.
A previous similar question now has an outdated answer: Returning Raw Json in ElasticSearch NEST query
Thanks
This is for the benefit of readers who want to achieve the same thing in newer versions of NEST, v2.3 as of this writing. If you just want the response, all you need to do is this using the ElasticLowLevelClient, according to the doc:
var responseJson = client.Search<string>(...);
But if you want the typed results as well then it's slightly more involved. You need to call DisableDirectStreaming() on the settings object and then retrieve the raw json from response.ApiCall.ResponseBodyInBytes as demonstrated here.
var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
    .DefaultIndex("index1")
    .DisableDirectStreaming();
var response = new ElasticClient(settings)
    .Search<object>(s => s.AllIndices().AllTypes().MatchAll());
if (response.ApiCall.ResponseBodyInBytes != null)
{
    var responseJson = System.Text.Encoding.UTF8.GetString(response.ApiCall.ResponseBodyInBytes);
}
Elasticsearch.Net allows you to return the response stream directly:
var search = client.Search<Stream>(new { size = 10 });
.Search() has many overloads to limit its scope by index and type.
This will return an IElasticsearchResponse<Stream>, whose response stream you can pass directly to the deserializer of your choice (SS.Text in your case) without the client buffering in between.
I have a program that deserializes large objects from a web service. After a webservice call and a 200, the code looks like this.
JsonConvert.DeserializeObject<List<T>>(resp.Content.ReadAsStringAsync().Result).ToList()
Sometimes while running this process I will get an aggregate exception which shows an inner exception as out of memory. I can't determine if it is the process of reading in the string of JSON data (which is probably awfully large) or the Deserializing that is causing this issue. What I would like to do is break out the string and pull each JSON object back individually from the response and then deserialize it. I am just having trouble finding a way to only bring out one JSON object at a time from the response. Any suggestions are greatly appreciated!
HttpClient client = new HttpClient();
using (Stream s = client.GetStreamAsync("http://www.test.com/large.json").Result)
using (StreamReader sr = new StreamReader(s))
using (JsonReader reader = new JsonTextReader(sr))
{
    JsonSerializer serializer = new JsonSerializer();
    // read the json from a stream
    // json size doesn't matter because only a small piece is read at a time from the HTTP request
    Person p = serializer.Deserialize<Person>(reader);
}
https://learn.microsoft.com/en-us/xamarin/xamarin-forms/data-cloud/web-services/rest contains a warning:
Using the ReadAsStringAsync method to retrieve a large response can
have a negative performance impact. In such circumstances the response
should be directly deserialized to avoid having to fully buffer it.