I'm a junior software developer trying to learn more about web development, and right now I'm wondering about the following.
I'm sending a file through multiple SOAP web requests, in chunks of 100,000 bytes. The first chunk's request has a "start" operation, the following ones have "next", and the last chunk is sent with an "end" operation in the request's envelope. In the image below you can see how the envelope looks.
I cannot check the uploaded file's content myself; I can only see the uploaded file's SIZE.
The uploaded file's size is always about a quarter of the actual file size that I sent.
For example, if I send a 300,000-byte file, the uploaded file's size will be approximately 75,000 bytes. A concrete example can be seen below.
I'm not sure if I actually understand how this works or even if the file is uploaded correctly.
If someone could explain this to me I would greatly appreciate it. :D
Thank you a lot for your time!
PS: I tweaked the chunkLength parameter (from the envelope) and the actual buffer size (trying 1,000 bytes per request or even 100,000); the result is the same. :)
PS2: The data is a random string that I first gzip-compress and then send through the requests mentioned above.
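For illustration, here is a minimal Python sketch of the flow described above (the send_chunk helper is hypothetical and stands in for the real SOAP call; the envelope itself is not shown). Note that what gets split into chunks is the gzip-compressed payload, not the original string:

    import gzip
    import random
    import string

    CHUNK_SIZE = 100_000  # bytes per request, as in the question

    def send_chunk(operation, chunk):
        # Hypothetical stand-in for the real SOAP call ("start" / "next" / "end").
        print(operation, len(chunk), "bytes")

    # A random (but compressible) string, gzip-compressed before sending.
    original = "".join(random.choices(string.ascii_letters, k=300_000)).encode()
    payload = gzip.compress(original)
    print("original:", len(original), "bytes, after gzip:", len(payload), "bytes")

    # What goes over the wire is the *compressed* payload, split into chunks.
    # (A single-chunk upload would need special handling for start/end.)
    chunks = [payload[i:i + CHUNK_SIZE] for i in range(0, len(payload), CHUNK_SIZE)]
    for i, chunk in enumerate(chunks):
        if i == len(chunks) - 1:
            op = "end"
        elif i == 0:
            op = "start"
        else:
            op = "next"
        send_chunk(op, chunk)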
Related
My Web API is already published.
One of our application's specifications is uploading an image (up to 3 images can be uploaded). On the front end we already compress each image from (at least) 3 MB down to 500 KB. There is no issue with that, and posting or getting the data only takes 1.2 - 1.3 seconds.
From the frontend we are using Base64.
But our users have requested an enhancement: they need to upload images at their original size (without size and resolution compression), where a single image is at least 3 MB.
We have tried this with the existing Web API and a major issue occurs: our application cannot post the data to the backend.
My assumption is that uploading a large image takes a long time and then the connection times out.
If my assumption is right, please help me: how can I increase the timeout when large images are uploaded?
I have tried using form-data, but it still takes a long time to upload the large images.
This is my JSON. Sorry, I can only upload an image because of Stack Overflow's character limit on posts.
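As a rough illustration only (Python with the requests library; the URL and field name are made up), this is the kind of client-side call that posts the raw image as multipart form-data with a generous read timeout, instead of embedding it as Base64 in JSON. Whether the backend accepts a 3 MB body still depends on the Web API's own request-size and timeout settings.

    import requests  # third-party: pip install requests

    UPLOAD_URL = "https://example.com/api/images"  # hypothetical endpoint

    def upload_image(path):
        # Multipart form-data avoids the ~33% size overhead of Base64-in-JSON.
        with open(path, "rb") as f:
            files = {"image": (path, f, "image/jpeg")}
            # (connect timeout, read timeout) in seconds; a generous read timeout
            # keeps the client from cutting off a slow 3 MB upload.
            return requests.post(UPLOAD_URL, files=files, timeout=(10, 300))

    response = upload_image("photo.jpg")
    print(response.status_code)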
I am new to web service development. I am working on a web service that uploads an image to a server and will be called from an Android application.
My problem is that when the user selects an image on the Android device and clicks the upload button, a string is generated using the Base64 class, and its length is more than 3000 characters.
So I need to reduce the size of the Base64 string generated for the image.
Base-64 is, by definition, a fixed-size representation; n bytes of binary will always be m characters of base-64 (roughly m = 4n/3, rounded up to a multiple of 4). The most you can do is remove the final chunk's padding, which will save you a whopping 3 characters maximum.
If you want your payload to be shorter, you could:
use a higher base (although you'll need to write the encode/decode manually, and you'll still only be able to get to about base-90 or something, so you won't drastically reduce the size)
use a pure binary post body rather than base-anything (this is effectively base-256)
have less data
by having smaller images
or by using compression (note: many image formats are already compressed, so this often won't help)
use multiple requests
Generally speaking, to reduce data size without losing any information you'll need lossless compression. A good starting point there might be Android's built-in zip classes, but you'll still need encoding across the wire.
If it's a captured image, however, changing the parameters of the JPEG compression (or the original resolution) will prove far more useful, as the extra compression you'll get on JPEG-like data that is then base-64'd is likely to be very low.
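As a quick illustration of the fixed ratio, a small sketch using only Python's standard library (the 30,000-byte buffer is arbitrary):

    import base64
    import gzip
    import os

    raw = os.urandom(30_000)  # stand-in for already-compressed (JPEG-like) data
    encoded = base64.b64encode(raw)

    # Base-64 is a fixed ratio: 4 output characters for every 3 input bytes.
    print(len(raw), len(encoded))                      # 30000 40000
    print(len(encoded) == 4 * ((len(raw) + 2) // 3))   # True

    # Compressing data that is already high-entropy barely helps, which is why
    # zipping a JPEG before base-64 encoding it usually gains very little.
    print(len(gzip.compress(raw)))                     # about the same as len(raw)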
You need to reduce your actual image size first to be able to reduce your base64 size.
I would like to POST the data from a large file downloaded from a URL. Currently I am doing this by first saving the file locally and then posting that file's data to another API.
Is it possible to read from the file download request, and post to the second request without first having to save the file to disk?
Saving to disk first is not necessary. All you are doing is sending a byte array, but depending on the size of the file you may run into memory issues, in which case saving to disk first may be more manageable.
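For example, in Python with the requests library (a sketch; both URLs are placeholders), the download can be piped straight into the upload without touching the disk:

    import requests  # third-party: pip install requests

    SOURCE_URL = "https://example.com/big-file.bin"  # placeholder
    TARGET_URL = "https://example.com/api/upload"    # placeholder

    # stream=True keeps the body as an open stream instead of buffering it all.
    with requests.get(SOURCE_URL, stream=True) as download:
        download.raise_for_status()
        # Passing an iterator as `data` makes requests send a chunked upload,
        # piping each downloaded chunk straight into the POST body.
        upload = requests.post(
            TARGET_URL,
            data=download.iter_content(chunk_size=64 * 1024),
            headers={"Content-Type": "application/octet-stream"},
        )
        upload.raise_for_status()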
I plan to send and receive files with a microcontroller. I wrote a simple protocol for both sender and receiver, but I am having trouble reconstructing the file afterwards. I send the data as a stream of raw binary. However, I have not found the location of the file info (name, extension, size, etc.) in the file itself. Where is the file info stored in the file? How does the OS know all this information (e.g. name, extension, size) if it isn't stored in the file?
A trivial question: should I attach this file information in the protocol header, or should I just append it to the file's binary data?
You need to attach that information to your binary data yourself. If you have a binary stream, I suggest (it's easiest) that you provide a fixed-size header containing all the file's meta information, and then append the file's content.
Why fixed size? Otherwise the receiver doesn't know where the file's content starts. You could also put the header size in the first X bytes of the stream and then have a variable-sized header. It's up to you, but I prefer the fixed-size solution.
Example for fixed size header:
<255 bytes file name><8 bytes file size><Content...>
Example for dynamically sized header:
<4 bytes length of file name><x bytes file name><8 bytes file size><Content...>
Let me stress that it is very important that you also transmit the size of the content in bytes, so that the receiver knows how many bytes to read! Packets may be fragmented, you know?
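Here is a small Python sketch of the fixed-size layout from the example above (the 255-byte name field and 8-byte size are the sizes given there; the name is padded with zero bytes):

    import struct

    # <255 bytes file name><8 bytes file size><Content...>
    HEADER_FORMAT = "<255sQ"                       # name field + unsigned 64-bit size
    HEADER_SIZE = struct.calcsize(HEADER_FORMAT)   # 263 bytes

    def build_message(name, content):
        header = struct.pack(HEADER_FORMAT, name.encode("utf-8"), len(content))
        return header + content

    def parse_message(data):
        raw_name, size = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
        name = raw_name.rstrip(b"\x00").decode("utf-8")
        return name, data[HEADER_SIZE:HEADER_SIZE + size]

    message = build_message("photo.jpg", b"\x89 raw binary content")
    print(parse_message(message))  # ('photo.jpg', b'\x89 raw binary content')

On the receiving side you would read exactly HEADER_SIZE bytes first, then keep reading until you have the announced number of content bytes.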
How does your self-made "protocol" work?
It is quite uncommon for files to store their own size, it is a responsibility of the underlying file system to keep track of that (name including extension, size, permissions, modification time, ...).
You can put the size information in the header, or if you are sure that a certain sequence of bytes is never sent as payload, you can use this as a termination sequence to tell the receiver to stop receiving.
Is it possible to read the contents of a .ZIP file without fully downloading it?
I'm building a crawler and I'd rather not have to download every zip file just to index their contents.
Thanks!
The tricky part is identifying the start of the central directory, which occurs at the end of the file. Since each entry is the same fixed size, you can do a kind of binary search starting from the end of the file. The binary search is trying to guess how many entries are in the central directory. Start with some reasonable value, N, and retrieve the portion of the file at end-(N*sizeof(DirectoryEntry)). If that file position does not start with the central directory entry signature, then N is too large: halve it and repeat. Otherwise, N is too small: double it and repeat. Like binary search, the process maintains the current upper and lower bounds, and when the two become equal, you've found the value of N, the number of entries.
The number of times you hit the web server is at most 16, since there can be no more than 64K entries.
Whether this is more efficient than downloading the whole file depends on the file size. You might request the size of the resource before downloading, and if it's smaller than a given threshold, download the entire resource. For large resources, requesting multiple offsets will be quicker, and overall less taxing for the webserver, if the threshold is set high.
HTTP/1.1 allows ranges of a resource to be downloaded. For HTTP/1.0 you have no choice but to download the whole file.
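As a rough sketch of the Range-request idea in Python (the URL is a placeholder; instead of the binary search described above, it simply fetches a generous tail window and lets the standard zipfile module parse the central directory from it, assuming the directory fits in that window):

    import io
    import zipfile
    import requests  # third-party: pip install requests

    ZIP_URL = "https://example.com/archive.zip"  # placeholder

    # Find out how big the resource is (the server must support Range requests).
    total = int(requests.head(ZIP_URL).headers["Content-Length"])

    # Fetch only the tail of the file; the central directory sits at the end of
    # a ZIP, so as long as it fits in this window the listing can be parsed.
    tail = min(total, 1024 * 1024)
    resp = requests.get(ZIP_URL, headers={"Range": f"bytes={total - tail}-{total - 1}"})
    resp.raise_for_status()  # a 206 response means the range was honoured

    with zipfile.ZipFile(io.BytesIO(resp.content)) as zf:
        for info in zf.infolist():
            print(info.filename, info.file_size)
        # Reading member *contents* would need further Range requests, since the
        # local file headers live outside the downloaded tail.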
The format suggests that the key piece of information about what's in the file resides at the end of it. Entries are then specified as an offset from that particular entry, so you'll need access to the whole thing, I believe.
GZip formats can be read as a stream, I believe.
I don't know if this helps, as I'm not a programmer, but in Outlook you can preview zip files and see the actual content, not just the file directory (if they are previewable documents like a PDF).
There is a solution implemented in ArchView
"ArchView can open archive file online without downloading the whole archive."
https://addons.mozilla.org/en-US/firefox/addon/5028/
Inside archview-0.7.1.xpi, in the file "archview.js", you can look at their JavaScript approach.
It's possible. All you need is a server that allows reading bytes in ranges: fetch the end record (to know the size of the central directory), fetch the central directory (to know where each file starts and ends), and then fetch the proper bytes and handle them.
Here is an implementation in Python: onlinezip
[full disclosure: I'm author of library]