I'm receiving a .zip file from a server.
The .zip file is sent Base64 encoded and it contains an XML file.
After I decode the data to binary using Convert.FromBase64String, can I convert the byte array to XML?
I don't want to deal with unzipping.
I tried the following code, but it resulted in gibberish that doesn't make any sense and doesn't look like XML at all:
XmlDocument doc = new XmlDocument();
string xml = Encoding.UTF8.GetString(buffer);
doc.LoadXml(xml);
Any ideas?
You say you don't want to unzip, but do you actually mean that you don't want to unzip to disc? Most zip libraries either allow you to unzip a file to a byte array directly or to a stream where you could pass it a MemoryStream.
There's no getting around having to decompress. Unless you have control over the server side, in which case you could change the format to an uncompressed one (like a tar file); then you wouldn't have to decompress at all.
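For example, here's a minimal sketch of unzipping entirely in memory with System.IO.Compression's ZipArchive (available from .NET 4.5; SharpZipLib works similarly). Here base64Data stands for the Base64 string you received from the server, and the archive is assumed to contain a single XML entry:

using System;
using System.IO;
using System.IO.Compression;
using System.Xml;

// Decode the Base64 payload and unzip it entirely in memory.
byte[] buffer = Convert.FromBase64String(base64Data);

using (var zipStream = new MemoryStream(buffer))
using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
using (var entryStream = archive.Entries[0].Open()) // assumes a single XML entry
{
    var doc = new XmlDocument();
    doc.Load(entryStream); // now the document really is XML, not zip bytes
}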
You say:
I'm receiving a .zip file from a server.
And:
I don't want to deal with unzipping.
Well. You have to. If the data is in a zip archive, you need to extract it first. You can't just ignore the fact.
There are plenty of zip libraries; SharpZipLib is free and easy enough to use.
As the title says,
I want to encrypt 10 files into one file, and the file extension should be customizable. Once encrypted, it should be possible to automatically decrypt it back into the 10 original files. Does anyone have any ideas?
As with any other stream where you can't discern message boundaries, you need a delimiter of some sort. Prefixing data with its length is a common way to do this.
So make up a file format specification, for example:
Start the file with a uint32 which specifies the number of files in the archive
Then, per file:
Write a uint32 specifying the file name length in bytes in the encoding you want to use (I'd go for UTF-8)
Write the file name's bytes
Write a uint32 specifying the file data length
Write the file data
Then, when reading the file, read the uints and extract the following bytes, as sketched below.
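A rough sketch of that format with BinaryWriter and BinaryReader might look like this (method and variable names are just illustrative; encryption of the data bytes is left out):

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static void WriteArchive(string archivePath, IList<string> files)
{
    using (var writer = new BinaryWriter(File.Create(archivePath)))
    {
        writer.Write((uint)files.Count);                      // number of files
        foreach (string path in files)
        {
            byte[] nameBytes = Encoding.UTF8.GetBytes(Path.GetFileName(path));
            byte[] data = File.ReadAllBytes(path);

            writer.Write((uint)nameBytes.Length);             // file name length
            writer.Write(nameBytes);                          // file name (UTF-8)
            writer.Write((uint)data.Length);                  // file data length
            writer.Write(data);                               // file data
        }
    }
}

static void ReadArchive(string archivePath, string outputDir)
{
    using (var reader = new BinaryReader(File.OpenRead(archivePath)))
    {
        uint fileCount = reader.ReadUInt32();
        for (uint i = 0; i < fileCount; i++)
        {
            uint nameLength = reader.ReadUInt32();
            string name = Encoding.UTF8.GetString(reader.ReadBytes((int)nameLength));
            uint dataLength = reader.ReadUInt32();
            byte[] data = reader.ReadBytes((int)dataLength);
            File.WriteAllBytes(Path.Combine(outputDir, name), data);
        }
    }
}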
But you generally don't want to invent your own file format.
Possible solution
Encrypt all the files
Store the encrypted data (byte[]) in a List<byte[]>, or store it separately in temp files (maybe file.oldext.newext.tmp)
Create a new XML file and write all the data to it (create a list within the XML file and store the data in the list elements; one element per file; see the sketch after this list)
Save the XML file to the disk (newfilename.newext)
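A sketch of that idea, where Encrypt is just a placeholder for whatever encryption routine you actually use:

using System;
using System.IO;
using System.Xml.Linq;

static class XmlContainer
{
    // Placeholder: swap in your real encryption here.
    static byte[] Encrypt(byte[] data) { return data; }

    public static void Save(string[] inputFiles, string outputPath)
    {
        var root = new XElement("files");
        foreach (string path in inputFiles)
        {
            byte[] encrypted = Encrypt(File.ReadAllBytes(path));
            root.Add(new XElement("file",
                new XAttribute("name", Path.GetFileName(path)),
                Convert.ToBase64String(encrypted)));          // one element per file
        }
        new XDocument(root).Save(outputPath);                 // e.g. newfilename.newext
    }
}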
We have a very old file delivery application (IPGear, if you have heard of it, written in Tcl). We upload our IP files there and our customers download them from the system.
When you upload a file to this application, it adds an .RCA extension to the uploaded file and adds some metadata to it. If we view the content of any file in a text editor (usually tgz, pdf and text files), we see some metadata added to the top of the file by the application (5-10 lines, readable).
If you download a file from the system, it somehow strips this metadata from the file and returns it as a TGZ file, which works fine (we can extract it).
If we find that RCA file on the storage where this application keeps files and edit out the metadata it has added via a text editor, we are able to extract the file without any problem, which is fine too. But we need to do this process for 22K files, so we need to script it.
We are able to find the bits the application adds by opening the file via StreamReader, stripping the metadata and writing the file back to disk via StreamWriter. However, the file we write is corrupted if it is a TGZ file. If we do the same thing for text files, they work.
The content of the tgz file looks like the screenshot below when we open it in a text editor.
The bits on lines 29-38 are the metadata we strip.
It looks like StreamReader is not able to write this content back to disk correctly, even though we tried different encoding settings.
One other note: the file we are trying to read and write is copied from a Solaris-based server to a local machine (Windows 7) via WinSCP.
So, my question is: what is the best way of reading a TGZ file into memory (as text) for manipulation, and saving it back without corruption? Are StreamReader and StreamWriter not good for this purpose?
I tried to give as much information as I can; please add comments if you need more clarification.
It looks like StreamReader is not able to write this content back to disk correctly, even though we tried different encoding settings.
Yes, because a tgz file isn't plain text. StreamReader and StreamWriter are for text content, not arbitrary binary content.
So, my question is: what is the best way of reading a TGZ file into memory (as text)
You don't. You read it as binary data, because it is binary data.
If the TGZ archive contains text files, you'll need to decompress the TGZ to the TAR format, then extract the relevant data from that. Then you can work with it as text. Before that point, it's just binary data.
But it sounds like you may actually just want to read the text information before the TGZ data... in which case you need to work out where that text ends, and not read any of the TGZ content as text (because it isn't text). This is non-trivial, but if you know the metadata is ASCII it'll be a bit easier. You still need to work out how to detect the end of the text and the start of the real content, and we can't really tell that from the screenshot you've given.
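One possible heuristic, assuming the real payload is a gzip stream (a .tgz is a gzipped tar, and gzip data always starts with the magic bytes 0x1F 0x8B, which can't appear in readable ASCII metadata): treat the whole file as binary and cut everything before the first occurrence of that marker. A rough sketch:

using System;
using System.IO;

static void StripRcaHeader(string inputPath, string outputPath)
{
    byte[] all = File.ReadAllBytes(inputPath);        // binary, no text encoding involved

    int start = -1;
    for (int i = 0; i < all.Length - 1; i++)
    {
        if (all[i] == 0x1F && all[i + 1] == 0x8B)     // gzip magic number
        {
            start = i;
            break;
        }
    }
    if (start < 0)
        throw new InvalidDataException("No gzip header found in " + inputPath);

    using (var output = File.Create(outputPath))
        output.Write(all, start, all.Length - start); // everything from the gzip header on
}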
I am developing an application which takes a backup of a DOCX file. For the initial backup I copy the entire file to the destination, but next time I want to perform an incremental backup, i.e. I want to back up only the segment of the DOCX file that has undergone changes. I need to find the most efficient way to do this.
I would really be thankful if I get any help in this regard.
The DOCX format is different from the earlier Microsoft Word formats, which use the file extension DOC, in that whereas a DOC file uses a text or binary format for storing a document, a DOCX file is based on XML and uses ZIP compression for a smaller file size. In other words, a DOCX file is a set of XML files that have been compressed using ZIP.
It might help if you use ZipFile to dissect the archive and tell which inner file has really changed, and then incrementally save only the changes in your VCS.
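For example, here's a sketch that compares a hash of each inner part of the current .docx against the previous backup, so only changed parts need to be stored (how you then store the diff is up to you):

using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;

static List<string> ChangedParts(string previousDocx, string currentDocx)
{
    var changed = new List<string>();
    Dictionary<string, string> oldHashes = HashEntries(previousDocx);

    foreach (var entry in HashEntries(currentDocx))
    {
        string oldHash;
        if (!oldHashes.TryGetValue(entry.Key, out oldHash) || oldHash != entry.Value)
            changed.Add(entry.Key);                   // new or modified part, e.g. word/document.xml
    }
    return changed;
}

static Dictionary<string, string> HashEntries(string docxPath)
{
    var hashes = new Dictionary<string, string>();
    using (ZipArchive archive = ZipFile.OpenRead(docxPath))   // a .docx is just a zip
    using (MD5 md5 = MD5.Create())
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            using (Stream stream = entry.Open())
                hashes[entry.FullName] = Convert.ToBase64String(md5.ComputeHash(stream));
        }
    }
    return hashes;
}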
I'm doing a project in C# which requires me to encrypt a wave file. So, is there a straightforward way to convert a wave file into binary and back? I'll be applying an encryption algorithm to the binary data.
Sure, check out the File.ReadAllBytes and File.WriteAllBytes methods. Or, if the files are too large to fit in memory, you could read/write in blocks using a FileStream.
You can use a FileStream to read a binary file, such as a WAV file, and then another FileStream to write the encrypted version back out.
You need to read and write the files in blocks using a byte array.
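For example, a sketch that streams the WAV file through a CryptoStream in blocks, assuming AES as the algorithm (substitute your own cipher if you have one):

using System;
using System.IO;
using System.Security.Cryptography;

static void EncryptWav(string inputPath, string outputPath, byte[] key, byte[] iv)
{
    using (Aes aes = Aes.Create())
    using (FileStream input = File.OpenRead(inputPath))
    using (FileStream output = File.Create(outputPath))
    using (var crypto = new CryptoStream(output, aes.CreateEncryptor(key, iv),
                                         CryptoStreamMode.Write))
    {
        input.CopyTo(crypto);   // copies block by block, so large files are fine
    }
}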
I have an XML file in the .zfo format that is compressed using the zip algorithm. I need to remove this compression so that the file is in usable XML form. Here is the file.
How can I remove this compression, or decompress this XML file?
It's not what you might imagine, i.e. a .zip file containing an XML file. Instead, the byte[] that's written to the file is zip compressed.
Thanks in advance.
That file isn't zip compressed at all. It appears to be some XML that's embedded in a certificate issued by the Czech Post Office. The actual message looks to be encoded in some kind of Base64 variant.
Call your post office.
Check out DotNetZip (http://www.codeplex.com/DotNetZip) - it probably does what you need (e.g., DeflateStream).
A zip file contains meta-data (file and directory structure) as well as the actually compressed data. It sounds like your file only has the compressed data. DotNetZip should be able to handle both.
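If the file body really is a raw deflate stream rather than a zip archive, the framework's own DeflateStream may be enough; a sketch:

using System.IO;
using System.IO.Compression;

static byte[] Inflate(string path)
{
    using (FileStream input = File.OpenRead(path))
    using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
    using (var output = new MemoryStream())
    {
        deflate.CopyTo(output);
        return output.ToArray();   // should now be the plain XML bytes
    }
}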