C# Windows Application input text file format

I need help developing a Windows application using C#.NET in VS2010. The functionality is very simple: the user supplies a text file, and my program is supposed to extract the relevant data from it and output it as CSV, plain text, or some other format.
My biggest problem whenever I deal with text files is the format. The input file looks perfect when opened in Notepad or WordPad, layout and all, but once I start processing it in code I realize that what I see is not how the data is stored inside the file. I have read many articles on Unicode, UTF and so on, but I still have no definite way to know exactly what format my file is in. The end result is that I get many exceptions.
In Unix shell scripting this used to be simple. There are good Unix commands like less, which is similar to more but also displays any formatting characters inside the file. There are also useful commands like unix2dos and dos2unix.
Nevertheless, is there some program, code, or professional method that can determine the exact format of my input file and then reformat it to "plain text", so that data extraction becomes easy and bug-free?
Thanks
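One way to approach the question above is to look at the raw bytes first: check for a byte-order mark (BOM) to pick the encoding, decode, and then normalize line endings the way dos2unix would. The following is only a minimal sketch under that assumption. The class and method names are made up for illustration, and a file without a BOM cannot be identified this way, so the caller has to supply a fallback encoding.

```csharp
using System;
using System.Text;

public static class TextNormalizer
{
    // Returns the encoding implied by a leading byte-order mark,
    // or null when the data carries no BOM.
    public static Encoding DetectBom(byte[] bytes)
    {
        // Test the 4-byte UTF-32 LE mark before UTF-16 LE: both begin with FF FE.
        // (A UTF-16 LE file whose first character is U+0000 would be misread here.)
        if (bytes.Length >= 4 && bytes[0] == 0xFF && bytes[1] == 0xFE && bytes[2] == 0 && bytes[3] == 0)
            return Encoding.UTF32;
        if (bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
            return Encoding.UTF8;
        if (bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
            return Encoding.Unicode;          // UTF-16 little-endian
        if (bytes.Length >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
            return Encoding.BigEndianUnicode; // UTF-16 big-endian
        return null;
    }

    // Decodes the bytes (skipping any BOM) and converts CRLF and lone CR to LF.
    // 'fallback' is used only when no BOM is found; it is the caller's guess.
    public static string ToPlainText(byte[] bytes, Encoding fallback)
    {
        Encoding detected = DetectBom(bytes);
        Encoding encoding = detected ?? fallback;
        int bomLength = detected == null ? 0 : detected.GetPreamble().Length;
        string text = encoding.GetString(bytes, bomLength, bytes.Length - bomLength);
        return text.Replace("\r\n", "\n").Replace("\r", "\n");
    }
}
```

A typical call would be TextNormalizer.ToPlainText(File.ReadAllBytes(path), Encoding.Default), where path is your input file. Opening the file in a hex viewer gives the same kind of insight that less gives on Unix.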

Related

Copy text from Word and get file location from clipboard

I want to achieve the following:
I copy some text in a Word application (or another application where the clipboard is updated with text). I now want to know the location path of the Word document, so I can store the path and open the document for a reference later on.
I would like to do it for websites as well, so I can get the website where the text was copied from.
I want to do this in C# on Windows 10. My initial thought was to create a Ctrl + C event listener, find the active application, and get the location that way. But I can't link the copied text and the document path together.
Any ideas out there?
You can retrieve such information, but with limitations. When no such information is stored in the clipboard you're out of luck but, fortunately for you, many applications store far more data in the clipboard along with the text, including the path or URL of the document.
Different applications use different formats to track the document location, so the main idea is to read, one by one, every clipboard format that may include the document location and try to extract it.
Here are several clipboard formats that contain, or can contain, the information you need:
HTML Format
msSourceUrl
FileName and FileNameW
UniformResourceLocator and UniformResourceLocatorW
ObjectLink
Hyperlink
etc.
You can find more about different clipboard formats here. You can also use any clipboard format viewer to see what different applications actually store in the clipboard.
For example, all modern browsers and all Microsoft Office suite applications store the actual document location in the clipboard, in HTML Format, as plain text:
Version:1.0
StartHTML:000000271
EndHTML:000008359
StartFragment:000008219
EndFragment:000008255
StartSelection:000008219
EndSelection:000008255
SourceURL:http://stackoverflow.com/questions/42672385/copy-text-from-word-and-get-file-location-from-clipboard-c-sharp
...
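Given a header like the one above, extracting the location is a matter of scanning the header lines for the SourceURL key. Here is a small sketch, assuming you already have the HTML Format payload as a string. The class and method names are illustrative only, and applications that omit SourceURL will simply yield null.

```csharp
using System;
using System.IO;

public static class ClipboardHtml
{
    // Returns the SourceURL value from an "HTML Format" clipboard payload,
    // or null when the header does not contain one.
    public static string GetSourceUrl(string cfHtml)
    {
        if (string.IsNullOrEmpty(cfHtml)) return null;
        const string key = "SourceURL:";
        using (var reader = new StringReader(cfHtml))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // The header ends where the markup begins.
                if (line.StartsWith("<", StringComparison.Ordinal)) break;
                if (line.StartsWith(key, StringComparison.OrdinalIgnoreCase))
                    return line.Substring(key.Length).Trim();
            }
        }
        return null;
    }
}
```

In a Windows Forms application running on an STA thread, the payload itself can be read with Clipboard.GetText(TextDataFormat.Html) and passed to GetSourceUrl.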
You can't do this, I am afraid. This information is not made available in the clipboard.
Even if you listen for Ctrl + C and find an active instance of Word running, you still won't be 100% sure you get what you need. It might be a new document that has not even been saved to disk yet. An even more convoluted case: the user copies some text from an edit field on a dialog in Word.

Program to put text into LibreOffice document

I'm trying to make a program in C# which should put text into an opened LibreOffice document (Writer).
At first the user can make some decisions about the text (saved to string variables), and when a button is clicked the program should put the text from these strings into the document.
How can I do that?
LibreOffice uses the OpenDocument Format (ODF), an XML-based format that is usually compressed as a zip archive, which makes it easy to work with. AODL is the only open-source library I have found (see the links below), and I'm also sure the .NET libraries can do the heavy lifting for you. Here are some tutorials and links to help you out.
AODL allows your application to support the OpenDocument Format (open-source .NET library)
How to Read and Write ODF/ODS File
Read and write ODF/ODS files (OpenDocument Spreadsheets)
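To illustrate the "zipped XML" point without any third-party library, here is a rough sketch that appends a paragraph to the content.xml entry of a saved .odt file using only System.IO.Compression and LINQ to XML (.NET 4.5 or later). It is not AODL's API, the class and method names are invented for this example, and it has not been tested against every document LibreOffice can produce. It edits a saved file on disk; it does not talk to a running Writer instance.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

public static class OdtAppender
{
    // Standard ODF namespaces for the office: and text: prefixes.
    static readonly XNamespace Office = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    static readonly XNamespace Text = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Adds <text:p>paragraph</text:p> at the end of office:body/office:text
    // and returns the updated XML.
    public static string AppendParagraph(string contentXml, string paragraph)
    {
        XDocument doc = XDocument.Parse(contentXml, LoadOptions.PreserveWhitespace);
        XElement body = doc.Root.Element(Office + "body");
        XElement textBody = body == null ? null : body.Element(Office + "text");
        if (textBody == null)
            throw new InvalidDataException("office:text element not found.");
        textBody.Add(new XElement(Text + "p", paragraph));
        string declaration = doc.Declaration == null ? "" : doc.Declaration.ToString();
        return declaration + doc.ToString(SaveOptions.DisableFormatting);
    }

    // Rewrites content.xml inside the .odt archive at odtPath.
    public static void AppendParagraphToFile(string odtPath, string paragraph)
    {
        using (ZipArchive archive = ZipFile.Open(odtPath, ZipArchiveMode.Update))
        {
            ZipArchiveEntry entry = archive.GetEntry("content.xml");
            if (entry == null)
                throw new InvalidDataException("content.xml not found in archive.");
            string xml;
            using (var reader = new StreamReader(entry.Open()))
                xml = reader.ReadToEnd();
            string updated = AppendParagraph(xml, paragraph);
            // Replace the entry: delete the old one, then write a new one.
            entry.Delete();
            ZipArchiveEntry replacement = archive.CreateEntry("content.xml");
            using (var writer = new StreamWriter(replacement.Open()))
                writer.Write(updated);
        }
    }
}
```

Work on a copy of the document while experimenting, since a malformed content.xml will stop the file from opening.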

Steganography - hide text file within another text file in c#.net

I have already seen solutions for hiding text files or messages within image or audio files,
but I want a solution for hiding a text file within another text file (.txt, .doc, .pdf).
Can somebody help with this?
Steganography is based on slightly changing data to "hide" some other set of data within these changes. That's why an image with steganography is slightly different from the original. You can't notice it if you don't know it's there, but the fact is that you saved the data as changes within the color information of pixels.
A .txt file is nothing more than a big hunk of characters. If you tried to somehow change the data to hide something in it, the result would be unreadable text. If you change the color of a pixel from 215 Red to 217 Red, you won't really notice. But changing A to F or Ł is quite noticeable.
So no, I don't believe it can be done. At least not with .txt files.
While I agree with @stonehead that, at the end of the day, if you put something in the file someone can find it, there are a few tricks out there that may prove to be viable options.
Since most users do not live in their command prompt, the most straightforward approach is to misrepresent the file to the GUI. This is a pretty handy trick for that.
http://www.howtogeek.com/howto/windows-vista/stupid-geek-tricks-hide-data-in-a-secret-text-file-compartment/
If you are storing data in a PDF you should have very few problems. I would use PDF Clown. Not to get too into it, but you will want to read up on the structure of a PDF. With PDF Clown you could store an asset inside the file with no connection to the presentation layer. Given the complexity of PDF files, I would almost bet no one will be looking in there. I would Base64-encode the chunk so that it blends in with images and other data; it would then be difficult for someone to find it just by opening the file.
Be forewarned: the PDF Clown C# library is not for the faint of heart, and it will help to have some Java experience because a lot of its docs are for Java.
Hope these options help.
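The Base64 step mentioned above is straightforward with the framework's own Convert class. This sketch covers only that encoding step, not the PDF embedding itself, and the class name is made up. Keep in mind that Base64 is an encoding, not encryption: anyone who finds the chunk can decode it.

```csharp
using System;
using System.Text;

public static class PayloadCodec
{
    // Turns the secret text into a Base64 string that looks like opaque data.
    public static string Encode(string secret)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(secret));
    }

    // Reverses Encode and returns the original text.
    public static string Decode(string encoded)
    {
        return Encoding.UTF8.GetString(Convert.FromBase64String(encoded));
    }
}
```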

C# Text File upload and download issue

Okay so I have an application which uploads a text file to a web server and all works fine.
However, a line magically appears on the text file when it is downloaded
example:
textfile contains = Hello World
downloaded textfile contains = //notice the blank line here
Hello World
Normally this wouldn't be a problem, as I would just create a temp file and delete the line.
However, the text file contains encrypted data, and if I create a new temp file to delete the line it completely messes up the encrypted text and causes
"Bad Data" and "length of data to decrypt is invalid" errors.
I'm almost 100% sure it's not my encryption algorithm, as the text files are output before they are uploaded and decryption works fine on the files that were never uploaded.
If you guys could help me that would be awesome. Any work around will do (no matter how horrible / nasty it is).
Do the server and client run the same family of operating system? I'm thinking that this may be due to newline sequence differences, and uploading and downloading in different modes (text/binary).
If the data is encrypted or cryptographically signed, you want to do everything you can to make sure the transfers are done in binary mode.
What does the download code look like?
Making a wild guess: you are Response.Write()ing the text, without a Response.Clear() to clear any "aspx text". Plus you need that code to end on a Response.End() to prevent further additions to the text.
It looks like your encryption algorithm is appending a null terminator to your text.
Try loading the text file on your web server into a byte array and see if the last byte is '\0'.
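Whichever of these guesses is right, comparing the raw bytes of the original and the downloaded file will show exactly what was added and where. Here is a small sketch of such a check; the helper names are invented for illustration, and in practice the byte arrays would come from File.ReadAllBytes on both files.

```csharp
using System;

public static class TransferCheck
{
    // Index of the first byte at which the buffers differ, or -1 if they are identical.
    public static int FirstDifference(byte[] original, byte[] downloaded)
    {
        int shared = Math.Min(original.Length, downloaded.Length);
        for (int i = 0; i < shared; i++)
        {
            if (original[i] != downloaded[i]) return i;
        }
        return original.Length == downloaded.Length ? -1 : shared;
    }

    // Number of CR (13) or LF (10) bytes at the very start of the data,
    // which is what a stray blank first line looks like at byte level.
    public static int CountLeadingNewlineBytes(byte[] data)
    {
        int count = 0;
        while (count < data.Length && (data[count] == 13 || data[count] == 10))
        {
            count++;
        }
        return count;
    }

    // True when the final byte is '\0'.
    public static bool EndsWithNull(byte[] data)
    {
        return data.Length > 0 && data[data.Length - 1] == 0;
    }
}
```

If FirstDifference returns 0 and CountLeadingNewlineBytes is nonzero on the downloaded copy, the extra line was added in transit rather than by the encryption step.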
There are two reasons something like this can happen:
You are making some changes on upload (such as parsing the text or manipulating the data in a way that introduces this line).
You are reading the file and manipulating it before you download it.
Check both pieces of code, and post some samples if you are actually manipulating the file. I have uploaded files using C# and it works fine.
You should check Hanselman's blog for a simple upload application. It is straightforward.

Edit Text in SWF File

I am wondering about the capabilities of the SWF format. I have some text in a Flash video, an email address (xyz@somewhere.com), which I would like to edit with a C# application. I have the SWF File Format Spec and was wondering if the following is possible:
I read in the tag (I am assuming this text is stored as a static text tag; correct me if I am wrong). Once I have found the correct tag for my text, I change the value in the tag and save the SWF file. Do you see any problems with this approach?
Chris
Yes, almost certainly there is more to this.
First, the swf is most likely compressed and will need decompressing.
Locating the value you need to change should not be too hard, but you will need to change at least one (probably several) field lengths to cater for the change. This in turn will probably require you to parse the rest of the SWF in order to recalculate the lengths of the various blocks that will be altered.
Not for the faint-hearted.
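To make the first two points concrete: per the SWF file format, the first three bytes are a signature (FWS for uncompressed, CWS for zlib-compressed, ZWS for LZMA-compressed), followed by a version byte and a little-endian 32-bit uncompressed file length, and each tag starts with a 16-bit value that packs the tag code and a short length. The sketch below only reads those fields. It does not decompress or rewrite anything, and the class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class SwfHeader
{
    // Maps the 3-byte signature to the compression scheme it indicates.
    public static string DescribeCompression(byte[] file)
    {
        if (file.Length < 8)
            throw new ArgumentException("Too short to be a SWF file.");
        string signature = Encoding.ASCII.GetString(file, 0, 3);
        switch (signature)
        {
            case "FWS": return "uncompressed";
            case "CWS": return "zlib";
            case "ZWS": return "lzma";
            default: throw new ArgumentException("Not a SWF signature: " + signature);
        }
    }

    // Byte 3 holds the SWF version.
    public static int Version(byte[] file)
    {
        return file[3];
    }

    // Bytes 4..7 hold the uncompressed file length, little-endian.
    // This is one of the fields that must be updated if the text changes size.
    public static uint UncompressedLength(byte[] file)
    {
        return (uint)(file[4] | (file[5] << 8) | (file[6] << 16) | (file[7] << 24));
    }

    // Splits a tag's 16-bit record header: the upper 10 bits are the tag code,
    // the lower 6 bits the length. A length of 0x3F means a 32-bit length follows.
    public static void DecodeTagHeader(ushort codeAndLength, out int tagCode, out int shortLength)
    {
        tagCode = codeAndLength >> 6;
        shortLength = codeAndLength & 0x3F;
    }
}
```

Editing the text then means updating the tag's own length, any enclosing lengths, and the file length in the header, and recompressing if the signature was CWS or ZWS.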
