Difference between a .csv and a .txt file - c#

What is the difference between these two file formats?
I found the following description online:
.txt file:
This is a plain text file that can be opened with Notepad, which is present on all desktop PCs running any version of MS Windows. You can store any type of text in this file; there is no restriction whatsoever on the text format. Because they are easy for end users to work with, many providers of daily data summaries use .txt files. These files contain data which is properly comma separated.
.csv file: abbreviation of "comma separated values"
This is a special file extension commonly used by MS Excel. Basically, this is also a plain text file, but restricted to comma separated values. Normally, when you double-click this type of file, it opens in MS Excel. If you do not have MS Excel installed, or you find Notepad easier to use, you can also open the file in Notepad by right-clicking it, selecting "Open With" from the menu, and choosing Notepad.
My questions:
What does "comma separated values" mean?
If I create a .csv file in C#, do I need to write the file using StreamWriter, and is it enough to change the extension to .csv?
If so, do I need to put commas into the strings I write?
Thanks.

What does "comma separated values" mean?
Values separated by commas, for example:
Name,Id,3,Address
If I create a .csv file in C#, do I need to write the file using StreamWriter, and is it enough to change the extension to .csv?
Changing the file extension to .csv helps with opening the file in MS Excel. Other than that, the extension can be anything, and you can still read the file from your code (for example with StreamReader).
If so, do I need to put commas into the strings I write?
Yes, separate your values with commas, or use any other delimiter you like. A semicolon (;) is a common alternative, because in some languages/cultures the comma is used as the decimal separator.
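As a rough illustration of the points above, here is a minimal C# sketch that writes rows with StreamWriter and a configurable delimiter. The class name, method name and file names are made up for this example, and it assumes the values never contain the delimiter or line breaks, since no quoting is done.

```csharp
using System.Collections.Generic;
using System.IO;

public static class DelimitedWriterSketch
{
    // Writes each row as one line, joining its values with the given delimiter.
    // Assumption: no value contains the delimiter or a line break (no quoting here).
    public static void WriteRows(string path, IEnumerable<string[]> rows, string delimiter = ",")
    {
        using (var writer = new StreamWriter(path))
        {
            foreach (string[] row in rows)
            {
                writer.WriteLine(string.Join(delimiter, row));
            }
        }
    }
}
```

A call such as DelimitedWriterSketch.WriteRows("people.csv", new[] { new[] { "Name", "Age" }, new[] { "TilT", "25" } }) would write two comma separated lines; passing ";" as the third argument switches to semicolons. The extension in the path only affects which program opens the file by default.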

CSV is structured like this:
"value","value1","value2"
A text file can be anything: delimited, free form, fixed width, jagged right, etc.
CSV files can be a pain in the ass if you have commas in your data, and don't properly qualify the values.
I typically create tab delimited or pipe delimited files.
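To show what "properly qualifying" values can look like, here is a minimal C# sketch. It follows the common convention (also described in RFC 4180) of wrapping a field in double quotes when it contains a comma, a double quote or a line break, and doubling any embedded double quotes. The class and method names are hypothetical, chosen only for this example.

```csharp
using System.Linq;

public static class CsvQuotingSketch
{
    // Quotes a single field only when it needs it; embedded quotes are doubled.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        if (!needsQuotes) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Builds one CSV line from the given fields.
    public static string ToLine(params string[] fields)
    {
        return string.Join(",", fields.Select(f => Escape(f)));
    }
}
```

With this, CsvQuotingSketch.ToLine("Smith, John", "25") keeps the name as one field, because the embedded comma ends up inside quotes. Tab or pipe delimiters avoid most of this only as long as the data never contains tabs or pipes.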

From a programming perspective, file extensions make no difference. In fact, you can write comma separated values into a .txt file.
"Comma separated values" simply means that the values are separated by commas; this is helpful if you want to store some data and share it across multiple systems (on the other hand, XML may be a better option for that).
Assume you need to store a name, an age and a location;
TilT,25,Germany
is comma separated data.
In C#, you need to add commas between your values, and you may save the result as a CSV file or a TXT file; it makes no difference.

Related

Convert CSV to CSV MS-Dos Extension

I have a process in SSIS that outputs SQL table data to CSV format. However, I want the output CSV in CSV (MS-DOS). Is there a way I can convert the normal CSV file to CSV (MS-DOS) ? (Like C# code that would convert the extension/type) . I tried using the option available in visual studio in SSIS, and couldn't find the solution towards it. Your help is appreciated.
By default, the output format is in CSV(Comma delimited, highlighted blue). I want that to be converted to CSV(MS-DOS, highlighted yellow).
If this article is accurate, https://excelribbon.tips.net/T009508_Comma-Delimited_and_MS-DOS_CSV_Variations.html then getting a CSV (MS-DOS) output will be fairly straightforward. Quoting from it:
if you have certain special characters in text fields; for example, an accented (foreign language) character. If you export as Windows CSV, those fields are encoded using the Windows-1252 code page. DOS encoding usually uses code page 437, which maps characters used in old pre-Windows PCs.
So you need to define two Flat File Connection Managers. The first will use 1252 (ANSI - Latin I) as its code page and point to C:\ssisdata\input\File.csv. The second will use 437 (OEM - United States) and point to C:\ssisdata\input\DOSFile.csv (this way you create a new file instead of clobbering the existing one).
Your Data Flow then becomes a Flat File Source to Flat File Destination.
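Since the question also asks about C# code, the same re-encoding can be sketched outside SSIS. This assumes, as the article suggests, that the only difference between the two variants is the code page (Windows-1252 versus 437). The class and method names are made up for this example. The RegisterProvider call is needed on .NET Core / .NET 5 and later, where these code pages are not registered by default; on .NET Framework it should not be necessary.

```csharp
using System.IO;
using System.Text;

public static class CodePageSketch
{
    // Reads a text file in one code page and writes it back out in another.
    // Defaults: Windows-1252 in, code page 437 (OEM United States) out.
    public static void ConvertCodePage(string inputPath, string outputPath,
                                       int fromCodePage = 1252, int toCodePage = 437)
    {
        // Makes legacy code pages such as 1252 and 437 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding source = Encoding.GetEncoding(fromCodePage);
        Encoding target = Encoding.GetEncoding(toCodePage);

        string text = File.ReadAllText(inputPath, source);
        File.WriteAllText(outputPath, text, target);
    }
}
```

Characters that do not exist in the target code page are replaced with a fallback character by default, so the conversion can be lossy for some input.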

Is csv with multi tabs/sheet possible?

I am calling a web service and the data from the web service is in csv format.
If I try to save data in xls/xlsx, then I get multiple sheets in a workbook.
So, how can I save the data as CSV with multiple tabs/sheets in C#?
I know csv with multiple tabs is not practical, but is there any damn way or any library to save data in csv with multiple tabs/sheet?
CSV, as a file format, assumes one "table" of data; in Excel terms that's one sheet of a workbook. While it's just plain text, and you can interpret it any way you want, the "standard" CSV format does not support what your supervisor is thinking.
You can fudge what you want a couple of ways:
Use a different file for each sheet, with related but distinct names, like "Book1_Sheet1", "Book1_Sheet2" etc. You can then find groups of related files by the text before the first underscore. This is the easiest to implement, but requires users to schlep around multiple files per logical "workbook", and if one gets lost in the shuffle you've lost that data.
Do the above, and also "zip" the files into a single archive you can move around. You keep the pure CSV advantage of the above option, plus the convenience of having one file to move instead of several, but the downside of having to zip/unzip the archive to get to the actual files. To ease the pain, if you're in .NET 4.5 you have access to a built-in ZipFile implementation, and if you are not you can use the open-source DotNetZip or SharpZipLib, any of which will allow you to programmatically create and consume standard Windows ZIP files. You can also use the nearly universal .tar.gz (aka .tgz) combination, but your users will need either your program or a third-party compression tool like 7Zip or WinRAR to create the archive from a set of exported CSVs.
Implement a quasi-CSV format where a blank line (containing only a newline) acts as a "tab separator", and your parser would expect a new line of column headers followed by data rows in the new configuration. This variant of standard CSV may not be readable by other consumers of CSVs, as it doesn't adhere to the expected file format. For that reason I would recommend you don't use the ".csv" extension, as it will confuse and frustrate users expecting to be able to open the file in other applications like spreadsheets.
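The second option above (one CSV per sheet, bundled into a single archive) could be sketched roughly as follows, using the ZipArchive class from System.IO.Compression that ships with .NET 4.5 and later. The class name, method name and the "Book_Sheet.csv" entry naming are assumptions made for this example, and the CSV text for each sheet is expected to be built elsewhere.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class CsvBundleSketch
{
    // Writes one ZIP file containing one CSV entry per sheet.
    // 'sheets' maps a sheet name to the CSV text of that sheet.
    public static void WriteBundle(string zipPath, string bookName, IDictionary<string, string> sheets)
    {
        using (FileStream stream = new FileStream(zipPath, FileMode.Create))
        using (ZipArchive archive = new ZipArchive(stream, ZipArchiveMode.Create))
        {
            foreach (KeyValuePair<string, string> sheet in sheets)
            {
                // Entry names follow the "Book1_Sheet1" convention from the first option.
                ZipArchiveEntry entry = archive.CreateEntry(bookName + "_" + sheet.Key + ".csv");
                using (StreamWriter writer = new StreamWriter(entry.Open()))
                {
                    writer.Write(sheet.Value);
                }
            }
        }
    }
}
```

Each entry inside the archive is still an ordinary CSV file, so other tools can read the sheets once the archive is extracted.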
If I try to save data in xls/xlsx, then I get multiple sheets in a workbook.
Your answer is in your question: don't use text/csv, but save the data as xls, xlsx, ods or another format that does have sheets. text/csv most certainly cannot do multiple sheets; it can't even do one sheet. There is no such thing as a sheet in text/csv, only in how some applications like Excel or Calc choose to import it into a format that does have sheets.
Both XLSX and ODS are much more complicated than text/csv, but are each probably the most straightforward of their respective sets of formats.
I've been using this library for a while now,
https://github.com/SheetJS/js-xlsx
in my projects to import data and structure from formats like xls(x), csv and xml, but you can certainly save in those formats as well (all from the client)!
Hope that helps. Take a look at the online demo,
http://oss.sheetjs.com/js-xlsx/
peek at the source code, or file an issue on GitHub. I think you will have to do most of the coding on your own, though.
I think you want to reduce the size of your excel file. If yes then you can do it by saving it as xlsb i.e., Excel Binary Workbook format. Further, you can reduce your file size by deleting all the blank cells.

How to read comma file (.csv) and tab separated file(.txt) by a single Flat File Source SSIS Data Flow Item?

I'm new to SSIS, and in one of my projects I need to read both a comma separated file (.csv) and a tab separated file (.txt) with a single Flat File Source. Unfortunately, I cannot make the .txt file content comma separated, because that is how it is generated by the separate system that provides the source files, and I have no option to modify the file content before it is supplied to SSIS. Any help based on your experience would be appreciated.

How to put HH:mm:ss formatted times in CSV so Excel formats them as regular text?

I'm writing times into a CSV file using the format HH:mm:ss. When Excel opens the file, it automatically recognizes this column format as time. Is it possible to prevent Excel from doing this so the column is formatted as regular text?
Thank you
There was recently a similar question on SuperUser.
The same principle as the accepted answer can be employed here if all you want is a CSV that can be opened with Excel without the ill-effects of autoformatting. You'll need to write your values to the CSV in this format:
="yourdatetimehere"
Of course the downside is that the equal signs and quotation marks will be stored in your CSV as text. This means that this will probably cause problems for you if you plan to use the CSV in any context outside Excel. But as a hack to get around Excel's autoformatting, this should work.
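A small C# helper for producing that format might look like the sketch below. The class and method names are made up for this example. It assumes the value itself contains no double quotes, commas or line breaks, which holds for HH:mm:ss times, and it rejects anything else rather than guessing how Excel would interpret it.

```csharp
using System;

public static class ExcelTextSketch
{
    // Wraps a value as ="value" so that Excel shows it as text when opening the CSV.
    // Assumption: the value has no quotes, commas or line breaks (true for HH:mm:ss).
    public static string AsExcelText(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (value.IndexOfAny(new[] { '"', ',', '\r', '\n' }) >= 0)
        {
            throw new ArgumentException("Value must not contain quotes, commas or line breaks.", nameof(value));
        }
        return "=\"" + value + "\"";
    }
}
```

For a DateTime, something like ExcelTextSketch.AsExcelText(time.ToString("HH:mm:ss")) would produce the field to write into the CSV line.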
You have no control over formatting in a CSV file, unless you want to go through the full custom-import setup in Excel each time.
If you want to force Excel to treat something as text, then use a proper Excel file, generated using PHPExcel

How to get rid of special characters at the beginning, while using File.ReadAllLines in C#

I tried string[] file = File.ReadAllLines(file_name) to read a Word file.
In debug mode I found that the first few elements of the string array file have values like
"��ࡱ�0\0\0\0>\0\0��\t\0\0\0\0\0". How can I get rid of this?
For some files the first 3 elements of file[] are filled with these characters, while for a few files only the first element is filled with these unreadable characters.
What is the problem, and how can I get rid of it? My Word file does not even have a blank line at the beginning.
The problem is you're not opening the file with the correct encoding. Here is a guide to opening and creating Word documents from C#.
File.ReadAllLines is intended for text files. Word files are not text files. To read Word files you might need a library.
If you are using .NET 3.5 then I'd suggest that you use a LINQ where clause to return only the lines that you're interested in.
string[] file = File.ReadAllLines(file_name).Where(line => !line.StartsWith("��")).ToArray();
You could also use some form of regular expression instead of the line.StartsWith() method.
Note: If you are reading Microsoft Office Word files I'd recommend that you use the COM Interop or 3rd party library to read the MS Word Document (you'll find it much easier than trying to parse the file yourself).
Word files are not simple text files, so will have additional binary information embedded.
You should use a library that will read word documents if you want to extract the text properly, instead of File.ReadAllLines.
Here are a couple of such libraries.
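Before handing a file to File.ReadAllLines, it can help to check whether it is a binary Office file at all. The sketch below looks at the first bytes of the file: legacy .doc files start with the OLE compound file signature D0 CF 11 E0 A1 B1 1A E1 (which is what shows up as the unreadable characters in the question), and .docx files are ZIP archives that normally start with 50 4B 03 04. The class and method names are hypothetical, and the check is only a heuristic: the ZIP signature matches any ZIP file, not just .docx.

```csharp
using System.IO;

public static class FileSniffSketch
{
    // Signature of OLE compound files (legacy .doc, .xls, .ppt).
    private static readonly byte[] OleSignature = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
    // Local file header signature of ZIP archives (.docx, .xlsx and any other ZIP).
    private static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns true if the file starts with one of the two signatures above.
    public static bool LooksLikeBinaryOfficeFile(string path)
    {
        byte[] header = new byte[8];
        int read;
        using (FileStream stream = File.OpenRead(path))
        {
            read = stream.Read(header, 0, header.Length);
        }
        return StartsWith(header, read, OleSignature) || StartsWith(header, read, ZipSignature);
    }

    private static bool StartsWith(byte[] buffer, int length, byte[] signature)
    {
        if (length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (buffer[i] != signature[i]) return false;
        }
        return true;
    }
}
```

If the method returns true, the file should go to a Word-aware library rather than to File.ReadAllLines.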
