Weird Issue in Excel When Saved as ==> Csv - c#

I have an Excel file that contains some data. When I save the file as CSV, weird ? marks appear before and after the text. Can anyone please tell me how I can resolve this issue?
?XXXXXX-XXX?
The Excel file can be downloaded here: http://www.filedropper.com/book1_5

In this file, in the column C you've got following data:
"‭0000468750-IN‬"
"‭0000468750-IN‬"
"‭0000843576AB‬"
"‭0000843576AB‬"
It is not really visible here, but at the start and end of every number there is an additional invisible character. You can see it for yourself: edit the cell and move through the text with the arrow keys. The cursor makes a little pause when it moves over the invisible character. If I replace it with an underscore, it looks like this:
"_0000468750-IN_"
"_0000468750-IN_"
"_0000843576AB_"
"_0000843576AB_"
If my text editor doesn't cheat on me, that character has code 0x00, and it's called null-character.
When converting to CSV, Excel didn't know what to do with that character. CSV is a textfile and must follow some encoding rules. For example, if you saved it as CSV/ANSI, then it's not possible to store some Unicode characters like ąęćżń. Similarly, it's usually not possible to store a 0x00 character in a textfile at all, because this character is special in most encodings. With this character inside, such textfile could be detected as "binary file" by readers and rejected.
Excel simply replaced that odd character with a "?" character to make the data safe for the CSV format. Excel didn't just erase the 0x00 character, so that you would know there was something odd in the original data.
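The substitution described above can be reproduced outside Excel. This is a minimal Python sketch, assuming that a "CSV/ANSI" export behaves like encoding to Windows-1252 with every unencodable character replaced by "?". Excel's real code page depends on the system locale, and U+202D is used here only as an illustrative invisible character; it is not necessarily the one in the asker's file.

```python
def to_ansi(text, codepage="cp1252"):
    """Encode text to a single-byte code page; anything that does not fit becomes '?'."""
    return text.encode(codepage, errors="replace")

# Polish letters have no slot in Windows-1252, so each one is replaced.
polish = to_ansi("ąęćżń")

# Illustrative invisible formatting characters wrapped around a value.
wrapped = to_ansi("\u202d" + "0000468750-IN" + "\u202c")

print(polish)
print(wrapped)
```

The second result has the same shape as the ?XXXXXX-XXX? value from the question: the visible text survives and the characters that cannot be encoded turn into question marks.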
It's very strange to see it in textual data. If this XLSX was generated by a computer program, it might indicate that the program has a bug. I highly doubt that this file was created manually, because it's really hard to type a "0x00" character by hand. One way I can think of to get this manually is by using a crappy barcode reader and scanning the codes right into the Excel sheet. Barcode scanning software sometimes leaks control characters into the text data stream. If that's the case, change the reader or write a filter that cuts those characters out.
By the way, you should be able to just find and replace all of those strange characters. Edit one of the cells (F2 key), go to the end of the text (END key), select the LAST character of the text (Shift + LeftArrow ONCE), copy that character (Control + C), then open the Find & Replace window (Control + H), paste the character into "Find" and press "Replace All".
On my Excel this resulted in finding/replacing 8 such characters, so it works.
Note that after the END key you must press ShiftLeft exactly ONCE. The cursor will not move and nothing will happen, no selection will show up. That's because the character is invisible. But it is there, and it will be selected and copied.
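If the cleanup has to happen in code rather than in the Find & Replace dialog, the "filter that cuts those characters out" could look roughly like the sketch below. It is written in Python and assumes that dropping every Unicode control (Cc) and format (Cf) character is acceptable for the data; the function name and the `keep` parameter are made up for illustration.

```python
import unicodedata

def strip_invisible(text, keep="\t\r\n"):
    """Drop control (Cc) and format (Cf) characters, keeping tabs and line breaks."""
    return "".join(
        ch for ch in text
        if ch in keep or unicodedata.category(ch) not in ("Cc", "Cf")
    )

# Two illustrative inputs: one wrapped in NUL characters, one in directional marks.
dirty = [
    "\x00" + "0000468750-IN" + "\x00",
    "\u202d" + "0000843576AB" + "\u202c",
]
cleaned = [strip_invisible(s) for s in dirty]
print(cleaned)
```

Filtering by category rather than by one specific code point means the filter still works if the invisible character turns out to be something other than 0x00.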

Related

Why does the Notepad++ [NULL] character not paste?

I am new to this site, and I don't know if I am providing enough info - I'll do my best =)
If you use Notepad++, then you will know what I am talking about -- When a user loads a .exe into Notepad++, the NUL / \x0 character is replaced by NULL, which has a black background, and white text. I tried pasting it into Visual Studio, hoping to obtain the same output, but it just pasted some spaces...
Does anyone know if this is a certain key-combination, or something? I would like to put the NULL character in replacement of \x0, just like Notepad++ =)
Notepad++ is a rich text editor, unlike the regular Notepad. It can display custom graphics, as is common in modern text editors. When Notepad++ reads a file and encounters the ASCII code of a null character, it does not display nothing; instead it adds the string "NULL" to the UI with a black background and white text, which is what you are seeing. You can show any custom style in your own rich text editor too.
NOTE: This is by no means an efficient solution. I traverse each line twice just to take advantage of existing methods; it can be done manually in a single pass. The code is only meant as a hint about how you can do it. I wrote it carefully but haven't run it, because I don't have the tools at the moment. I apologise for any mistakes; let me know and I'll update it.
Step 1: Read the text file line by line (a line ends at '\n') and replace all instances of the null character in that line with the string "NUL" using String.Replace(). Then append the modified text to your RichTextBox.
Step 2: Traverse the line again using String.IndexOf() to find the start index of each "NUL" word. Use these indexes to select text in the RichTextBox, then style the selected text using RichTextBox.SelectionColor and RichTextBox.SelectionBackColor.
richTextBoxCursor simply holds the start index of the current line in the RichTextBox.
StreamReader sr = new StreamReader(@"c:\test.txt", Encoding.UTF8);
int richTextBoxCursor = 0;
int i = 0;
while (!sr.EndOfStream){
    // Start index of the current line inside the RichTextBox
    richTextBoxCursor = richTextBox.TextLength;
    string line = sr.ReadLine();
    // Make every null character visible as the word "NUL"
    line = line.Replace(Convert.ToChar(0x0).ToString(), "NUL");
    richTextBox.AppendText(line);
    i = 0;
    while(true){
        i = line.IndexOf("NUL", i);
        if(i == -1) break;
        // Select(start, length) selects `length` characters beginning at index `start`
        // i is the start index of each found "NUL" word in our read line
        // 3 is the length because the "NUL" word has three characters
        richTextBox.Select(richTextBoxCursor + i, 3);
        richTextBox.SelectionColor = Color.White;
        richTextBox.SelectionBackColor = Color.Black;
        i++;
    }
}
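The single-pass variant mentioned in the note can be sketched independently of any UI toolkit. The Python below is only an illustration of the idea: it builds the display string and records where each "NUL" token lands, so that a rich text control could style those ranges afterwards. The function name and the sample input are made up.

```python
def mark_nuls(text, token="NUL"):
    """Replace each NUL character with a visible token and record where the tokens land."""
    pieces = []   # parts of the display string
    spans = []    # (start, length) of every inserted token, in display coordinates
    pos = 0       # current length of the display string
    for ch in text:
        if ch == "\x00":
            spans.append((pos, len(token)))
            piece = token
        else:
            piece = ch
        pieces.append(piece)
        pos += len(piece)
    return "".join(pieces), spans

display, spans = mark_nuls("MZ\x00\x00ab\x00")
print(display, spans)
```

Each `(start, length)` pair corresponds to one `Select(start, length)` call in the C# version, so the second `IndexOf` traversal is no longer needed.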
Notepad++ may use custom or special fonts to show these particular characters. This behavior may not be appropriate for all text editors, so they don't show them.
If you want to write a text editor that visualizes these characters, you probably need to implement this behavior programmatically. Looking at the Notepad++ source can be helpful if you want.
Text editor
As far as I know, in order to make Visual Studio display non-printable characters you need to install an extension from the marketplace at https://marketplace.visualstudio.com.
One such extension, which I have neither tried nor recommend (I just did a quick search and this is the first result), is
Invisible Character Visualizer.
Having said that, copy-pasting binaries is a risky business.
You may try Edit > Advanced > View White Space first.
Binary editor
To really see what's going on you could use the VS' binary editor: File->Open->(Open with... option)->Binary Editor -> OK
To answer your question.
It's a symbolic representation of 00H double byte.
You're copying and pasting the values. Notepad++ is showing you symbols that replace the representation of those values (because you configured it to do so in that IDE).

Mixture of RTL and LTR in windows file name

I am trying to write a file to disk (in Windows) that contains both RTL (right to left) and LTR (left to right) text.
The filename is composed from different bits of data eg:
{data_part1} - {data_part2} - {data_part3} - {data_part4}.{extension}
Any of the data parts could be RTL or LTR.
What I have noticed is that if data part2 is RTL and data part3 is numeric, data part 2 appears in the position where data part 3 should be and also causes data part 3 to be printed RTL.
If data part 3 is non-numeric (ie a word such as 'hello') this problem doesn't occur.
However, if I copy that file name and paste it into word it appears correctly?
Which implies that Windows explorer is not displaying the text correctly
I have tried using the POP DIRECTIONAL FORMATTING character but that hasn't made a difference.
Has Anyone else had this issue and does anyone have any ideas of getting around it?
string fileName = '\u200E'+ dataPart1+ '\u200E'+ dataPart2+ ... '\u200E'+ dataPart4+ extension
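The idea in the line above, placing a LEFT-TO-RIGHT MARK (U+200E) in front of every data part, can be written out as a small helper. This is a Python sketch under assumptions: the " - " separator and the dot before the extension are taken from the pattern in the question, the function name and sample parts are hypothetical, and whether Windows Explorer then displays the name as intended is not verified here.

```python
LRM = "\u200e"  # LEFT-TO-RIGHT MARK

def build_file_name(parts, extension, separator=" - "):
    """Prefix every part with an LRM so each one starts in a left-to-right context."""
    return separator.join(LRM + part for part in parts) + "." + extension

# Hypothetical data parts: the second one is right-to-left, the third is numeric.
name = build_file_name(["report", "שלום", "123", "final"], "txt")
print(name)
```

The marks are invisible, so code that later parses the file name has to strip U+200E before comparing parts.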

Export datatable to Excel asp.net : How to Format excel cells to text in Response.Write()? [duplicate]

Does anyone happen to know if there is a token I can add to my csv for a certain field so Excel doesn't try to convert it to a date?
I'm trying to write a .csv file from my application and one of the values happens to look enough like a date that Excel is automatically converting it from text to a date. I've tried putting all of my text fields (including the one that looks like a date) within double quotes, but that has no effect.
I have found that putting an '=' before the double quotes will accomplish what you want. It forces the data to be text.
eg. ="2008-10-03",="more text"
EDIT (according to other posts): because of the Excel 2007 bug noted by Jeffiekins one should use the solution proposed by Andrew: "=""2008-10-03"""
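For files generated by a program, the quoting in the EDIT above does not have to be assembled by hand. In this Python sketch, a helper (the name `excel_text` is made up) builds the ="..." formula and the standard `csv` module then adds the outer quotes and doubles the inner ones.

```python
import csv
import io

def excel_text(value):
    """Wrap a value as the formula ="value", doubling any embedded double quotes."""
    return '="' + value.replace('"', '""') + '"'

buffer = io.StringIO()
csv.writer(buffer).writerow([excel_text("2008-10-03"), "more text"])
line = buffer.getvalue()
print(line)
```

The first field comes out as "=""2008-10-03""", which is the form recommended in the EDIT. How a given Excel version then treats the field is up to Excel, as the other answers discuss.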
I know this is an old question, but the problem is not going away soon. CSV files are easy to generate from most programming languages, rather small, human-readable in a crunch with a plain text editor, and ubiquitous.
The problem is not only with dates in text fields, but anything numeric also gets converted from text to numbers. A couple of examples where this is problematic:
ZIP/postal codes
telephone numbers
government ID numbers
which sometimes can start with one or more zeroes (0), which get thrown away when converted to numeric. Or the value contains characters that can be confused with mathematical operators (as in dates: /, -).
Two cases I can think of where the "prepending =" solution mentioned previously might not be ideal are
where the file might be imported into a program other than MS Excel (MS Word's Mail Merge function comes to mind),
where human-readability might be important.
My hack to work around this
If one pre/appends a non-numeric and/or non-date character in the value, the value will be recognized as text and not converted. A non-printing character would be good as it will not alter the displayed value. However, the plain old space character (\s, ASCII 32) doesn't work for this as it gets chopped off by Excel and then the value still gets converted. But there are various other printing and non-printing space characters that will work well. The easiest however is to append (add after) the simple tab character (\t, ASCII 9).
Benefits of this approach:
Available from keyboard or with an easy-to-remember ASCII code (9),
It doesn't bother the importation,
Normally does not bother Mail Merge results (depending on the template layout, but normally it just adds a wide space at the end of a line). If this is a problem, however, look at other characters, e.g. the zero-width space (ZWSP, Unicode U+200B),
is not a big hindrance when viewing the CSV in Notepad (etc),
and could be removed by find/replace in Excel (or Notepad etc).
You don't need to import the CSV, but can simply double-click to open the CSV in Excel.
If there's a reason you don't want to use the tab, look in an Unicode table for something else suitable.
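When the CSV is produced by a script, the tab can be appended selectively. The Python sketch below is one way to do it; the regular expression that decides which values are "at risk" is a hypothetical rule chosen for illustration, not something Excel documents.

```python
import re

# Hypothetical rule: a value that starts with a digit and contains only digits
# and typical number/date punctuation is treated as likely to be converted.
LOOKS_NUMERIC = re.compile(r"^[0-9][0-9/.\-: ]*$")

def protect(value):
    """Append a tab to values that look like numbers or dates; leave others alone."""
    return value + "\t" if LOOKS_NUMERIC.match(value) else value

row = [protect(v) for v in ["00123", "2008-10-03", "hello"]]
print(row)
```

Appending the tab only where needed keeps the rest of the file unchanged for other consumers.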
Another option
might be to generate XML files, for which a certain format also is accepted for import by newer MS Excel versions, and which allows a lot more options similar to .XLS format, but I don't have experience with this.
So there are various options. Depending on your requirements/application, one might be better than another.
Addition
It needs to be said that newer versions (2013+) of MS Excel don't open the CSV in spreadsheet format any more, which is one more speed bump in one's workflow that makes Excel less useful. At least instructions exist for getting around it. See e.g. this Stackoverflow: How to correctly display .csv files within Excel 2013?
Working off of Jarod's solution and the issue brought up by Jeffiekins, you could modify
"May 16, 2011"
to
"=""May 16, 2011"""
I had a similar problem and this is the workaround that helped me without having to edit the csv file contents:
If you have the flexibility to name the file something other than ".csv", you can name it with a ".txt" extension, such as "Myfile.txt" or "Myfile.csv.txt". Then when you open it in Excel (not by drag and drop, but using File->Open or the Most Recently Used files list), Excel will provide you with a "Text Import Wizard".
In the first page of the wizard, choose "Delimited" for the file type.
In the second page of the wizard choose "," as the delimiter and also choose the text qualifier if you have surrounded your values by quotes
In the third page, select every column individually and assign each the type "Text" instead of "General" to stop Excel from messing with your data.
Hope this helps you or someone with a similar problem!
2018
The only proper solution that worked for me (and also without modifying the CSV).
Excel 2010:
Create new workbook
Data > From Text > Select your CSV file
In the popup, choose "Delimited" radio button, then click "Next >"
Delimiters checkboxes: tick only "Comma" and uncheck the other options, then click "Next >"
In the "Data preview", scroll to the far right, then hold shift and click on the last column (this will select all columns). Now in the "Column data format" select the radio button "Text", then click "Finish".
Excel office365: (client version)
Create new workbook
Data > From Text/CSV > Select your CSV file
Data type detection > do not detect
Note: Excel office365 (web version), as I'm writing this, you will not be able to do that.
WARNING: Excel '07 (at least) has a(nother) bug: if there's a comma in the contents of a field, it doesn't parse the ="field, contents" correctly, but rather puts everything after the comma into the following field, regardless of the quotation marks.
The only workaround I've found that works is to eliminate the = when the field contents include a comma.
This may mean that there are some fields that are impossible to represent exactly "right" in Excel, but by now I trust no-one is too surprised.
While creating the string to be written to my CSV file in C# I had to format it this way:
"=\"" + myVariable + "\""
In Excel 2010 open a new sheet.
On the Data ribbon click "Get External Data From Text".
Select your CSV file then click "Open".
Click "Next".
Uncheck "Tab", place a check mark next to "Comma", then click "Next".
Click anywhere on the first column.
While holding the shift key drag the slider across until you can click in the last column, then release the shift key.
Click the "text" radio button then click "Finish"
All columns will be imported as text, just as they were in the CSV file.
Still an issue in Microsoft Office 2016 release, rather disturbing for those of us working with gene names such as MARC1, MARCH1, SEPT1 etc.
The solution I've found to be the most practical after generating a ".csv" file in R, that will then be opened/shared with Excel users:
Open the CSV file as text (notepad)
Copy it (ctrl+a, ctrl+c).
Paste it in a new excel sheet -it will all paste in one column as long text strings.
Choose/select this column.
Go to Data- "Text to columns...", on the window opened choose "delimited" (next). Check that "comma" is marked (marking it will already show the separation of the data to columns below) (next), in this window you can choose the column you want and mark it as text (instead of general) (Finish).
HTH
Here is the simple method we use at work here when generating the csv file in the first place, it does change the values a bit so it is not suitable in all applications:
Prepend a space to all values in the csv
This space will get stripped off by excel from numbers such as " 1"," 2.3" and " -2.9e4" but will remain on dates like " 01/10/1993" and booleans like " TRUE", stopping them being converted into excel's internal data types.
It also stops double quotes being zapped on read in, so a foolproof way of making text in a csv remain unchanged by excel EVEN IF is some text like "3.1415" is to surround it with double quotes AND prepend the whole string with a space, i.e. (using single quotes to show what you would type) ' "3.1415"'. Then in excel you always have the original string, except it is surrounded by double quotes and prepended by a space so you need to account for those in any formulas etc.
(Assuming Excel 2003...)
In Step 3 of the Text-to-Columns Wizard you can dictate the data type for each of the columns. Click on the column in the preview and change the misbehaving column from "General" to "Text."
This is the only way I know to accomplish this without messing with the file itself. As usual with Excel, I learned this by beating my head on the desk for hours.
Change the .csv file extension to .txt; this will stop Excel from auto-converting the file when it's opened. Here's how I do it: open Excel to a blank worksheet, close the blank sheet, then File => Open and choose your file with the .txt extension. This forces Excel to open the "Text Import Wizard", where it asks you questions about how you want it to interpret the file. First you choose your delimiter (comma, tab, etc.), then (here's the important part) you select a set of columns and choose the formatting. If you want exactly what's in the file, choose "Text" and Excel will display just what's between the delimiters.
(EXCEL 2007 and later)
How to force excel not to "detect" date formats without editing the source file
Either:
rename the file as .txt
If you can't do that, instead of opening the CSV file directly in excel, create a new workbook then go to
Data > Get external data > From Text and select your CSV.
Either way, you will be presented with import options, simply select each column containing dates and tell excel to format as "text" not "general".
What I have done for this same problem was to add the following before each csv value:
"="""
and one double quote after each CSV value, before opening the file in Excel. Take the following values for example:
012345,00198475
These should be altered before opening in Excel to:
"="""012345","="""00198475"
After you do this, every cell value appears as a formula in Excel and so won't be formatted as a number, date, etc. For example, a value of 012345 appears as:
="012345"
None of the solutions offered here is a good solution. It may work for individual cases, but only if you're in control of the final display. Take my example: my work produces list of products they sell to retail. This is in CSV format and contain part-codes, some of them start with zero's, set by manufacturers (not under our control). Take away the leading zeroes and you may actually match another product.
Retail customers want the list in CSV format because of back-end processing programs, that are also out of our control and different per customer, so we cannot change the format of the CSV files. No prefixed'=', nor added tabs. The data in the raw CSV files is correct; it's when customers open those files in Excel the problems start. And many customers are not really computer savvy. They can just about open and save an email attachment.
We are thinking of providing the data in two slightly different formats: one as Excel Friendly (using the options suggested above by adding a TAB, the other one as the 'master'. But this may be wishful thinking as some customers will not understand why we need to do this. Meanwhile we continue to keep explaining why they sometimes see 'wrong' data in their spreadsheets.
Until Microsoft makes a proper change I see no proper resolution to this, as long as one has no control over how end-users use the files.
I have just this week come across this convention, which seems to be an excellent approach, but I cannot find it referenced anywhere. Is anyone familiar with it? Can you cite a source for it? I have not looked for hours and hours but am hoping someone will recognize this approach.
Example 1: =("012345678905") displays as 012345678905
Example 2: =("1954-12-12") displays as 1954-12-12, not 12/12/1954.
Hi I have the same issue,
I wrote this VBScript to create another CSV file. The new CSV file has a space in front of each field, so Excel will treat it as text.
Create a .vbs file with the code below (for example Modify_CSV.vbs), save it and close it. Drag and drop your original file onto the VBScript file. It will create a new file with "_SPACE_ADDED" appended to the file name in the same location.
Set objArgs = WScript.Arguments
Set objFso = createobject("scripting.filesystemobject")
dim objTextFile
dim arrStr ' an array to hold the text content
dim sLine ' holding text to write to new file
'Looping through all dropped file
For t = 0 to objArgs.Count - 1
' Input Path
inPath = objFso.GetFile(wscript.arguments.item(t))
' OutPut Path
outPath = replace(inPath, objFso.GetFileName(inPath), left(objFso.GetFileName(inPath), InStrRev(objFso.GetFileName(inPath),".") - 1) & "_SPACE_ADDED.csv")
' Read the file
set objTextFile = objFso.OpenTextFile(inPath)
'Now Creating the file can overwrite exiting file
set aNewFile = objFso.CreateTextFile(outPath, True)
aNewFile.Close
'Open the file to appending data
set aNewFile = objFso.OpenTextFile(outPath, 8) '2=Open for writing 8 for appending
' Reading data and writing it to new file
Do while NOT objTextFile.AtEndOfStream
arrStr = split(objTextFile.ReadLine,",")
sLine = "" 'Clear previous data
For i=lbound(arrStr) to ubound(arrStr)
sLine = sLine + " " + arrStr(i) + ","
Next
'Writing data to new file
aNewFile.WriteLine left(sLine, len(sLine)-1) 'Get rid of that extra comma from the loop
Loop
'Closing new file
aNewFile.Close
Next ' This is for next file
set aNewFile=nothing
set objFso = nothing
set objArgs = nothing
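The script above splits each line on every comma, so a quoted field that itself contains a comma would be cut in two. As a hedged alternative, here is a Python sketch of the same "prepend a space" idea that relies on the standard `csv` module for parsing and quoting; the function name is made up, and it works on text in memory rather than on dropped files.

```python
import csv
import io

def add_leading_space(csv_text, delimiter=","):
    """Return csv_text with one space prepended to every field."""
    source = csv.reader(io.StringIO(csv_text, newline=""), delimiter=delimiter)
    target = io.StringIO(newline="")
    writer = csv.writer(target, delimiter=delimiter)
    for row in source:
        writer.writerow([" " + field for field in row])
    return target.getvalue()

result = add_leading_space('012345,"Smith, John",1/2\r\n')
print(result)
```

Reading the result back with `csv.reader` yields the three fields " 012345", " Smith, John" and " 1/2", so the embedded comma stays inside its field.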
It's not Excel. Windows recognizes the formula or the data as a date and autocorrects it. You have to change the Windows settings.
"Control Panel" (-> "Switch to Classic View") -> "Regional and Language
Options" -> tab "Regional Options" -> "Customize..." -> tab "Numbers" -> And
then change the symbols according to what you want.
http://www.pcreview.co.uk/forums/enable-disable-auto-convert-number-date-t3791902.html
It will work on your computer, if these settings are not changed for example on your customers' computer they will see dates instead of data.
Without modifying your csv file you can:
Change the excel Format Cells option to "text"
Then using the "Text Import Wizard" to define the csv cells.
Once imported delete that data
then just paste as plain text
excel will properly format and separate your csv cells as text formatted ignoring auto date formats.
Kind of a silly workaround, but it beats modifying the CSV data before importing. Andy Baird and Richard sort of alluded to this method, but missed a couple of important steps.
In my case, "Sept8" in a csv file generated using R was converted into "8-Sept" by Excel 2013. The problem was solved by using write.xlsx2() function in the xlsx package to generate the output file in xlsx format, which can be loaded by Excel without unwanted conversion. So, if you are given a csv file, you can try loading it into R and converting it into xlsx using the write.xlsx2() function.
EASIEST SOLUTION
I just figured this out today.
Open in Word
Replace all hyphens with en dashes
Save and Close
Open in Excel
Once you are done editing, you can always open it back up in Word again to replace the en dashes with hyphens again.
A workaround using Google Drive (or Numbers if you're on a Mac):
Open the data in Excel
Set the format of the column with incorrect data to Text (Format > Cells > Number > Text)
Load the .csv into Google Drive, and open it with Google Sheets
Copy the offending column
Paste column into Excel as Text (Edit > Paste Special > Text)
Alternatively if you're on a Mac for step 3 you can open the data in Numbers.
(EXCEL 2016 and later, actually I have not tried in older versions)
Open new blank page
Go to tab "Data"
Click "From Text/CSV" and choose your csv file
Check in preview whether your data is correct.
If some column has been converted to a date, click "Edit" and then select the type Text by clicking on the calendar icon in the column header
Click "Close & Load"
If someone still looking for answer, the line below worked perfectly for me
I entered =("my_value").
i.e. =("04SEP2009") displayed as 04SEP2009 not as 09/04/2009
The same worked for integers more than 15 digits. They weren't getting trimmed anymore.
If you can change the file source data
If you're prepared to alter the original source CSV file, another option is to change the 'delimiter' in the data. If your data is '4/11' (or 4-11) and Excel converts this to 4/11/2021 (UK) or 11-4-2021 (US), then changing the '/' or '-' character to something else will thwart the unwanted Excel date conversion. Options may include:
Tilde ('~')
Plus ('+')
Underscore ('_')
Double-dash ('--')
En-dash (Alt 150)
Em-dash (Alt 151)
(Some other character!)
Note: moving to Unicode or other non-ascii/ansi characters may complicate matters if the file is to be used elsewhere.
So, '4-11' converted to '4~11' with a tilde will NOT be treated as a date!
For large CSV files, this has no additional overhead (ie: extra quotes/spaces/tabs/formula constructs) and just works when the file is opened directly (ie: double-clicking the CSV to open) and avoids pre-formatting columns as text or 'importing' the CSV file as text.
A search/replace in Notepad (or similar tool) can easily convert to/from the alternative delimiter, if necessary.
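The search/replace can also be scripted. The Python sketch below swaps only a '/' or '-' that sits between two digits, so ordinary hyphenated words and negative numbers are left alone; the function name and that "between digits" rule are assumptions made for illustration.

```python
import re

def swap_date_separator(value, replacement="~"):
    """Replace '/' or '-' that sits between two digits, leaving other text alone."""
    return re.sub(r"(?<=\d)[/-](?=\d)", replacement, value)

samples = [swap_date_separator(v) for v in ["4-11", "4/11", "part-A", "-5"]]
print(samples)
```

Note that the mapping is only reversible if the replacement character never occurs in the original data.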
Import the original data
In newer versions of Excel you can import the data (outlined in other answers).
In older versions of Excel, you can install the 'Power Query' add-in. This tool can also import CSVs without conversion. Choose: Power Query tab/From file/From Text-CSV, then 'Load' to open as a table. (You can choose 'do not detect data types' from the 'data type detection' options).
I know this is an old thread. Those who, like me, still have this problem with Office 2013 driven through a PowerShell COM object can use the opentext method. The problem is that this method has many arguments, some of which are mutually exclusive. To resolve this issue you can use the invoke-namedparameter method introduced in this post.
An example would be
$ex = New-Object -com "Excel.Application"
$ex.visible = $true
$csv = "path\to\your\csv.csv"
Invoke-NamedParameter ($ex.workbooks) "opentext" @{"filename"=$csv; "Semicolon"= $true}
Unfortunately I just discovered that this method somehow breaks the CSV parsing when cells contain line breaks. This is supported by CSV but Microsoft's implementation seems to be bugged.
Also it did somehow not detect German-specific chars. Giving it the correct culture did not change this behaviour. All files (CSV and script) are saved with utf8 encoding.
First I wrote the following code to insert the CSV cell by cell.
$ex = New-Object -com "Excel.Application"
$ex.visible = $true;
$csv = "path\to\your\csv.csv";
$ex.workbooks.add();
$ex.activeWorkbook.activeSheet.Cells.NumberFormat = "@";
$data = import-csv $csv -encoding utf8 -delimiter ";";
$row = 1;
$data | %{ $obj = $_; $col = 1; $_.psobject.properties.Name |%{if($row -eq 1){$ex.ActiveWorkbook.activeSheet.Cells.item($row,$col).Value2= $_ };$ex.ActiveWorkbook.activeSheet.Cells.item($row+1,$col).Value2 =$obj.$_; $col++ }; $row++;}
But this is extremely slow, which is why I looked for an alternative. Apparently, Excel allows you to set the values of a range of cells with a matrix. So I used the algorithm in this blog to transform the CSV in a multiarray.
function csvToExcel($csv,$delimiter){
$a = New-Object -com "Excel.Application"
$a.visible = $true
$a.workbooks.add()
$a.activeWorkbook.activeSheet.Cells.NumberFormat = "@"
$data = import-csv -delimiter $delimiter $csv;
$array = ($data |ConvertTo-MultiArray).Value
$starta = [int][char]'a' - 1
if ($array.GetLength(1) -gt 26) {
$col = [char]([int][math]::Floor($array.GetLength(1)/26) + $starta) + [char](($array.GetLength(1)%26) + $Starta)
} else {
$col = [char]($array.GetLength(1) + $starta)
}
$range = $a.activeWorkbook.activeSheet.Range("a1:"+$col+""+$array.GetLength(0))
$range.value2 = $array;
$range.Columns.AutoFit();
$range.Rows.AutoFit();
$range.Cells.HorizontalAlignment = -4131
$range.Cells.VerticalAlignment = -4160
}
function ConvertTo-MultiArray {
param(
[Parameter(Mandatory=$true, Position=1, ValueFromPipeline=$true)]
[PSObject[]]$InputObject
)
BEGIN {
$objects = @()
[ref]$array = [ref]$null
}
Process {
$objects += $InputObject
}
END {
$properties = $objects[0].psobject.properties |%{$_.name}
$array.Value = New-Object 'object[,]' ($objects.Count+1),$properties.count
# i = row and j = column
$j = 0
$properties |%{
$array.Value[0,$j] = $_.tostring()
$j++
}
$i = 1
$objects |% {
$item = $_
$j = 0
$properties | % {
if ($item.($_) -eq $null) {
$array.value[$i,$j] = ""
}
else {
$array.value[$i,$j] = $item.($_).tostring()
}
$j++
}
$i++
}
$array
}
}
csvToExcel "storage_stats.csv" ";"
You can use above code as is; it should convert any CSV into Excel. Just change the path to the CSV and the delimiter character at the bottom.
Okay found a simple way to do this in Excel 2003 through 2007. Open a blank xls workbook. Then go to Data menu, import external data. Select your csv file. Go through the wizard and then in "column data format" select any column that needs to be forced to "text". This will import that entire column as a text format preventing Excel from trying to treat any specific cells as a date.
This issue is still present in Mac Office 2011 and Office 2013, I cannot prevent it happening. It seems such a basic thing.
In my case I had values such as "1 - 2" & "7 - 12" within the CSV enclosed correctly within inverted commas, this automatically converts to a date within excel, if you try subsequently convert it to just plain text you would get a number representation of the date such as 43768. Additionally it reformats large numbers found in barcodes and EAN numbers to 123E+ numbers again which cannot be converted back.
I have found that Google Drive's Google Sheets doesn't convert the numbers to dates. The barcodes do have commas in them every 3 characters, but these are easily removed. It handles CSVs really well, especially when dealing with Mac / Windows CSVs.
Might save someone sometime.
I do this for credit card numbers which keep converting to scientific notation: I end up importing my .csv into Google Sheets. The import options now allow to disable automatic formatting of numeric values. I set any sensitive columns to Plain Text and download as xlsx.
It's a terrible workflow, but at least my values are left the way they should be.
I made this VBA macro which basically formats the output range as text before pasting the numbers. It works perfectly for me when I want to paste values such as 8/11, 23/6, 1/3, etc. without Excel interpreting them as dates.
Sub PasteAsText()
' Created by Lars-Erik Sørbotten, 2017-09-17
Call CreateSheetBackup
Columns(ActiveCell.Column).NumberFormat = "@"
Dim DataObj As MSForms.DataObject
Set DataObj = New MSForms.DataObject
DataObj.GetFromClipboard
ActiveCell.PasteSpecial
End Sub
I'm very interested in knowing if this works for other people as well. I've been looking for a solution to this problem for a while, but I haven't seen a quick VBA solution to it that didn't include inserting ' in front of the input text. This code retains the data in its original form.

C# not finding space in string copied from Excel

Help! I have a list of records in Excel that I'm copying/pasting into an ASP.NET web page. From there, the C# code parses the records.
This code below works for one of the names, but not another. If, however, I copy/replace the empty space in Excel with a typed space or if I actually backspace and type the name into the webpage with the keyboard, it does work.
It's as if Excel has some odd ghost character in the file I was given for the space on this record. I've pasted in Notepad++ and showed all characters, and I don't see anything special here that's different among the records.
This one works and detects the spaces: Carolyn Bentivegna
This one does not: Allan D. Blake
if (fullName.IndexOf(" ") > -1)
Try the tabspace:
if (fullName.IndexOf("\t") > -1)
Cells copied via excel are separated by a TabSpace and Rows are separated via newlines and carriage return.
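When it is unclear which "ghost" character sits between the words, listing the code points removes the guesswork. This is a Python sketch of that diagnostic; the sample string is made up (it is not the asker's data) and simply contains a no-break space and a tab where ordinary spaces would be expected.

```python
import unicodedata

def describe(text):
    """List (code point, Unicode name) for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<no name>")) for ch in text]

def has_any_whitespace(text):
    """True if text contains any whitespace character, not only U+0020."""
    return any(ch.isspace() for ch in text)

suspect = "Allan\u00a0D.\tBlake"  # made-up input with no ordinary space in it

for code, name in describe(suspect):
    print(code, name)
print(" " in suspect, has_any_whitespace(suspect))
```

A check for any whitespace character, rather than for the literal " ", covers tabs, no-break spaces and other separators in one go.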

Why am I getting "�" characters?

I've written a quick-and-dirty utility to parse a text file, but in some cases it's writing out a "�" character. My utility reads from a .txt file which contains "records" in this format:
Biography
Title:George F. Kennan: An American Life
Author:John Lewis Gaddis
Kindle: B0054TVO1G
Hardcover: B007R93I1U
Paperback: 0143122150
Image link: <img src="http://images.amazon.com/images/P/B0054TVO1G.01.MZZZZZZZ.jpg" alt="Book Cover" />
...and writes out lines from that to a CSV file such as:
Biography,"George F. Kennan: An American Life","John Lewis Gaddis",B0054TVO1G,B007R93I1U,0143122150,<img src="http://images.amazon.com/images/P/B0054TVO1G.01.MZZZZZZZ.jpg" alt="Book Cover" />
...but in several cases, as mentioned, that weird character is appending itself to an author's name. In most cases where this is happening, it's what appears to be a space character in the .txt file. I'm trimming the author's name prior to writing it out to the CSV file, so it's obviously not being seen as a space, though.
When I save the text file with these characters, I get the message about non-unicode characters, etc.
What could be the cause of that? And better yet, how can I delete them with a search and replace operation? In Notepad, they are not found, so I have to delete them one-by-one.
Prior to being in the .txt file, this data was in an Open Office/.odt file, if that means anything to anyone.
BTW, I have no idea how that "stackoverflow" got into the href above; it's not in the original text I pasted in...
UPDATE
I am curious how that character got in my files. I sure didn't put it there (deliberately), any more than I added the "stackoverflow" to the URL above. Could it be that a call to Environment.Newline would add that?
Here was my process:
1) Copy and paste info from the interwebs into an Open Office/.odt file
2) Copy and past that into a text (Notepad) file
3) Open that text file programmatically and loop through it, writing to a new "csv"/.txt file.
UPDATE 2
Silly me - all I had to do was save the file (which wouldn't save those weird characters), then open it again. IOW, when I opened it today (at home, after work) those were gone.
UPDATE 3
I wrote too soon - it replaced the weird character with a question mark (a "normal" one, not a stylized one).
They are almost certainly non-breaking spaces, U+00A0 (although there are other fixed-width space characters which are also possible.) These won't be trimmed as spaces, but will be rendered as spaces if the encoding of the file matches the encoding of the output device.
My guess is that your text file is in CP-1252 (i.e., Windows default one-byte coding) but your output is being rendered as though it were UTF-8.
Normally you would type these characters as AltGr+Space. You might try that with Notepad, but no guarantees.
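The encoding mismatch guessed at above is easy to reproduce. The Python sketch below assumes the author's name is followed by a single 0xA0 byte, as it would be in a CP-1252 file containing a no-break space; the sample bytes are made up for illustration.

```python
raw = b"John Lewis Gaddis\xa0"  # 0xA0 is a no-break space in CP-1252

# Decoded with the matching encoding, the byte becomes U+00A0.
as_cp1252 = raw.decode("cp1252")

# Decoded as UTF-8, a lone 0xA0 is invalid and is shown as U+FFFD, the "�" character.
as_utf8 = raw.decode("utf-8", errors="replace")

# One possible cleanup: turn no-break spaces into ordinary spaces, then trim.
cleaned = as_cp1252.replace("\u00a0", " ").strip()

print(repr(as_cp1252), repr(as_utf8), repr(cleaned))
```

Reading the file with the encoding it was actually saved in, and normalising U+00A0 before trimming, avoids both the "�" and the later "?" substitution.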
