Parsing address "label" fields in Excel, C#, VBA, other?

Someone's sent me a Word file full of address labels separated by tabs.
I'm trying to figure out the best way to import the addresses into individual records. Probably just go with NameLine, Address1, Address2 for each one (3 fields that I can parse later).
What can I do easily with C# or VBA? Or UltraEdit?

I like Excel for things like this. Just copy the text from Word, paste it into Excel, and use the text import wizard with a tab delimiter, making sure to treat consecutive delimiters as one.
Excel can even parse it for you:
Cut and paste the columns so that it's just one long column with all the addresses. (Let's say column A)
Assuming each address record is 3 lines long, we want to get that into a format with three columns: Name, Address1, Address2.
In Cell B1, create formula =A1.
In Cell C1, create formula =A2.
In Cell D1, create formula =A3.
Select cells B1 through D3, or D4 if you have blank lines between each address record.
Copy.
Go to cell B4, or B5 if there are blank lines between each address record.
Press CTRL+SHIFT+END to extend the selection to the end of the data (basically, cells B4:DXX, or B5:DXX with blank lines, should be selected).
Paste.
Insert a new row at the top with your desired field names.
Example result:
Afterwards, you can copy the results into a new worksheet (sans formulae, so it'll just be static text), format the data however you want it, and sort the data to remove those pesky blank lines.
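If you would rather script the same reshaping than do it with formulas, here is a minimal Python sketch of the idea. It assumes the addresses have already been flattened into one list of lines (the equivalent of column A) and that every record is exactly three non-blank lines. The function name group_records and the sample addresses are made up for illustration.

```python
def group_records(lines, size=3):
    """Group non-blank lines into fixed-size records (Name, Address1, Address2)."""
    # Drop blank separator lines first, so it does not matter whether
    # the records are separated by an empty line or not.
    cleaned = [line.strip() for line in lines if line.strip()]
    return [tuple(cleaned[i:i + size]) for i in range(0, len(cleaned), size)]


# Hypothetical sample data standing in for column A.
column_a = [
    "Jane Doe", "1 Example Street", "Springfield, ST 00000", "",
    "John Roe", "2 Sample Road", "Shelbyville, ST 00001", "",
]

records = group_records(column_a)
for name, address1, address2 in records:
    print(name, address1, address2, sep=" | ")
```

If the last record is incomplete, the final tuple will simply be shorter than three items, so check for that before unpacking real data.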

If all the tabs line up in Word, you should be able to Alt-Select to select individual columns, then cut & paste them into one sequential column so you just get one contiguous file of Address1,Address2,Address3,BlankLine, which should then be trivial to parse.

Related

Converting consecutive and repeated questions in an excel column into a table

I have an Excel spreadsheet with repeated questions down a single column. It is over 3000 rows deep. The data is confidential so I can only provide the example image below:
I wish to take this data and group the answers into a table so it is suitable for export into an XML file. Example Result:
I have some educational experience with C# and stream read/write, but I'm sure there would be a VBA macro that could perform this much quicker. I have discovered the data is inconsistent because unanswered questions do not have a blank cell beneath them. This happened when I converted the data from headings and text in Word.
This is my first post; some pointers on how to write a better question would be appreciated if necessary.
I can think of a rather manual solution that avoids writing a VBA macro. You'd carry out the following steps and then, once you're confident in the outcome (and in the logic for identifying outliers), record a macro for ease of use later.
Use column B to "mark" each "start", probably by finding the string "Reference Code". For B1 that would be: =IF(TRIM(A1)="Reference Code","#","") Fill this to the end of column B, and add a "#" in the row just after the last data row of column B for good measure.
For column C, concatenate the values while B is blank and reset when B has "#". C1 is simply =A1, and C2 would be: =IF(B2="#",A2,C1&"§"&A2) Fill this to the end of column C. Now column C keeps concatenating values until it reaches a new "Reference Code" identified by "#".
For column D, "mark" the last line of every set; this is the only line holding the complete string, and all the lines before it are incomplete. D1 would be: =IF(B2="#","*","") Fill this to the end of column D.
Copy column C & D and Paste Values. This locks in the concatenated strings, and the end markers.
Sort by Column D, and delete any rows without the end marker "*". Then delete column D altogether
Using "Data">"Text to Columns", split Column C up. You will use the "Delimited" setting, select "Other" and use "§" as the delimiter (it was used in Step 2).
Up to this point every step is mechanical, so you might like to record a macro with all the previous steps to run them again if you need to.
Now the columns C to L contain the relevant data, alternating between the keys "Reference Code", "Question 1", etc. and the relevant value data in between. Now sorting by each field name column (C, E, G, I, K) you should expect to always find the keyword; working from left to right, any column that does not have the correct key consistently has incorrect data. Shift the data in such columns/rows to the right until they fit into place. This effectively takes out the missing values.
Delete the field name columns, and you've got yourself a full set of data with empty values reflected.
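For comparison, the same grouping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column has been read into a list of strings, the set of field names is known in advance, and no answer is literally identical to a field name. The function name build_records, the field names other than "Reference Code", and the sample values are all hypothetical.

```python
def build_records(cells, keys, start_key="Reference Code"):
    """Turn an alternating key/value column into one dict per record.

    A key that is immediately followed by another key (an unanswered
    question with no blank cell beneath it) keeps an empty value.
    """
    records = []
    current = None      # the record being filled
    pending_key = None  # the key still waiting for its value
    for cell in cells:
        text = cell.strip()
        if not text:
            continue
        if text == start_key:
            # A new "Reference Code" cell starts a new record.
            current = {key: "" for key in keys}
            records.append(current)
        if text in keys:
            pending_key = text if current is not None else None
        elif pending_key is not None:
            current[pending_key] = text
            pending_key = None
    return records


# Hypothetical sample column; the second record skips "Question 1".
keys = ["Reference Code", "Question 1", "Question 2"]
column = [
    "Reference Code", "R001", "Question 1", "Yes", "Question 2", "No",
    "Reference Code", "R002", "Question 1", "Question 2", "Maybe",
]
table = build_records(column, keys)
for row in table:
    print(row)
```

Each dict has every field name as a key, so the missing answers show up as empty strings, which is the shape you want before exporting to XML.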

Insert columns from one Excel book to another in C#

I need to take values from one sheet in one Excel workbook and insert them into another existing workbook.
The values I need to take are the first 6 columns of the first file:
And I want to insert them at the beginning of another book, like so:
I've been using Spire.Xls to read values from the first sheet, and I thought I could just do the same: parse the worksheet, read the values, and paste them into the other sheet. That doesn't work, because three of the columns I want to copy have the same header, "Descripcion", so my parser only takes the values from the first "Descripcion" column and skips the other ones.
Is there any way, using Excel.Interop or maybe Spire itself, to copy and paste entire columns between workbooks? Or alternatively, is there any way to get all 3 "Descripcion" values (without rewriting the titles of the columns)?
VSTO might be helpful. I've done similar tasks in C#/VSTO.
Perhaps read through: Simple Example of VSTO Excel using a worksheet as a datasource

Excel RTD multiple cells

I did a single cell subscription, so when I put the formula into the cell, it updates it correctly.
Now, I'm returning an object with multiple values and I want to display all of them in Excel cells. Is it possible to put a formula only in A1, subscribe once, get all values at once, and then distribute the information from the one object to A1, B1, C1...? Or is the only way to subscribe individually to each field and put an RTD formula in every cell?
I came up with a workaround using a VBA function. Create your Excel sheet, make column headings that match the fields you need, put a formula in your A1 cell, and run the VBA function.
The function is just a for loop over all columns in the Range (the number of rows is up to you). For each column it gets the column header value and does your magic; the rest is simple string manipulation: get the formula, convert it to a string, and replace $C1 with $D1, and so on.
Example:
"=RTD("ProgId", , "Your arbitrary parameter here", $C1)"
Wouldn't say it is the fastest way, but it is a good solution
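The string manipulation part is easy to get wrong, so here is a minimal Python sketch of just that step: take the formula from the first cell and produce one copy per column with the cell reference shifted along. It does not talk to Excel or to an RTD server. The helper names column_letter and spread_formula are made up for illustration, and the template is the example formula above.

```python
def column_letter(index):
    """Convert a 1-based column index to its Excel letter (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def spread_formula(template, placeholder, first_col, count, row=1):
    """Return `count` copies of `template`, replacing `placeholder` with
    $<column><row> for consecutive columns starting at `first_col`."""
    return [
        template.replace(placeholder, "${}{}".format(column_letter(first_col + i), row))
        for i in range(count)
    ]


template = '=RTD("ProgId", , "Your arbitrary parameter here", $C1)'
formulas = spread_formula(template, "$C1", first_col=3, count=3)
for formula in formulas:
    print(formula)
```

A plain string replace like this assumes the placeholder appears only where you intend it; if "$C1" could also occur inside a quoted parameter, you would need something stricter.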

Create a non-editable attribute in Excel cells that is still readable by a C# program

I have a very particular problem. I looked for problems similar to mine and tested a lot of the solutions people proposed, but none of them is what I need.
My client needs to export data sheets in Excel format. The data can be sorted, modified, and rearranged; new values can be entered; some lines may disappear and others can take their place. In short, anything can happen to the data. For the sake of example, let's say that we export a list of items from a grocery list.
ItemID  ItemName  Price
Fr01    Apple     2.5
Fr02    Orange    4.0
Mt01    Beef      10.0
Mt02    Pork      8.33
Vg01    Carrot    1.25
My problem is that this data can be imported back into the software that originally created the Excel file, to update (or add) these values in the database based on the "ItemID". I already validate that the data is "correct" in value, type, and interrelationships.
I tried giving a name to the range. The problem is that when the data is filtered or sorted, the name doesn't follow the content; it stays at the same position.
Original (Range Name is the name of the range, not an actual column):
ItemID  ItemName  Price  ||  Range Name
Fr01    Apple     2.5    ||  data_fr01
Fr02    Orange    4.0    ||  data_fr02
Mt01    Beef      10.0   ||  data_mt01
Mt02    Pork      8.33   ||  data_mt02
Vg01    Carrot    1.25   ||  data_vg01
After sorting on ItemName:
ItemID  ItemName  Price  ||  Range Name
Fr01    Apple     2.5    ||  data_fr01
Mt01    Beef      10.0   ||  data_fr02
Vg01    Carrot    1.25   ||  data_mt01
Fr02    Orange    4.0    ||  data_mt02
Mt02    Pork      8.33   ||  data_vg01
As you can see, all the info follows correctly except the Range Name, so when I try to import, I get a lot of data mismatches.
My other attempt was to make the range name an actual cell in Excel. With this method the cell follows the data, but it can be changed, so I tried to make it a protected cell. Sadly, lines can't be inserted or deleted because of that. I found a workaround that consists of keeping the names in a hidden sheet, but once again I need to synchronize the sheets, which is not reliable for the same reasons mentioned previously.
Even worse, I must support both xls (97-2003) and xlsx.
So I'm looking for a stable workaround that will let me store my "range name" data in the cell somehow, invisible to the Excel user, but following the data so I can retrieve it at the right place when re-importing.
Thanks in advance.
EDIT :
In the end, I must be able to write this property from a C# application and then read it back with C#. It must be compatible with both Excel file formats, be neither viewable nor editable by the Excel user, and keep its original value whatever happens to the data within the sheet, except deletion (I don't mind if it is only set on the cell I wrote Apple in and not on the entire range).
OK (I still think it's better to add validation intelligence to the worksheet when you export, but YMMV).
Try using the Range.ID string property. It's not editable or visible from the Excel UI, and it moves around with the cell. If the cell gets deleted, it disappears. If a cell gets copied, the ID property gets copied too, so there would be a duplicate.
It was introduced in Excel 2000 so probably won't work for Excel 97 but should be OK in all file formats for Excel 2000 to Excel 2013.
Here is some example VBA code:
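Sub putids()
    ' Store an ID string on each of the cells A1:A5
    Dim j As Long
    For j = 1 To 5
        Range("a1").Offset(j - 1, 0).ID = CStr(j)
    Next j
End Sub

Sub getids()
    ' Print the ID of each of the cells A1:A5 to the Immediate window
    Dim j As Long
    For j = 1 To 5
        Debug.Print Range("a1").Offset(j - 1, 0).ID
    Next j
End Sub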
I think you should use some key column, be it a unique name you've made up or a concatenation of the fields making up your data row. Whatever. Make that the leftmost column, hide it, and lock it so users can't show that column or change its contents.
Then, in another worksheet, take those same key values and paste them in starting at A2.
Now in B2 enter this formula
=VLOOKUP(<this row's key value>,<Your data array in sheet1>,<column number>,FALSE)
Here is an example of how to set the fixed column/row references:
=VLOOKUP($A2,BigNamedRange,B$1,FALSE)
Now hide that sheet.
What you now have in the first sheet is an area where your users can filter, sort, and do whatever they like, and in your second, controlled sheet you have the data in the order you want to see it (which can be changed independently of the user's sheet).
Edit:
Click on 1: Allow Users to Edit Ranges and set the range you want to let users edit.
Then, 2: click Protect Sheet/Protect Workbook (whichever you need) to lock everything else.
Now your users can edit what you let them and nothing else.
I don't see how named ranges help you. Have you thought of adding validation code to the workbook using the before-save event, so that the user cannot save data that is not valid? Or of seeing how much you can do using Excel's data validation rules? Otherwise you have to read all the data and validate it later at DB update time (which is basically too late). Presumably the basic validation is that the ItemID is valid; your DB code won't care what order the data is in, and can skip empty rows etc.
We ended up using a little of everyone's suggestions and merging them.
No simple, normal solution is viable in our context. The only property we could store something in (ID) isn't persistent. We also need the client not to destroy the value accidentally, given that anything may and will happen because there are few restrictions, and we can't lock part of the sheet without disabling line manipulation, which is a side effect of having a locked cell. The closest thing we were able to achieve was to insert our keys as a formatted string in column A, wrapped in a weird-looking formula that keeps it from being displayed, and then hide the column so the user can't reach it accidentally.
=IF(FALSE,"our formatted string","")
Since this hidden column has data, it follows its line when sorted. Copying an entire line won't work either, because the selection starts at column B (which causes an attempt to insert 256 values into 255 cells), so we can limit the "false duplicates" a little, even if they are not totally eliminated.
On the importer side, we read the key back with a little trick: we compare the formula with the value (since the value is empty, only the formula holds our formatted data), use a small regex to extract the meaning of our formatted string, and then run all our validations before the actual database import.
For the rest, it comes down to training the user not to "delete" the data in column A and not to go looking for it.
Thanks again to everyone.
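For anyone curious about the regex part, here is a minimal Python sketch of writing and reading such a hidden key. It only handles the formula text; getting the formula in and out of the workbook is left to whatever Excel library you use. The function names make_hidden_formula and read_hidden_key and the key data_fr01 are made up for illustration, and the doubled-quote handling assumes the usual Excel rule that a quote inside a string literal is written as two quotes.

```python
import re

# Matches =IF(FALSE,"<key>","") and captures <key>; a literal quote inside
# the key is expected to be doubled, as in Excel string literals.
HIDDEN_KEY = re.compile(r'^=IF\(FALSE,"((?:[^"]|"")*)",""\)$', re.IGNORECASE)


def make_hidden_formula(key):
    """Wrap a key in a formula that always displays an empty string."""
    return '=IF(FALSE,"{}","")'.format(key.replace('"', '""'))


def read_hidden_key(formula):
    """Return the key stored in the formula, or None if it does not match."""
    match = HIDDEN_KEY.match(formula.strip())
    if match is None:
        return None
    return match.group(1).replace('""', '"')


formula = make_hidden_formula("data_fr01")
print(formula)
print(read_hidden_key(formula))
print(read_hidden_key("=A1+B1"))
```

A cell whose formula no longer matches the pattern (for example because the user typed over it) comes back as None, which the importer can treat as a new or damaged line.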

Interop - Setting a range for an Excel chart to an entire row

How do I set the source data of an Excel interop chart to several entire rows?
I have a .csv file that is created by my program to display some results that are produced. For the sake of simplicity, let's say these results and the chart are displayed like this (which is exactly how I want it to be):
Now the problem I am having is that the number of people is variable, so I really need to access the data of the entire rows.
Right now, I am doing this:
var range = worksheet.get_Range("A1", "D3");
xlExcel.ActiveChart.SetSourceData(range);
This works great if you only have three persons, but I need to access the entire rows of data.
So to restate my question, how can I set the source data of my chart to several entire rows?
I tried looking here but couldn't seem to make that work with rows instead of columns.
var range = worksheet.get_Range("A1").CurrentRegion;
xlExcel.ActiveChart.SetSourceData(range);
EDIT: I am assuming that the cells in the data region won't be blank.
To test this,
1) place cursor on cell A1
2) press F5
3) click on "Special"
4) choose "Current Region" as option
5) click "OK"
This will select the filled cells surrounding A1, which I believe is what you are looking for.
In VBA code, that translates to the CurrentRegion property. I think that should work.
Check out Range.EntireRow. I'm not 100% sure how to expand that to a single range containing 3 entire rows, but it shouldn't be that difficult to accomplish.
Another thing you can do is scan to get the actual maximum column index you need (this assumes that there are guaranteed to be no gaps in the names), then use that index when you declare your range.
Added code:
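int c = 2; // start at column B
while (true)
{
    // Value2 is null for an empty cell, so convert it to a string before testing
    object value = ((Range)worksheet.Cells[1, c]).Value2;
    if (String.IsNullOrEmpty(Convert.ToString(value)))
    {
        c--; // step back to the last filled column
        break;
    }
    c++;
}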
Take a column from A to D that you're sure has no empty cells.
Loop to find the first empty cell in that column; the last data row is the one just before it.
Range Cell = Sheet.Range["A1"]; // or another column you're sure has no empty cells
int LineOffset = 0;
// An empty cell has a null Value2, so convert to a string before comparing
while (Convert.ToString((object)Cell.Offset[LineOffset, 0].Value2) != "")
{
    LineOffset++;
}
int LastLine = LineOffset; // offsets are 0-based and rows are 1-based, so this is the last filled row
Then you can get Sheet.Range[Sheet.Cells[1, 1], Sheet.Cells[LastLine, 4]]
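The off-by-one between 0-based offsets and 1-based rows is the easy thing to get wrong in this kind of scan, so here is the same logic as a small, Excel-free Python sketch. It assumes the cells of one column (or one row) have been read into a list in order, with None or an empty string for a blank cell. The function name last_filled_index and the sample values are made up for illustration.

```python
def last_filled_index(values):
    """Return the 1-based index of the last cell before the first empty one.

    Returns 0 when the very first cell is empty (or there are no cells).
    """
    count = 0
    for value in values:
        if value is None or str(value).strip() == "":
            break
        count += 1
    return count


# Hypothetical column A: a header, two people, then a blank cell.
column_a = ["Name", "Ann", "Bob", None, "stray value"]
last_row = last_filled_index(column_a)

# Build an A1-style address covering columns A to D down to the last row.
address = "A1:D{}".format(last_row)
print(address)
```

Because the count of filled cells is also the 1-based index of the last filled one, no extra subtraction is needed when building the address.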
This is a bit out of the box, but why not transpose the data? Use three columns for Name, Height, and Weight, and convert this from an ordinary range to a Table.
When any formula, including a chart's SERIES formula, references a column of a table, it always references that whole column, no matter how long the table gets. Add another person (another row) and the chart displays the data with the added person. Remove a few people, and the chart adjusts without leaving blanks at the end.
This is illustrated in my tutorial, Easy Dynamic Charts Using Lists or Tables.
