I am reading in a header from a file which has time fields, for example Time (UTC +1). I then need to compare this with a list of stored headers to work out whether the file is valid. However, my stored headers are also used for writing, so they allow flexibility on the time zone by being written like so: Time (UTC {0}).
I would like to know the best way of dealing with this in as flexible a way as possible. The only way I can imagine doing it is by getting the position of the { and only comparing up to that. This is fine in this circumstance, but what if I have some words after the parameter that are more important than a closing bracket?
EDIT: I would like to give some context to the problem so that I can explain better how flexible I need it. I think I possibly didn't emphasise the fact that I didn't want it to JUST work with the time field.
I am trying to write a system which is very flexible. I store a list of valid headings and then use them to find out what value to read from or write to the CSV file. It is very flexible and easily maintainable, and I want to keep it neat. I want to be able to write a function which takes in a string that has one or more parameters in it and compares it with a value which has had the parameters filled in (like the example with the Time header). In the future I may have a field for temperature in a particular place, so my stored heading would be Temperature in {0}({1}), which when I am reading back would be Temperature in Britain(c) or Temperature in America(f).
You could use a regex like this one:
// Matches a filled-in header such as "Time (UTC +1)"; the offset stands in for {0}.
string pattern = @"^Time \(UTC [+-]?\d+\)$";
Regex rgx = new Regex(pattern);
Regex has a Match method you can use to check whether any string matches the pattern you provided.
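To keep this working for any stored header rather than just the time field, the pattern can be built from the stored header itself. The following is only a minimal sketch: the class and method names (HeaderTemplate, ToRegex, Matches) are made up for illustration, and it assumes every {n} placeholder stands for at least one character.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HeaderTemplate
{
    // Turns a stored header such as "Temperature in {0}({1})" into an anchored
    // regex in which each {n} placeholder matches one or more characters.
    public static Regex ToRegex(string template)
    {
        // Escape regex metacharacters first; Regex.Escape turns "{0}" into "\{0}".
        string escaped = Regex.Escape(template);
        // Swap each escaped placeholder for a lazy capture group.
        string pattern = Regex.Replace(escaped, @"\\\{\d+}", "(.+?)");
        return new Regex("^" + pattern + "$");
    }

    // True when the header read from the file fits the stored template.
    public static bool Matches(string template, string header)
    {
        return ToRegex(template).IsMatch(header);
    }
}
```

With this, HeaderTemplate.Matches("Time (UTC {0})", "Time (UTC +1)") compares the whole header, so any words after the parameter are checked as well.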
Related
I'm developing quality management software that shows the errors of a provisioning tool.
To do this, I read all errors from an XML file and group them,
but there are errors like: "Forcedtime 1.08.2016 17:51:00 is in the past".
Is it possible to find date values like this in a string and delete them?
I can't work with a hard-coded replace because there are many different values for the date and time.
Thanks for helping me
Yet another suggestion is to read the XML as such and discard the nodes you don't like/need. Then process only the nodes left.
In fact, while a regex will do the trick for replacing something in the text, it might not be a good fit, since this is structured data as opposed to a bunch of data in string format.
(0?[1-9]|[12][0-9]|3[01])\.(0?[1-9]|1[0-2])\.(\d{4}) (00|[0-9]|1[0-9]|2[0-3]):([0-9]|[0-5][0-9]):([0-9]|[0-5][0-9])
It matches a date and time such as 1.08.2016 17:51:00.
Match this pattern against your string and, where it matches, replace the match.
It is worth learning regular expressions.
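As a rough sketch of how such a pattern could be applied with Regex.Replace: the class and method names below are hypothetical, and the pattern is a slightly tightened variant that assumes the d.M.yyyy H:mm:ss layout from the example message.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for grouping error messages that differ only by timestamp.
public static class ErrorNormalizer
{
    // Day.month.year hour:minute:second, with optional leading zeros
    // on day, month and hour (assumed layout, taken from the sample message).
    private static readonly Regex DateTimePattern = new Regex(
        @"\b(0?[1-9]|[12][0-9]|3[01])\.(0?[1-9]|1[0-2])\.\d{4} ([01]?[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]\b");

    public static string StripDateTime(string message)
    {
        // Remove every date-time occurrence.
        string withoutDate = DateTimePattern.Replace(message, "");
        // Collapse the double space left behind and trim the ends.
        return Regex.Replace(withoutDate, @" {2,}", " ").Trim();
    }
}
```

Messages that differ only in their timestamp then reduce to the same text and can be grouped together.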
I'm trying to write a parser to read values from a string based on a pattern. For example:
I need to write a method that accepts "12:04:03" and a pattern such as "{hh}:{mm}:{ss}" and parses it to return the corresponding portions ("12","04","03").
I'm not trying to actually parse time, it's just a practical example. The pattern groups can be hardcoded.
What I think I could do:
Parse the string with RegEx and then find the original content looping through the string.
While this would work, I think there might be a more efficient way or even pre-built solution in the Framework.
So, how can I solve this problem elegantly and efficiently?
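One possible approach, sketched here under assumptions, is to translate the pattern into a regex with named groups. PatternParser and Parse are invented names, and the sketch assumes that placeholder names are plain identifiers starting with a letter and that each placeholder stands for at least one character.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not a framework class.
public static class PatternParser
{
    // Returns the value captured for each {name} placeholder,
    // or null when the input does not fit the pattern.
    public static Dictionary<string, string> Parse(string input, string pattern)
    {
        // Escape the literal parts; Regex.Escape turns "{hh}" into "\{hh}".
        string escaped = Regex.Escape(pattern);
        // Turn each escaped placeholder into a lazy named group.
        string regex = "^" + Regex.Replace(escaped, @"\\\{(\w+)}", "(?<$1>.+?)") + "$";

        Match match = Regex.Match(input, regex);
        if (!match.Success)
        {
            return null;
        }

        // Collect the captured text for every placeholder in the pattern.
        var result = new Dictionary<string, string>();
        foreach (Match placeholder in Regex.Matches(pattern, @"\{(\w+)\}"))
        {
            string name = placeholder.Groups[1].Value;
            result[name] = match.Groups[name].Value;
        }
        return result;
    }
}
```

Calling PatternParser.Parse("12:04:03", "{hh}:{mm}:{ss}") yields the three portions keyed by hh, mm and ss.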
I have a lot of text data with different structures. I need to extract parts of these texts based on some text-based rules. I would use regular expressions, but unfortunately the people who are using the application have never heard of them.
Basically the app does the following thing:
Load the data into a textbox
Type the structure of the output as a simple set of rules into another textbox
Receive the results in a 3rd textbox
Examples of data structures (I have megabytes of this data):
Label1: value1, measurement
Label2; value2; something else
Nr, value3 (comment)
...
I need some other approach that I could use instead of regular expressions. It can be extremely simple because all I need is one value from every row.
From the example above I have to obtain the following structure:
"value1, value2, value3"
Is there a simpler alternative to regex? Did someone already implement something like this?
I can also imagine that I am approaching the problem from the wrong angle, for example by forcing a non-technical user to write data extraction rules. In this case the question becomes something more generic, like "How can I build an application that lets a non-technical user extract data from separate texts?"
Edit:
I have implemented the simplest possible matching for them:
File content:
"Strain at break Ax2";"Unknown"
"Strain at break Ax1";"Unknown"
"Strain at break";"Unknown"
"Yield point strain";"Unknown"
"Uniform elongation";25.4087;"%"
"Tensile strength";261.323;"MPa"
"End test phase Yield point";1;"%"
"Maximum tensile force";5.22647;"kN"
Pattern:
"Tensile strength";(?<value>[^;\n]*);
"Maximum tensile force";(?<value>[^;\n]*);
Still too complex. The problem is that if I start replacing the ugly part with another string to obtain, for example:
"Tensile strength", [First value after]
I lose all the generic nature of the extraction, because every file looks different from this one.
Take a look at the FileHelpers library. It allows runtime generation of file layouts and I think the one that would help in your example is the DelimitedClassBuilder.
In your case, I'd probably use FileHelpers to parse the record definitions into the DelimitedClassBuilder and then use the result to parse your records.
I have solved the issue by defining the rules as regular expressions. After the rules were defined I defined a wrapper rule-set that was easier to read by the users.
Ex. to extract a value from a line
Maximum amount of Sheet Drawing Force= 35.659695[kN]
I defined the regular expression
{0}=\s*(?<value>[^[\n\r]*)
then let the user define the name of the field. The {0} placeholder was then replaced with the name of the field and the regular expression applied.
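A minimal sketch of that wrapper, under assumptions: FieldExtractor and Extract are illustrative names, and the field name is escaped before substitution so that characters such as parentheses in a label are treated literally.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the rule shown above.
public static class FieldExtractor
{
    // The rule template; {0} is replaced by the user-supplied field name.
    private const string RuleTemplate = @"{0}=\s*(?<value>[^[\n\r]*)";

    // Returns the text between "=" and the opening "[" of the unit,
    // or null when the field is not present in the line.
    public static string Extract(string fieldName, string line)
    {
        string pattern = RuleTemplate.Replace("{0}", Regex.Escape(fieldName));
        Match match = Regex.Match(line, pattern);
        return match.Success ? match.Groups["value"].Value.Trim() : null;
    }
}
```

The user only ever types the field name, for example "Maximum amount of Sheet Drawing Force", and never sees the regular expression.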
I'm pretty sure it has been asked before, but I could not find anything good.
I'm trying to parse a log but having troubles with it.
At first it looked pretty easy because the log is built like this:
thing,thing,thing,thing
so I split the string on the ,
However, a , can also appear inside a value, and this is where I did not know what to do anymore.
How would I successfully parse this kind of log?
Edit:
Here is a log example:
1326139200953,info,,0,"str value which may contain, ",,,0
1326139201109,info,,0,"str value which may contain, ",,,0
1326139201265,info,,0,"str value which may contain, ",,,0
1326139201999,start,,0,,,,0
1326139368296,new,F:\Dir\Dir\file.txt,1536,,0,,0
If your log file doesn't have field encapsulators, the fields have variable width, and the separator/delimiter can also appear in a field, then it's likely you can't program something that will work in all cases.
Can you supply an example of your log file data? It may be possible to match the parts you need with a regex.
Unfortunately I think your question is not answerable in its current state, please provide more info.
Edit: Thanks for updating the question, you do have field encapsulators (double quotes). This will make it easier!
I think there are many ways to do this. Personally, I think I would carry on splitting on commas, but then loop over the resulting array, checking whether the first character of any value is a double quote. If it is, then you need to join it to the array item after it. If the last character of the joined array item isn't a double quote, you need to continue joining until you've closed your opening double quote.
There's certainly a better way so you may wish to wait for another solution.
Edit 2: Give this a go and let me know how you get on:
string myRegex = @"(?<=^(?:[^""]*""[^""]*"")*[^""]*),";
string[] outputArray = Regex.Split(myStr, myRegex);
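For comparison, the split-and-rejoin idea described earlier in this answer could be sketched roughly as follows. LogLineSplitter is a made-up name, and the sketch assumes that a quoted field always starts right after a comma and that quotes are never escaped inside a field.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the split-and-rejoin approach.
public static class LogLineSplitter
{
    public static List<string> Split(string line)
    {
        var fields = new List<string>();
        string pending = null; // quoted field still waiting for its closing quote

        foreach (string part in line.Split(','))
        {
            if (pending != null)
            {
                // Inside a quoted field: put back the comma that Split removed.
                pending += "," + part;
                if (part.EndsWith("\"", StringComparison.Ordinal))
                {
                    fields.Add(pending);
                    pending = null;
                }
            }
            else if (part.StartsWith("\"", StringComparison.Ordinal)
                     && !(part.Length > 1 && part.EndsWith("\"", StringComparison.Ordinal)))
            {
                // An opening quote without a closing one: start joining.
                pending = part;
            }
            else
            {
                fields.Add(part);
            }
        }

        // Keep an unterminated quoted field rather than dropping it silently.
        if (pending != null)
        {
            fields.Add(pending);
        }
        return fields;
    }
}
```

The quotes are kept on the joined field, so the caller can still tell a quoted value from an unquoted one.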
Background
I need to validate user input in some fields, where these are defining how to show time in some views.
Requirements
Time format must be expressed in Microsoft .NET way (check this MSDN Library article if you want to learn more about framework's date and time formatting: http://msdn.microsoft.com/en-us/library/8kb3ddd4.aspx)
Keep in mind I'm looking to validate the format instead of an actual time string.
For example, user may input:
HH:mm
hh:mm
ss
hh:ss
mm:ss
... and so on.
In fact, it should validate from the shortest to longest time format available.
Another point is that I need to do it client-side using JavaScript. In other words, any regular expression you suggest should work in the browser's JavaScript regular expression engine.
I'll appreciate any self-tailored one, any link or any pasted expression!
Thank you in advance.
NOTE (Update)
I can't use ASP.NET validation engine, or any other. Because of project's requirements, I need to avoid that.
As far as I understand, there are not many options: about 20 at most. Why not just enumerate them all in one big regex without many special symbols? Like
'hh:mm|hh:mm:ss|yyyy-MM-dd hh:mm|<etc>'
You could then make it case-sensitive to differentiate between M for month and m for minute, use [hH] for hours, use [:/-] where you allow different separators, and so on. But the main idea is simply to enumerate all options separated by |, with only a small amount of regex syntax between one | and the next.
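A small sketch of the enumeration idea. It is written in C# only to stay consistent with the other snippets here; the pattern itself uses nothing beyond groups, alternation and anchors, which JavaScript regular expressions also support. The list of allowed formats is an assumption built from the examples in the question, and TimeFormatValidator is an invented name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical whitelist validator; the set of formats is illustrative only.
public static class TimeFormatValidator
{
    // Each alternative between | and | is one accepted format, anchored as a whole.
    // Matching is case-sensitive, so "mm" (minutes) and "MM" (months) stay distinct.
    private static readonly Regex Allowed = new Regex(
        @"^(?:(?:HH|hh):mm(?::ss)?|(?:HH|hh):ss|mm:ss|HH|hh|mm|ss)$");

    public static bool IsAllowed(string format)
    {
        return Allowed.IsMatch(format);
    }
}
```

Extending the whitelist means adding one more alternative, which keeps the expression readable.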
What is your definition of a "valid" format string? Only once you know that can it be possible to validate a format string.
"K" is also a valid format string
"zz" is also a valid format string
"e" is also a valid format (it would fall into the "The character is copied to the result string unchanged." case)
I'm not even sure what formats would actually cause .NET .ToString() to throw an exception (if that's what you are trying to avoid).
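One way to find out on the server side is simply to try a format and catch the exception. This is only a probing sketch with invented names (FormatProbe, IsUsableFormat); it does not help with the client-side requirement. Keep in mind that .NET interprets a single-character string as a standard format specifier rather than a custom one.

```csharp
using System;
using System.Globalization;

// Hypothetical probe: reports whether DateTime.ToString accepts a format string.
public static class FormatProbe
{
    public static bool IsUsableFormat(string format)
    {
        try
        {
            // The formatted text is discarded; only success or failure matters.
            DateTime.Now.ToString(format, CultureInfo.InvariantCulture);
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```

Running the candidate strings through such a probe shows which of them actually throw.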