I have the following values in a table:
aaaaaa 26G 2.0G 23G 8 tmp
tmpfs 506M 0 506M 0 /dev/shm
I need to store the first value ('aaaaaa' and 'tmpfs') and the second value (26 and 506) in another table. I got the first value with:
CAST(substr(COL_1,1,InStr(COL_1,' ')-1) AS VARCHAR2(10)) col
How do I get the second value (26 and 506) using SUBSTR and INSTR?
I would recommend regexp_substr():
select regexp_substr(col1, '[^ ]+ ', 1, 1) as first,
regexp_substr(col1, '[^ ]+ ', 1, 2) as second
This returns the value with a trailing space. I think the pattern also works without the space, because regular expression matching is greedy in Oracle:
select regexp_substr(col1, '[^ ]+', 1, 1) as first,
regexp_substr(col1, '[^ ]+', 1, 2) as second
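To check the token logic outside the database, here is a minimal Python sketch of the same idea: the pattern `[^ ]+` picks out runs of non-space characters, and the nth match is the nth column. The helper name `nth_token` is made up for illustration; it is not an Oracle function.

```python
import re

def nth_token(text, n):
    """Return the n-th (1-based) run of non-space characters, or None."""
    tokens = re.findall(r"[^ ]+", text)
    return tokens[n - 1] if 0 < n <= len(tokens) else None

# The two sample rows from the question.
rows = ["aaaaaa 26G 2.0G 23G 8 tmp", "tmpfs 506M 0 506M 0 /dev/shm"]
pairs = [(nth_token(r, 1), nth_token(r, 2)) for r in rows]
print(pairs)
```

Note that the second token still carries its unit suffix (G or M) at this point.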
There is an optional argument to instr where you can specify the nth occurrence of a specific string being searched.
CAST(substr(COL_1,InStr(COL_1,' ',1,1)+1,InStr(COL_1,' ',1,2)-InStr(COL_1,' ',1,1)-1) AS VARCHAR2(10))
To only extract the number from this substring, use regexp_substr. This assumes letters always follow one or more numeric characters.
regexp_substr(CAST(substr(COL_1,InStr(COL_1,' ',1,1)+1,InStr(COL_1,' ',1,2)-InStr(COL_1,' ',1,1)-1) AS VARCHAR2(10)),'\d+')
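The position arithmetic above can be hard to follow, so here is a rough Python sketch of it. `instr_nth` is a hypothetical helper that imitates the nth-occurrence form of INSTR (1-based result, 0 when not found); it is only meant to make the offsets visible, not to replace the SQL.

```python
import re

def instr_nth(text, sub, nth):
    """1-based position of the nth occurrence of sub in text, 0 if absent."""
    pos = -1
    for _ in range(nth):
        pos = text.find(sub, pos + 1)
        if pos == -1:
            return 0
    return pos + 1

def second_field_number(text):
    """Digits at the start of the field between the 1st and 2nd space."""
    first = instr_nth(text, " ", 1)
    second = instr_nth(text, " ", 2)
    if first == 0 or second == 0:
        return None
    # Same span as substr(col, first + 1, second - first - 1) in 1-based SQL.
    field = text[first:second - 1]
    match = re.search(r"\d+", field)
    return match.group() if match else None

print(second_field_number("aaaaaa 26G 2.0G 23G 8 tmp"))
print(second_field_number("tmpfs 506M 0 506M 0 /dev/shm"))
```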
I'm not an expert on regex and need some help to set up one.
I'm using PowerShell and its [regex] type, which is a .NET class. The final objective is to read a TOML file (sample data at the bottom, or use this link to regex101), in which I need to:
match some values (values between "__")
ignore comments. (a comment starts with "#")
To match the values and put them in a capture group the following regex works:
match the template value (values between "__" ):
__(?<tokenName>[\w\.]+)__
I also want to ignore the commented lines, and I came up with this:
Ignore lines that start with a comment (even if "#" is preceded by spaces or tabs):
^(?!\s*\t*#).*
The problem starts when I put them together
^(?!\s*\t*#).*__(?<tokenName>[\w\.]+)__
this expression has the following problems:
It returns at most one match per line, the last one (e.g. for the line with "Prop5 = ..." I get one match instead of two)
Comments at the end of a line are not taken into account (e.g. the line with "Prop4 = ..." has two matches instead of one)
I've also tried to
add this at the end of the expression, it should stop the match on the first occurrence of the character
[^#]
add this at the beginning, which should check if the matched string has the given char before it and exclude it
(?<!^#)
This is a sample of my data
#templateFile
[Agent]
Prop1 = "__Data.Agent.Prop1__"
Prop2 = [__Data.Agent.Prop2__]
#I'm a comment
#Prop3 = "__NotUsed__"
Prop4 = [__Data.Agent.Prop4__] #sample usage comment __Data.Agent.xxx__
Prop5 = ["__Data.Agent.Prop5a__","__Data.Agent.Prop5b__"]
I think the easier solution will be to match the given string, only if there is no "#" before it on the same line.
Is it possible?
EDIT:
The first expression proposed by @the-fourth-bird works perfectly; it just needs the multiline modifier to be specified.
The final (runnable) result looks like this in PowerShell.
[regex]$reg = "(?m)(?<!^.*#.*)__(?<tokenName>[\w.]+)__"
$text = '
#templateFile
[Agent]
Prop1 = "__Data.Agent.Prop1__"
Prop2 = [__Data.Agent.Prop2__]
Prop5 = ["__Data.Agent.Prop5a__","__Data.Agent.Prop5b__"]
#a comment
#Prop3 = "__Data.Agent.Prop3__"
Prop4 = [__Data.Agent.Prop4__] #sample usage comment __Data.Agent.xxx__
'
$reg.Matches($text) | Format-Table
#This returns
Groups Success Name Captures Index Length Value
------ ------- ---- -------- ----- ------ -----
{0, tokenName} True 0 {0} 31 20 __Data.Agent.Prop1__
{0, tokenName} True 0 {0} 62 20 __Data.Agent.Prop2__
{0, tokenName} True 0 {0} 94 21 __Data.Agent.Prop5a__
{0, tokenName} True 0 {0} 118 21 __Data.Agent.Prop5b__
{0, tokenName} True 0 {0} 194 20 __Data.Agent.Prop4__
I think you could make use of infinite repetition to check if what precedes does not contain a # to also account for the comment in Prop4
(?<!^.*#.*)__(?<tokenName>[\w.]+)__
.Net regex demo
If Prop4 should have 2 matches, you might use:
(?<!^[ \t]*#.*)__(?<tokenName>[\w.]+)__
.NET regex demo
Both expressions need the multiline modifier to work properly.
It can be specified inline by adding (?m) at the beginning (or by passing it to a constructor that accepts options).
(?m)(?<!^.*#.*)__(?<tokenName>[\w.]+)__
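The expressions above rely on .NET's variable-length lookbehind, which many other regex engines (Python's built-in `re` module, for example) do not support. As a rough sketch of a swapped-in technique for such engines, one can cut each line at the first "#" and then search only the remaining part. This mirrors the first expression (anything after a "#" is ignored); the function name `find_tokens` is an assumption for illustration.

```python
import re

TOKEN = re.compile(r"__(?P<tokenName>[\w.]+)__")

def find_tokens(text):
    """Collect token names, ignoring everything after the first '#' on a line."""
    names = []
    for line in text.splitlines():
        code = line.split("#", 1)[0]  # drop the comment part, if any
        names.extend(m.group("tokenName") for m in TOKEN.finditer(code))
    return names

sample = '''#templateFile
[Agent]
Prop1 = "__Data.Agent.Prop1__"
Prop2 = [__Data.Agent.Prop2__]
#Prop3 = "__NotUsed__"
Prop4 = [__Data.Agent.Prop4__] #sample usage comment __Data.Agent.xxx__
Prop5 = ["__Data.Agent.Prop5a__","__Data.Agent.Prop5b__"]
'''
print(find_tokens(sample))
```

Like the lookbehind version, this treats every "#" as the start of a comment, so a "#" inside a quoted value would also cut the line short.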
So I've got a long string of numbers and characters and I'd like to filter out a substring. The thing I'm struggling with is that I need a full match on a certain value (starting with S) but this may not be matched in another value.
Input:
S10 1+0000000297472+00EURS100 1+0000000297472+00EURS1023P 1+0000000816072+00EUR
The input is exactly like this.
Breakdown of input:
S10 1+0000000297472+00EUR
Every part starts with a tag S and ends with EUR
There are spaces in between because every part has a fixed length
=>
index 0 : tag 'S' with length 1
index 1 : code with length 7
index 8 : numbertype with length 1
index 9 : sign with length 1
index 10 : value with length 13
index 23 : sign with length 1
index 24 : exponent with length 2
index 26 : unit with length 3
I need to match on for example S10 and I only want this substring till EUR. I don't want it to match on S100 or S1023P or any other combination. Only on exactly S10
Output:
S10 1+0000000297472+00EUR
I'm trying to use Regex to find my match on 'S + code'. I'm doing a full match on my search query and then as soon as anything follows I don't want it anymore. But doing it like this also discards the actual match as after the S10 the value will follow which will match with [^\d|^\D])+\w
foreach (var field in fieldList)
{
var query = "S" + field.BallanceCode;
var index = Regex.Match(values, Regex.Escape(query) + @"([^\d|^\D])+\w").Index;
}
For example when looking for S10
needs to match:
S10 1+0000000297472+00EUR
may not match:
S10/15 1+0000001748447+00EUR
S1023P 1+0000000816072+00EUR
S10000001+0000000546546+00EUR
Update:
Using this code
var index = Regex.Match(values, Regex.Escape(query) + @"\p{Zs}.*?EUR").Index;
will yield S10, S10/15, etc. when searched for. However, looking for S1000000 in the string doesn't work because there is no whitespace between the code and 1+
S10000001+0000000546546+00EUR
For example when looking for S1000000
needs to match:
S10000001+0000000297472+00EUR
may not match:
S10 1+0000001748447+00EUR
S1023P 1+0000000816072+00EUR
S10/15 1+0000000546546+00EUR
You can use a regex that requires a space (or whitespace) to appear right after the field.BallanceCode:
var index = Regex.Match(values, Regex.Escape(query) + (field.BallanceCode.Length < 7 ? @"\p{Zs}" : "") + ".*?EUR").Index;
The regex will match S10, then one space separator (\p{Zs}), then any 0 or more characters other than a newline (as few as possible due to *?) up to the first EUR.
The (field.BallanceCode.Length < 7 ? @"\p{Zs}" : "") check is necessary to support a 7-character BallanceCode. If the code is 7 characters or longer, we do not require a space after it. If it is shorter than 7, we require one.
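Here is a small Python sketch of the same conditional pattern, in case it helps to see it run against fixed-width data. The sample string is rebuilt with a hypothetical `make_record` helper so that each code is padded to 7 characters as described in the question; a literal space stands in for \p{Zs}, which Python's `re` module does not support.

```python
import re

def find_record(values, code):
    """Return the record for an exact code, or None if it is absent."""
    pattern = re.escape("S" + code)
    if len(code) < 7:
        pattern += " "      # short codes must be followed by padding
    pattern += ".*?EUR"     # lazily run up to the first EUR
    match = re.search(pattern, values)
    return match.group() if match else None

def make_record(code, value):
    # Hypothetical helper: one fixed-width record (tag, code, type, sign, ...).
    return "S" + code.ljust(7) + "1+" + value + "+00EUR"

values = "".join([
    make_record("10", "0000000297472"),
    make_record("100", "0000000297472"),
    make_record("1023P", "0000000816072"),
    make_record("1000000", "0000000546546"),
])

print(find_record(values, "10"))
print(find_record(values, "1000000"))
```

Since every record has the same length, slicing the string into fixed-size chunks and comparing the code field directly would be another option.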
So you just want the start (S...) and end (...EUR) of each line and skip everything in between?
^([sS]\d+).*?([\d\+]+EUR)$
http://regexr.com/3c1ob
My requirement is that the first two digits of the entered number are in the range 00-32.
How can I check this with a regex in C#?
I could not figure it out.
Do you really need a regex?
int val;
if (Int32.TryParse("00ABFSSDF".Substring(0, 2), out val))
{
if (val >= 0 && val <= 32)
{
// valid
}
}
Since this is almost certainly a learning exercise, here are some hints:
Your regex will be an "OR" | of two parts, both validating the first two characters
The first expression part will match if the first character is a digit 0..2, and the second character is a digit 0..9
The second expression part will match if the first character is digit 3, and the second character is a digit 0..2
To match a range of digits, use [A-B] range, where A is the lower and B is the upper bound for the digits to match (both bounds are inclusive).
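Putting the hints together, a minimal Python sketch of the two-part pattern could look like this. It only checks the first two characters, as the question asks; the names `PREFIX` and `starts_in_range` are made up for the example.

```python
import re

# First alternative covers 00-29, second alternative covers 30-32.
PREFIX = re.compile(r"^(?:[0-2][0-9]|3[0-2])")

def starts_in_range(text):
    """True if the first two characters form a number from 00 to 32."""
    return PREFIX.match(text) is not None

# Try every two-digit prefix and keep the accepted ones.
accepted = [f"{n:02d}" for n in range(100) if starts_in_range(f"{n:02d}")]
print(accepted[0], accepted[-1], len(accepted))
```

The grouping around the alternation matters: without it, the ^ anchor would apply to the first alternative only.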
Try something like
Regex reg = new Regex(@"^([0-2]?[0-9]|3[0-2])$");
Console.WriteLine(reg.IsMatch("00"));
Console.WriteLine(reg.IsMatch("22"));
Console.WriteLine(reg.IsMatch("33"));
Console.WriteLine(reg.IsMatch("42"));
The [0-2]?[0-9] matches all numbers from zero to 29 and the 3[0-2] matches 30-32.
This will validate number from 0 to 32, and also allows for numbers with leading zero, eg, 08.
You should divide the range in two, and group the alternatives so that the anchor applies to both:
^([012]\d|3[012])
if(Regex.IsMatch("123456789","^([0-2][0-9]|3[0-2])"))
// match
I'm trying to read column values from this file starting at the arrow position:
Here's my error:
I'm guessing it's because the length values are wrong.
Say I have column with value :"Dog "
with the word Dog and a few spaces after it. Do I have to set the length parameter to 3 (for Dog), or can I set it to 6 to accommodate the spaces after Dog? I ask because each column length is fixed. As you can see, some words are shorter than others, and to be consistent I just want to set the length to the maximum column length (e.g. 28 is the length of the 3rd column of my file, but not all 28 positions are used every time: the word client is only 6 characters long).
Robert Levy's answer is correct for the issue you're seeing - you've attempted to pull a substring from a string with a starting position that is greater than the length of the string.
You're parsing a fixed-length field file, where each field has a certain amount of characters, whether or not it uses all of them, and the pos and len arrays are intended to define those field lengths for use with Substring. As long as the line you're reading matches the expected field starts and lengths, you will be ok. As soon as you come to a line that doesn't match (for example, what appears to be the totals line - 0TotalRecords: 3,390,315) the field length definitions you've been using won't work, as the format has changed (and the line length may not even be the same).
There are a couple of things I would change to make this work. First, I would change your pos and len arrays so that they take the entirety of the field, not part of it. You can use Trim() to get rid of any leading or trailing blanks. As defined, your first field will only take the last number of the Seq# (pos 4, len 1), and your second field will only take the first 5 characters of the field, even though it appears to have space for ~12 characters.
Take a look at this (it's hard to be exact working from the picture, but for purposes of demonstration it will work):
          1         2         3         4
01234567890123456789012345678901234567890
Seq#  Field       Description
    3 BELNR       ACCOUNTING DOCUMENT NBR
The numbers are the position of each character in the line. I would define the pos array to be the start of the field (0 for the first field, and then the position of the first letter of the field heading for each field after that), so you would have:
Seq# = 0
Field = 6
Description = 18
The len array would hold the length of the field, which I would define as the amount of characters up to the beginning of the next field, like this:
Seq# = 6
Field = 12
Description = 28 (using what you have, as it is hard to tell)
This would make your array initialization the following:
int[] pos = new int[3] { 0, 6, 18 };
int[] len = new int[3] { 6, 12, 28 };
If you wanted the fourth field, it would start at position 46 (pos 18 + len 28 = 46).
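To make the pos/len idea concrete, here is a short Python sketch using the same start positions and lengths. The sample line is rebuilt from the example row with explicit padding, since the exact spacing in the original file is an assumption.

```python
POS = [0, 6, 18]    # start of each field
LEN = [6, 12, 28]   # width of each field, up to the start of the next one

def split_fields(line):
    """Slice one fixed-width line into trimmed field values."""
    return [line[p:p + n].strip() for p, n in zip(POS, LEN)]

# Rebuild the example row with explicit padding (assumed layout).
line = "3".rjust(5).ljust(6) + "BELNR".ljust(12) + "ACCOUNTING DOCUMENT NBR"
print(split_fields(line))
```

One difference worth keeping in mind: Python slicing quietly returns a shorter or empty string when the line is too short, whereas Substring in C# throws, which is the error described in the question.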
The second thing is I would check in the loop to see if the Total Records line is there, and skip that line (most likely it's the last line):
foreach (string line in textBox1.Lines)
{
    if (!line.Contains("Total Records"))
    {
        // Pull every fixed-width field out of the current line.
        for (int j = 0; j < pos.Length; j++)
        {
            val[j] = line.Substring(pos[j], len[j]).Trim();
        }
    }
}
Another way to do this would be to modify the original query and add a TakeWhile clause to it to only take lines until you hit the Total Records one:
string[] lines = File.ReadAllLines(ofd.FileName).Skip(8)
.TakeWhile(l => !l.Contains("Total Records")).ToArray();
The above would skip the first 8 lines and take all the remaining lines up to, but not including, the first line to contain "Total Records" in the string.
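For readers who want to see the skip-then-take-while behaviour in isolation, here is a small Python sketch with the same semantics. The header count of 8 and the "Total Records" marker come from the answer above; the sample lines are invented for the demonstration.

```python
from itertools import islice, takewhile

def data_lines(lines, header_rows=8, stop_marker="Total Records"):
    """Skip the header rows, then keep lines until the marker line appears."""
    body = islice(lines, header_rows, None)
    return list(takewhile(lambda l: stop_marker not in l, body))

# Invented sample: 8 header lines, two data rows, the totals line, a trailer.
lines = [f"header {i}" for i in range(8)]
lines += ["row a", "row b", "Total Records: 2", "row after"]
print(data_lines(lines))
```

Anything after the marker line is dropped as well, which matches the "up to, but not including" behaviour described above.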
Then you could do something like this:
string[] lines = File.ReadAllLines(ofd.FileName).Skip(8)
.TakeWhile(l => !l.Contains("Total Records")).ToArray();
textBox1.Lines = lines;
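string[] val = new string[3];
int[] pos = new int[3] { 0, 6, 18 };
int[] len = new int[3] { 6, 12, 28 };
foreach (string line in textBox1.Lines)
{
    // Extract each fixed-width field from the current line.
    for (int j = 0; j < pos.Length; j++)
    {
        val[j] = line.Substring(pos[j], len[j]).Trim();
    }
}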
Now you don't have to check for the "Total Records" line.
Of course, if there are other lines in your file, or there are records after the "Total Records" line (which I rather doubt) you'll have to handle those cases as well.
In short, the code for pulling out the substrings will only work for lines that match that particular format (or more specifically, have fields that match those positions/lengths). Anything outside of that will either give you incorrect values or throw an error (if the start position is greater than the length of the string).
That exception is complaining about the first parameter, which suggests that your file contains a row that is shorter than 18 characters.
Declare @CustTotalCount as int
Declare @CustMatchCount as int
select @CustTotalCount = count(*) from ENG_CUSTOMERTALLY
select @CustMatchCount = count(*) from Task where MPDReference in(
select ENG_CUSTOMERTALLY_CUSTOMERTASKNUMBER from dbo.ENG_CUSTOMERTALLY)
if(@CustTotalCount>@CustMatchCount)
select distinct
substring(ENG_CUSTOMERMYCROSS_MYTECHNIC_TASK_NO, charindex('-', ENG_CUSTOMERMYCROSS_MYTECHNIC_TASK_NO)
+ 1, 1000)
from dbo.ENG_CUSTOMERMYCROSS where
ENG_CUSTOMERMYCROSS_CUSTOMER_NUMBER in(
select ENG_CUSTOMERTALLY_CUSTOMERTASKNUMBER from ENG_CUSTOMERTALLY1
except
select MPDReference from Task )
I can convert
- A320-200001-01-1(1)
- A320-200001-02-1(2)
- A320-200001-01-1(2)
- A320-200001-01-1(1)
- A320-200001-01-1(2)
- A320-200001-02-1(1)
TO
- 200001-01-1(1)
- 200001-02-1(2)
- 200001-01-1(2)
- 200001-01-1(1)
- 200001-01-1(2)
- 200001-02-1(1)
But I need:
- 200001-01-1
- 200001-02-1
- 200001-01-1
- 200001-01-1
- 200001-01-1
- 200001-02-1
How can I do that in SQL and C#?
Is the pattern always the same? If so, you could just use SUBSTRING to pull out the bit you want.
EDIT: To take in additional stuff asked in How can i use substring in SQL?
You could
SELECT DISTINCT SUBSTRING(....) FROM ...
As answered above, use SUBSTRING as you already do, but with a length of 11 instead of 1000, as long as the data is always in the format you show above.
In C# it would be:
string s = "A320-200001-01-1(1)";
string result = s.Substring(s.IndexOf('-') + 1, 11);
Again, this assumes the part you want is always 11 characters long. Otherwise, if it should always end before the first '(', use the IndexOf method again to find the end index and subtract the start index to get the length.
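The variant that stops at the first '(' instead of relying on a fixed length can be sketched as follows. This is a Python illustration of the index arithmetic only; `task_core` is a made-up name, and the input format is assumed to be the one shown in the question.

```python
def task_core(task_no):
    """Text after the first '-' and before the first '(' (or to the end)."""
    start = task_no.index("-") + 1   # raises ValueError if there is no '-'
    end = task_no.find("(", start)
    return task_no[start:] if end == -1 else task_no[start:end]

for sample in ["A320-200001-01-1(1)", "A320-200001-02-1(2)"]:
    print(task_core(sample))
```

The same two lookups translate directly to IndexOf in C# and CHARINDEX in T-SQL.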
Try substring and len. This sample skips the first 5 characters and drops the last 3, so the length is len(@var) - 8:
declare @var varchar(50)
set @var = 'A320-200001-01-1(1)'
select substring(@var, 6, len(@var) - 8)
output: 200001-01-1
In C#, the functions are similar, except that the index is zero-based:
string var = "A320-200001-01-1(1)";
var = var.Substring(5, var.Length - 8);
Console.WriteLine(var);
Here's a technique that uses PATINDEX, which can use wildcards.
SUBSTRING(ENG_CUSTOMERMYCROSS_MYTECHNIC_TASK_NO,
PATINDEX('%-[0-9]%', ENG_CUSTOMERMYCROSS_MYTECHNIC_TASK_NO) + 1,
PATINDEX('%(%', ENG_CUSTOMERMYCROSS_MYTECHNIC_TASK_NO)
- PATINDEX('%-[0-9]%', ENG_CUSTOMERMYCROSS_MYTECHNIC_TASK_NO) - 1
)
The start of the substring is the position right after the first hyphen that is followed by a digit ('%-[0-9]%'); a plain '%[0-9]%' would already match the 3 in A320. The length value is the position of the first parenthesis ('%(%') less the starting position.