Merged with C# Convert Relative to Absolute Links in HTML String.
If I am at a given website, with a given full URL, how can I determine the absolute path of any href or src attribute?
So if I have:
string WebsiteImAt = "http://www.w3schools.com/media/media_mimeref.asp?q=1&s=2,2#a";
//Just some random website with a sub path and a filename
string text = DownloadHTML(WebsiteImAt);
string href = "/something/somethingelse/filename.asp";
//Should go to http://www.w3schools.com/something/somethingelse/filename.asp
string href2 = "something.asp";
//Should go to http://www.w3schools.com/media/something.asp
string href3 = "something";
//Should go to http://www.w3schools.com/media/something
I'm having trouble getting my regex to work with "/blah" and just "blah" without the slashes:
String value = Regex.Replace(text, "<(.*?)(src|href)=\"(?!http)(.*?)\"(.*?)>", "<$1$2=\"" + absoluteUrl + "$3\"$4>", RegexOptions.IgnoreCase | RegexOptions.Multiline);
return value.Replace(WebsiteImAt + "/", WebsiteImAt);
I cannot alter each href/src to resolve to the correct address. How do I fix my regex to account for the three href cases above?
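One way to sidestep the slash handling entirely, offered as a sketch rather than a definitive fix, is to let System.Uri resolve each attribute value against the page URL and use the regex only to locate the values. The helper names ToAbsolute and MakeLinksAbsolute are made up for this illustration, and the pattern assumes double-quoted attribute values; an HTML parser would be more robust on real pages.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: resolve one href/src value against the page URL.
// If the value cannot be resolved, it is returned unchanged.
static string ToAbsolute(Uri pageUri, string link)
{
    return Uri.TryCreate(pageUri, link, out var resolved) ? resolved.AbsoluteUri : link;
}

// Hypothetical helper: rewrite every double-quoted src/href value in the HTML.
// Limitation: \b also matches attributes such as data-src.
static string MakeLinksAbsolute(string html, Uri pageUri)
{
    return Regex.Replace(
        html,
        @"(?<attr>\b(?:src|href)\s*=\s*"")(?<url>[^""]*)""",
        m => m.Groups["attr"].Value + ToAbsolute(pageUri, m.Groups["url"].Value) + "\"",
        RegexOptions.IgnoreCase);
}

var page = new Uri("http://www.w3schools.com/media/media_mimeref.asp?q=1&s=2,2#a");

Console.WriteLine(ToAbsolute(page, "/something/somethingelse/filename.asp"));
// http://www.w3schools.com/something/somethingelse/filename.asp
Console.WriteLine(ToAbsolute(page, "something.asp"));
// http://www.w3schools.com/media/something.asp
Console.WriteLine(ToAbsolute(page, "something"));
// http://www.w3schools.com/media/something
Console.WriteLine(MakeLinksAbsolute("<a href=\"something.asp\">x</a>", page));
// <a href="http://www.w3schools.com/media/something.asp">x</a>
```

Links that are already absolute pass through unchanged in this sketch, so the (?!http) lookahead from the original pattern is not needed.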
Related
I've been trying to turn this URL into a usable string in C#, but adding extra "" or "@" is not cutting it. Even breaking it into smaller strings is proving difficult. I want to be able to convert the entire address into a single string.
this is the full address:
<https://my.address.com/BOE/OpenDocument/opendoc/openDocument.jsp?iDocID=ATTPCi6c.mZInSt5o3t_Xr8&sIDType=CUID&&sInstance=Last&lsMZV_MAT="+URLEncode(""+[Material].[Material - Key])+"&lsIZV_MAT=>
I've also tried this:
string url = @"https://my.address.com/BOE/OpenDocument/opendoc/openDocument.jsp?iDocID=ATTPCi6c.mZInSt5o3t_Xr8&sIDType=CUID&&sInstance=Last&lsMZV_MAT=";
string url2 = @"+ URLEncode("" +[Material].[Material - Key]) + """"";
string url3 = @"&lsIZV_MAT=";
Any help is appreciated.
The simplest solution is to put additional quotes inside the string literals and use string.Concat to join them into a single URL string:
string url = @"https://my.address.com/BOE/OpenDocument/opendoc/openDocument.jsp?iDocID=ATTPCi6c.mZInSt5o3t_Xr8&sIDType=CUID&&sInstance=Last&lsMZV_MAT=";
string url2 = @"""+URLEncode(""+[Material].[Material - Key])+""";
string url3 = @"&lsIZV_MAT=";
string resultUrl = string.Concat(url, url2, url3);
NB: You can use the Equals method or the == operator to check whether the generated string matches the desired URL string.
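To make the quoting rule concrete, here is a small sketch that assumes the goal is to reproduce the address from the question character for character. In a verbatim literal (prefixed with @) every literal quote is written as "", so the pair of quotes after URLEncode( becomes four; in a regular literal each quote is written as \". The variable names verbatim and regular are illustrative only.

```csharp
using System;

// Verbatim literal: each literal " in the text is doubled.
string verbatim = @"https://my.address.com/BOE/OpenDocument/opendoc/openDocument.jsp?iDocID=ATTPCi6c.mZInSt5o3t_Xr8&sIDType=CUID&&sInstance=Last&lsMZV_MAT=""+URLEncode(""""+[Material].[Material - Key])+""&lsIZV_MAT=";

// Regular literal: each literal " in the text is escaped with a backslash.
string regular = "https://my.address.com/BOE/OpenDocument/opendoc/openDocument.jsp?iDocID=ATTPCi6c.mZInSt5o3t_Xr8&sIDType=CUID&&sInstance=Last&lsMZV_MAT=\"+URLEncode(\"\"+[Material].[Material - Key])+\"&lsIZV_MAT=";

// Both literals describe the same characters.
Console.WriteLine(verbatim == regular); // True
Console.WriteLine(verbatim);
```

Printing the value, as in the last line, is usually the quickest way to confirm which quotes actually ended up in the string.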
This may be more of a workaround than an actual solution, but if you load the string from a text file and run to a breakpoint just after it, you should be able to see how the characters are stored, or simply use the loaded value as is.
Some of the spaces you added may also be left over; StringName.Replace could remove them if they are causing issues.
I'd recommend first checking exactly what is produced after the third statement, then letting us know so we can compare the result with the original.
You are missing the triple quotes at the beginning of url2:
string url = @"https://my.address.com/BOE/OpenDocument/opendoc/openDocument.jsp?iDocID=ATTPCi6c.mZInSt5o3t_Xr8&sIDType=CUID&&sInstance=Last&lsMZV_MAT=";
string url2 = @"""+URLEncode(""+[Material].[Material - Key])+""";
string url3 = @"&lsIZV_MAT=";
I just made two updates
t&lsMZV_MAT=" to t&lsMZV_MAT="" AND
[Material - Key])+" to [Material - Key])+""
string s = @"<https://my.address.com/BOE/OpenDocument/opendoc/openDocument.jsp?iDocID=ATTPCi6c.mZInSt5o3t_Xr8&sIDType=CUID&&sInstance=Last&lsMZV_MAT=""+ URLEncode([Material].[Material - Key])+""&lsIZV_MAT=>";
Console.Write(s);
Console.ReadKey();
I need to insert some string value after the last slash. I have such string value:
string url = "http://blog.loc/blog/news/sport/slug1_slug2_slug3-slug";
I need to get this value:
"http://blog.loc/blog/news/sport/hot_slug1_slug2_slug3-slug"
So, I need to insert hot_ (for example), after the last slash. Could anyone help me?
I know you asked for regex, but it's not really necessary in my opinion.
You can just use string.Insert:
string url = "http://blog.loc/blog/news/sport/slug1_slug2_slug3-slug";
url = url.Insert(url.LastIndexOf("/") + 1, "hot_");
url now holds the value: http://blog.loc/blog/news/sport/hot_slug1_slug2_slug3-slug
Regex method:
string url = "http://blog.loc/blog/news/sport/slug1_slug2_slug3-slug";
var matches = Regex.Matches(url, "/");
var match = matches[matches.Count - 1];
string result = url.Insert(match.Index + 1, "hot_");
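If a regex is preferred, the same result can also be sketched as a single Regex.Replace call. The lookahead (?=[^/]*$) matches only a slash that has no further slash after it, that is, the last one; the variable names here are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string url = "http://blog.loc/blog/news/sport/slug1_slug2_slug3-slug";

// Replace the last "/" (the one followed only by non-slash characters
// up to the end of the string) with "/hot_".
string result = Regex.Replace(url, @"/(?=[^/]*$)", "/hot_");

Console.WriteLine(result);
// http://blog.loc/blog/news/sport/hot_slug1_slug2_slug3-slug
```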
This is my URL, which contains four query string parameters (desc, url, img, title):
http://localhost:4385/Default?desc=Home%20Page&url=http://localhost:4385/&img=http://localhost:4385/images/ribbon-img.png&title=
I read the query string values like this:
string title = Request.QueryString["desc"];
string pageurl = Request.QueryString["url"];
string alttext = Request.QueryString["title"];
string imageurl = Request.QueryString["img"];
The output I get is:
title=Home Page&url=http://localhost:4385/&img=http://localhost:4385/images/ribbon-img.png&title="
The entire rest of the URL ends up in the first query string value, which is not what I expected.
I expect a separate value for each query string variable.
Can anyone please help me?
I think the URL format is incorrect: a slash (/) inside a query string value should be sent as %2F, but your URL does not encode it.
Update:
Response.Redirect("http://localhost:4385/Default?desc=Home%20Page&url="+Uri.EscapeDataString("http://localhost:4385/")+"&img="+Uri.EscapeDataString("http://localhost:4385/images/ribbon-img.png")+"&title=");
The problem is that you are not creating the query string with proper encoding. The .NET Framework has the HttpUtility.ParseQueryString method to simplify this encoding problem. Try this code:
//are you sure your URL doesn't have an ".aspx" extension?
var url = "http://localhost:4385/Default.aspx?";
var queryString = System.Web.HttpUtility.ParseQueryString(string.Empty);
queryString["desc"] = "Home Page";
queryString["url"] = "http://localhost:4385/";
queryString["image"] = "http://localhost:4385/images/ribbon-img.png";
queryString["title"] = "";
Response.Redirect(url + queryString.ToString());
Now the query string will look like this:
var urlWithQueryString = "http://localhost:4385/Default.aspx?desc=Home+Page&url=http%3a%2f%2flocalhost%3a4385%2f&image=http%3a%2f%2flocalhost%3a4385%2fimages%2fribbon-img.png&title=";
Now parsing can be done using the method you tried:
string title = Request.QueryString["desc"];
string pageurl = Request.QueryString["url"];
string alttext = Request.QueryString["title"];
string imageurl = Request.QueryString["image"]; // the key must match the one set above ("image", not "img")
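The encode-then-parse round trip can be checked outside a web request. The sketch below is a stand-alone illustration, not the page code: it builds the query with Uri.EscapeDataString and reads it back with HttpUtility.ParseQueryString in place of Request.QueryString. It keeps the "img" key from the question and assumes System.Web.HttpUtility is available (on .NET Framework this needs a reference to System.Web).

```csharp
using System;
using System.Web;

// Encode each value so that characters such as ':' and '/' cannot be
// confused with the structure of the outer URL.
string query =
    "desc=" + Uri.EscapeDataString("Home Page") +
    "&url=" + Uri.EscapeDataString("http://localhost:4385/") +
    "&img=" + Uri.EscapeDataString("http://localhost:4385/images/ribbon-img.png") +
    "&title=" + Uri.EscapeDataString("");

string fullUrl = "http://localhost:4385/Default?" + query;
Console.WriteLine(fullUrl);

// Parse the query back; values are decoded automatically.
var parsed = HttpUtility.ParseQueryString(query);
Console.WriteLine(parsed["desc"]);  // Home Page
Console.WriteLine(parsed["url"]);   // http://localhost:4385/
Console.WriteLine(parsed["img"]);   // http://localhost:4385/images/ribbon-img.png
Console.WriteLine(parsed["title"] == ""); // True
```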
I have two strings:
string url = HttpContext.Current.Request.Url.AbsoluteUri;
//give me :
//url = http://localhost:1302/TESTERS/Default6.aspx?tabindex=2&tabid=15
And:
string path = HttpContext.Current.Request.Url.AbsolutePath;
//give me:
//path = /TESTERS/Default6.aspx
Now I want to get the string:
http://localhost:1302
So what I am thinking of is I will find the position of path in url and remove the sub-string from this position in url.
What I tried:
string strApp = url.Remove(url.First(path));
or
string strApp = url.Remove(url.find_first_of(path));
but I can't find the right way to express this idea. How can I achieve my goal?
So basically you want the URL, from the start up to the beginning of your path.
You don't need to "remove" that part, only take characters up to that precise point. First, you can get that location with a simple IndexOf as it returns the position of the first character that matches your string. After this, simply take the part of url that goes from 0 to that index with Substring.
string url = "http://localhost:1302/TESTERS/Default6.aspx?tabindex=2&tabid=15";
string path = "/TESTERS/Default6.aspx";
int indexOfPath = url.IndexOf(path);
string strApp = url.Substring(0, indexOfPath); // gives http://localhost:1302
Which you can shorten to
string strApp = url.Substring(0, url.IndexOf(path));
You can also do something like the code below to get the host of the URI:
Uri uri = HttpContext.Current.Request.Url;
string host = uri.Authority; // host and port, e.g. "localhost:1302"
Here is another option.. this doesn't require any string manipulation:
new Uri(HttpContext.Current.Request.Url, "/").AbsoluteUri
It generates a new Uri by resolving the path "/" relative to the original URL. Note that the result keeps a trailing slash: http://localhost:1302/
You should just use this instead:
string baseURL = HttpContext.Current.Request.Url.Scheme + "://" +
HttpContext.Current.Request.Url.Authority;
This should not be solved using string manipulation. HttpContext.Current.Request.Url returns a Uri object, which can already provide the information you need.
var requestUrl = HttpContext.Current.Request.Url;
var result = requestUrl.GetComponents(UriComponents.SchemeAndServer,
UriFormat.Unescaped);
// result = "http://localhost:1302"
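The Uri-based options from the answers above can be compared side by side in a stand-alone sketch. It uses a hard-coded Uri in place of HttpContext.Current.Request.Url, and it also shows GetLeftPart, which is not mentioned in the answers but is another standard way to get the same value.

```csharp
using System;

// Stand-in for HttpContext.Current.Request.Url.
var requestUrl = new Uri("http://localhost:1302/TESTERS/Default6.aspx?tabindex=2&tabid=15");

// Scheme plus authority, three equivalent ways.
string viaComponents = requestUrl.GetComponents(UriComponents.SchemeAndServer, UriFormat.Unescaped);
string viaConcat = requestUrl.Scheme + Uri.SchemeDelimiter + requestUrl.Authority;
string viaLeftPart = requestUrl.GetLeftPart(UriPartial.Authority);

// Resolving "/" against the request URL keeps a trailing slash.
string viaRoot = new Uri(requestUrl, "/").AbsoluteUri;

Console.WriteLine(viaComponents); // http://localhost:1302
Console.WriteLine(viaConcat);     // http://localhost:1302
Console.WriteLine(viaLeftPart);   // http://localhost:1302
Console.WriteLine(viaRoot);       // http://localhost:1302/
```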
I have the following code which works fine but i need to replace the site address with a variable...
string url = HttpContext.Current.Request.Url.AbsoluteUri; // Get the URL
bool match = Regex.IsMatch(url, @"(^|\s)http://www.mywebsite.co.uk/index.aspx(\s|$)");
I have tried the following but it doesn't work, any ideas???
string url = HttpContext.Current.Request.Url.AbsoluteUri; // Get the URL
string myurl = "http://www.mywebsite.co.uk/index.aspx";
bool match = Regex.IsMatch(url, @"(^|\s)"+myurl+"(\s|$)");
You are missing an @:
bool match = Regex.IsMatch(url, @"(^|\s)" + myurl + @"(\s|$)");
The reason you need the extra @ is that @ applies only to the string literal immediately following it, not to the rest of the line.
You should also consider escaping your URL:
bool match = Regex.IsMatch(url, @"(^|\s)" + Regex.Escape(myurl) + @"(\s|$)");
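To see why escaping matters, consider the dots in the URL: unescaped, each one matches any character. The sketch below uses a made-up lookalike address (the lookalike variable is purely illustrative) to show the difference.

```csharp
using System;
using System.Text.RegularExpressions;

string myurl = "http://www.mywebsite.co.uk/index.aspx";

// Hypothetical lookalike: the first '.' has been replaced by 'X'.
string lookalike = "http://wwwXmywebsite.co.uk/index.aspx";

// Without escaping, '.' in the pattern matches any character, including 'X'.
bool loose = Regex.IsMatch(lookalike, @"(^|\s)" + myurl + @"(\s|$)");

// With Regex.Escape, '.' only matches a literal dot.
bool strict = Regex.IsMatch(lookalike, @"(^|\s)" + Regex.Escape(myurl) + @"(\s|$)");

Console.WriteLine(loose);  // True
Console.WriteLine(strict); // False

// The real URL still matches the escaped pattern.
Console.WriteLine(Regex.IsMatch(myurl, @"(^|\s)" + Regex.Escape(myurl) + @"(\s|$)")); // True
```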