Find extension of file from URL - c#

I want to find the extension of the file behind the current URL.
The URL I am using is
http://tol.imagingtech.com/eIL/cgi-bin/eIL.pl?ACTION=get_trans_doc&DOC=ALL&CAB_NAME=TOL_AP_2012&WHSE=18&PAYEE=23003655&INV=01235770
This file's extension is .pdf; how can I get that? Sometimes it will be .doc, .txt, or .jpeg, so I need the exact one.
This is the code I use to retrieve the file extension:
var extFile = Document.DocumentFilePath.Split('.');
return "Backup document." + extFile[extFile.Length-1].Trim().ToLower();
It works fine for a normal local path, but it fails to retrieve the extension when DocumentFilePath is a URL.

I don't think there is a way to get the file type without actually requesting the file.
You can read it from the response headers once your request has completed, for example:
Content-Type: image/jpeg
You can do it in C# using WebClient
WebClient client = new WebClient();
var url = "http://tol.imagingtech.com/eIL/cgi-bin/eIL.pl?ACTION=get_trans_doc&DOC=ALL&CAB_NAME=TOL_AP_2012&WHSE=18&PAYEE=23003655&INV=01235770";
string data = client.DownloadString(url);
// Headers holds the request headers; the server's reply is in ResponseHeaders.
string contentType = client.ResponseHeaders["Content-Type"];

Do a HEAD request to the URL and take a look at the Content-Disposition: attachment; filename=FILENAME header if that's being used.
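
A minimal sketch of that idea, assuming the server actually sends a Content-Disposition header with a filename (many do not). The class and method names here are hypothetical; the parsing relies on ContentDispositionHeaderValue from System.Net.Http.Headers.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class ContentDispositionSketch
{
    // Returns the extension (e.g. ".pdf") named in a Content-Disposition
    // header value, or null when the header carries no usable filename.
    public static string ExtensionFromContentDisposition(string headerValue)
    {
        if (!ContentDispositionHeaderValue.TryParse(headerValue, out var disposition))
            return null;

        // Prefer the filename* form when present; fall back to plain filename.
        string name = disposition.FileNameStar ?? disposition.FileName;
        if (string.IsNullOrEmpty(name))
            return null;

        // The plain filename parameter may still carry its surrounding quotes.
        string ext = Path.GetExtension(name.Trim('"'));
        return string.IsNullOrEmpty(ext) ? null : ext.ToLowerInvariant();
    }

    // Sends a HEAD request so that only the headers are transferred.
    public static async Task<string> ExtensionFromUrlAsync(HttpClient client, string url)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Head, url))
        using (var response = await client.SendAsync(request))
        {
            var disposition = response.Content.Headers.ContentDisposition;
            return disposition == null
                ? null
                : ExtensionFromContentDisposition(disposition.ToString());
        }
    }
}
```

Some servers reject HEAD or answer it differently from GET, so a fallback to a normal GET request may be needed.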

To find the content type, take it from the Http response as follows:
byte[] myDataBuffer = webClient.DownloadData(fileAbsoluteUrl);
string contentType = webClient.ResponseHeaders["Content-Type"];
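
Once you have the Content-Type value, you still need to turn it into an extension. A small sketch of one way to do that follows; the lookup table is an assumption covering only the types mentioned in the question, and the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class MimeSketch
{
    // Illustrative table only; extend it with the types your server returns.
    private static readonly Dictionary<string, string> Known =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["application/pdf"] = ".pdf",
            ["application/msword"] = ".doc",
            ["text/plain"] = ".txt",
            ["image/jpeg"] = ".jpeg",
        };

    // Maps a Content-Type header value to an extension, or null if unknown.
    public static string ExtensionFromContentType(string contentType)
    {
        if (string.IsNullOrWhiteSpace(contentType))
            return null;

        // Drop parameters such as "; charset=utf-8" before the lookup.
        string mediaType = contentType.Split(';')[0].Trim();
        return Known.TryGetValue(mediaType, out var ext) ? ext : null;
    }
}
```

A generic type such as application/octet-stream says nothing about the real format, so this lookup returns null for it.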

Related

Downloading file with UTF-8 (Thai language) filename in ASP.net core MVC

I have created a function that downloads a file, and it works properly except for the filename. When I download a file with a Thai name, the name turns into HTML entities.
For example:
Original filename: ไฟล์_1_ข้อมูลลูกค้า__Customer_Information_.xml
Saved filename: "ไฟล์_1_ข้อมูลลูกค้า__Customer_Information_.xml".
How can I save the file with its original name? Here is my code:
[HttpGet]
public async Task<IActionResult> DownloadOtherFile(string id, string filename)
{
    string trueFileName = HttpUtility.HtmlDecode(filename);
    var path = Path.Combine(Directory.GetCurrentDirectory(), "wwwroot", filename);
    HttpClient client = new HttpClient
    {
        BaseAddress = new Uri(option.ApiBaseUrl)
    };
    try
    {
        client.DefaultRequestHeaders.Accept.Clear();
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", HttpContext.Session.GetString("token"));
        HttpResponseMessage response = await client.GetAsync("/file/other/job/" + id + "/" + trueFileName);
        var content = await response.Content.ReadAsStreamAsync();
        return File(content, "APPLICATION/octet-stream", trueFileName);
    }
    catch (Exception)
    {
        //error
    }
    finally
    {
        client.Dispose();
    }
    return null;
}
Long story short, header values accept only ISO-8859-1 characters so non-ASCII characters are always encoded. The client should be able to recognize the encoding. If it doesn't, it's a problem with the client. ASP.NET Core does follow the standard as the source code shows.
The Content-Type header doesn't accept a filename. The filename is specified in the Content-Disposition header.
The correct way to return a file with a specific name is to use the File(Stream,string,string) method, or pass the file name as part of the Content-Disposition header.
You should replace:
return File(content, "APPLICATION/octet-stream, trueFileName");
with :
return File(content, "application/octet-stream", trueFileName);
Update
I can't reproduce any problem. I created a new ASP.NET Core MVC application whose Home/Index method is just :
public IActionResult Index()
{
    var trueFileName = "ไฟล์_1_ข้อมูลลูกค้า__Customer_Information_.xml";
    var bytes = Encoding.UTF8.GetBytes("Hello");
    return File(bytes, "APPLICATION/octet-stream", trueFileName);
}
Browsing to that URL downloaded a file whose name is
ไฟล์_1_ข้อมูลลูกค้า__Customer_Information_.xml
I tested this with Chrome, Firefox and Edge Chromium.
The name wasn't HTML encoded, it was percent-encoded the way it should be. The raw HTTP response was :
HTTP/1.1 200 OK
Date: Tue, 06 Aug 2019 15:13:54 GMT
Content-Type: APPLICATION/octet-stream
Server: Kestrel
Content-Length: 5
Content-Disposition: attachment; filename=_____1_______________Customer_Information_.xml; filename*=UTF-8''%E0%B9%84%E0%B8%9F%E0%B8%A5%E0%B9%8C_1_%E0%B8%82%E0%B9%89%E0%B8%AD%E0%B8%A1%E0%B8%B9%E0%B8%A5%E0%B8%A5%E0%B8%B9%E0%B8%81%E0%B8%84%E0%B9%89%E0%B8%B2__Customer_Information_.xml
Hello
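
To make the filename* value in that response easier to read, here is a rough sketch of how such a value can be produced and decoded. It uses Uri.EscapeDataString as an approximation of the encoding; this is not ASP.NET Core's own implementation, and the class and method names are hypothetical.

```csharp
using System;

public static class FileNameStarSketch
{
    private const string Prefix = "UTF-8''";

    // Percent-encodes the UTF-8 bytes of the name and adds the charset prefix.
    public static string Encode(string fileName)
    {
        return Prefix + Uri.EscapeDataString(fileName);
    }

    // Reverses Encode; only the UTF-8 form is handled in this sketch.
    public static string Decode(string value)
    {
        if (!value.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            throw new FormatException("Only the UTF-8'' form is handled here.");
        return Uri.UnescapeDataString(value.Substring(Prefix.Length));
    }
}
```

For example, Encode("ไฟล์_1.xml") yields UTF-8''%E0%B9%84%E0%B8%9F%E0%B8%A5%E0%B9%8C_1.xml, which matches the start of the header value above. A browser that understands filename* decodes it back to the Thai name.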
Update 2
The filename is passed from the client as a URL parameter and is probably encoded twice. That's the only way HtmlDecode would produce another HTML encoded string.
This call :
var fileName="ไฟล์_1_ข้อมูลลูกค้า__Customer_Information_.xml";
var actualName=HttpUtility.HtmlDecode(fileName);
Returns :
ไฟล์_1_ข้อมูลลูกค้า__Customer_Information_.xml
I suspect the string was encoded twice.
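
A small sketch of what double encoding looks like, using a single Thai character. The sample strings are made up for illustration and the helper name is hypothetical. One HtmlDecode pass over a doubly encoded value still leaves an entity behind, and a second pass is needed to reach the real character.

```csharp
using System;
using System.Net;

public static class HtmlDecodeSketch
{
    // Decodes repeatedly until the value stops changing (or maxPasses is hit).
    public static string DecodeFully(string value, int maxPasses = 5)
    {
        for (int i = 0; i < maxPasses; i++)
        {
            string decoded = WebUtility.HtmlDecode(value);
            if (decoded == value)
                break;
            value = decoded;
        }
        return value;
    }
}
```

Here WebUtility.HtmlDecode("&amp;#3652;.xml") returns "&#3652;.xml", and decoding that again returns "ไ.xml". Decoding in a loop hides the symptom, though; the better fix is to find where the second encoding happens and remove it.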
To troubleshoot such problems a quick first step is to use a debugging proxy like Fiddler to capture and inspect HTTP requests and responses on the fly. This doesn't require any changes on the server or client.
A similar tool is the Network tab in the Developer Tools of all modern browsers.
Another option is to change the web application's logging configuration to log the requests and responses. This should only be done while testing or investigating issues though, as it produces a lot of text and slows down the application.

WebRequest response content

I'm trying to read the response content of a given URL using HttpWebRequest:
var targetUri = new Uri("http://www.foo.com/Message/CheckMsg?msg=test");
var webRequest = (HttpWebRequest)WebRequest.Create(targetUri);
var webRequestResponse = webRequest.GetResponse();
The above code always returns the content of the home page (http://www.foo.com). I was expecting the content of the http://www.foo.com/Message page. Is something wrong, or am I missing something?
Is CheckMsg an HTML or PHP file? When I access websites using WebRequest I always have to include the extension; otherwise the website treats the path as a folder. I would recommend trying to add it:
var targetUri = new Uri("http://www.foo.com/Message/CheckMsg.html?msg=test");

Redirect url to obtain the direct url

I wrote an application that parses an HTML document and extracts some URLs. The problem is that the URLs can only be downloaded directly from the browser.
In VB.NET or C#, how can I follow the redirect for such a URL to obtain a direct link that I can later paste into a download manager?
dim url as string = "http://m.mrtzcmp3.net/get.php?singer=Madonna&song=Like%20A%20Virgin%20&size=5242104&ids=687474703a2h2h63733434303876342g766s2g6f652h75323237363831362h617564696h732h3132323564303466333839622g6f7033"
I should say that I don't have much experience with HTTP, so I may be wrong and the URL may not redirect at all. Please tell me how I can resolve this kind of URL, or whether I'm mistaken.
UPDATE:
Tried this, but I get the same url without any changes:
Dim url As String = _
"http://m.mrtzcmp3.net/get.php?singer=Madonna&song=Like%20A%20Virgin%20&size=5242104&ids=687474703a2h2h63733434303876342g766s2g6f652h75323237363831362h617564696h732h3132323564303466333839622g6f7033"
Dim request As HttpWebRequest = DirectCast(HttpWebRequest.Create(url), HttpWebRequest)
request.AllowAutoRedirect = True
Dim response As HttpWebResponse
Dim resUri As String
response = request.GetResponse
resUri = response.ResponseUri.AbsoluteUri
MsgBox(resUri)
UPDATE 2:
The answer to the question "HttpWebRequest Login data Then Redirect" says:
If the redirect is handled transparently, the _response.ResponseURI
will contain the address it redirected to. If not, you have to read
the redirect header and decide yourself whether or not to request the
new page.
So, if I need to do that, how can I do it?
UPDATE 3:
The DownloadThemAll plugin for Firefox can obtain the direct URLs. As you can see, all of the URLs end with an .mp3 file extension, which is what I need.
To my knowledge, the url
http://m.mrtzcmp3.net/get.php?singer=Madonna&song=Like%20A%20Virgin%20&size=5242104&ids=687474703a2h2h63733434303876342g766s2g6f652h75323237363831362h617564696h732h3132323564303466333839622g6f7033
IS the direct URL; a direct file URL does not need to contain the file type.
You can download the file using
string url = "http://m.mrtzcmp3.net/get.php?singer=Madonna&song=Like%20A%20Virgin%20&size=5242104&ids=687474703a2h2h63733434303876342g766s2g6f652h75323237363831362h617564696h732h3132323564303466333839622g6f7033";
WebClient wc = new WebClient();
wc.DownloadFile(url, fileName);
You can get the fileName (Madonna-Like A Virgin -www.mrtzcmp3.net.mp3) by using
HttpWebRequest myHttpWebRequest = (HttpWebRequest)HttpWebRequest.Create(url);
// The original snippet never created the response object it reads from.
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
string header = myHttpWebResponse.Headers.ToString();
// Skip past filename=" (9 characters plus the opening quote).
fileName = header.Remove(0, header.IndexOf("filename=") + 10);
fileName = fileName.Remove(fileName.IndexOf('"'));
That is untested, but it should work.
edit: I think this does what you want, but I may have misunderstood your question
You can perform a web request using WebClient to get the content (the URL) from that URL; then you just need to follow the redirect.
Use an HttpWebRequest and use the AllowAutoRedirect=true to get the direct link and download the file.
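
If the automatic redirect does not expose the final address, you can follow the redirect by hand, as the quoted answer in the question suggests. The following is a hedged sketch using HttpClient rather than HttpWebRequest: it turns off automatic redirects and reads the Location header itself. The class and method names are hypothetical, and whether this particular site redirects at all is not confirmed.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RedirectSketch
{
    // A Location header may be relative; resolve it against the request URI.
    public static Uri ResolveLocation(Uri requestUri, Uri location)
    {
        return location.IsAbsoluteUri ? location : new Uri(requestUri, location);
    }

    // Returns the redirect target, or null if the server did not redirect.
    public static async Task<Uri> GetRedirectTargetAsync(string url)
    {
        var requestUri = new Uri(url);
        using (var handler = new HttpClientHandler { AllowAutoRedirect = false })
        using (var client = new HttpClient(handler))
        using (var response = await client.GetAsync(
            requestUri, HttpCompletionOption.ResponseHeadersRead))
        {
            int status = (int)response.StatusCode;
            Uri location = response.Headers.Location;
            if (status >= 300 && status < 400 && location != null)
                return ResolveLocation(requestUri, location);
            return null;
        }
    }
}
```

A null result means the server answered directly, which would agree with the answer above saying the original URL already is the direct one.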
Can you try pasting the URL into a URL shortener like TinyURL or Bitly? Maybe there is a shortener service that provides an API.
The file then will be downloaded at: http://tinyurl.com/phzhxsr
You will never get a direct URL from the site owner, because the URL is parsed dynamically and the file is sent in the returned data stream rather than downloaded from a specific URL.

Artifactory REST Deploy screwing up upload

I am trying to deploy artifacts to artifactory using their REST API, however all my files end up having
-------------------------------28947758029299 Content-Disposition: form-data; name="test.txt"; filename="new2.txt" Content-Type:
application/octet-stream
appended to the file. Here is my code (keep in mind this is only me testing the concept...the code will be cleaned after I get a success)
var uriString = "artifactoryuri";
var uri = new Uri(uriString);
var credentialCache = new CredentialCache{{uri, "Basic",new NetworkCredential("UN", "PW")}};
var restClient = new RestClient(uriString);
var restRequest = new RestRequest(Method.PUT){Credentials = credentialCache};
restRequest.AddFile("test.txt", @"pathto\new2.txt");
var restResponse = restClient.Execute(restRequest);
How can I fix this? Is it because it is a text file and artifactory tends to store executables and such? If so, I can live with that. This will be used to upload chm files currently.
This is caused by the AddFile method - RestSharp will create a multipart/form request by default. I could not find a good solution for preventing this behavior, although many people ask about it. You can take a look at Can RestSharp send binary data without using a multipart content type? and PUT method and send raw bytes
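
One way around the multipart wrapper is to skip AddFile and send the file bytes as the raw request body. The sketch below does this with HttpClient instead of RestSharp, which is a swapped-in technique and not the approach from the linked questions. The class name, method names, and parameters are hypothetical.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

public static class RawPutSketch
{
    // Wraps the bytes as a plain body; no multipart boundary is added.
    public static ByteArrayContent BuildRawBody(byte[] bytes)
    {
        var content = new ByteArrayContent(bytes);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        return content;
    }

    // PUTs the file's bytes to targetUrl using HTTP Basic authentication.
    public static async Task<HttpResponseMessage> PutFileAsync(
        string targetUrl, string localPath, string user, string password)
    {
        using (var client = new HttpClient())
        {
            string token = Convert.ToBase64String(
                Encoding.UTF8.GetBytes(user + ":" + password));
            client.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Basic", token);
            return await client.PutAsync(targetUrl, BuildRawBody(File.ReadAllBytes(localPath)));
        }
    }
}
```

Because the body contains only the file's bytes, the stored artifact should match the local file exactly, whether it is a text file or a chm file.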

pass local path to HttpWebRequest

I need to pass a local path to HttpWebRequest in C#. I have test.xml on my C: drive and I need to read that XML file through HttpWebRequest, but the line
HttpWebRequest rqst = (HttpWebRequest)HttpWebRequest.Create(Uri.EscapeUriString(urlServ))
throws the exception "Invalid URI: The Authority/Host could not be parsed."
My code:
string urlServ = "file:\\c:\\test.xml";
try
{
HttpWebRequest rqst = (HttpWebRequest)HttpWebRequest.Create(Uri.EscapeUriString(urlServ));
rqst.KeepAlive = false;
}
catch{}
I believe a file: URI is supposed to be created with forward-slashes, not back slashes. So, use this:
string urlServ = "file:///c:/test.xml";
I noticed when I typed it into my browser with backslashes, FF converted it to forward slashes for me.
You should use WebRequest.Create(uri) - this will automatically create the right object based on the URI type (e.g. file, http, etc). Now you can use the same code for real web pages or local test files.
I saw this in the documentation of FileWebRequest:
Do not use the FileWebRequest constructor. Use the WebRequest.Create
method to initialize new instances of the FileWebRequest class. If the
URI scheme is file://, the Create method returns a FileWebRequest
object.
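
A short sketch of that approach, assuming the file exists and is readable. The class and method names are hypothetical. The request is kept as the base WebRequest type, because casting a file: request to HttpWebRequest would fail. Newer .NET versions mark WebRequest as obsolete, hence the pragma.

```csharp
using System;
using System.IO;
using System.Net;

public static class FileUriSketch
{
    // Reads the whole resource behind a file: (or http:) URI as text.
    public static string ReadViaWebRequest(string uriString)
    {
#pragma warning disable SYSLIB0014 // WebRequest is obsolete on newer .NET versions
        WebRequest request = WebRequest.Create(new Uri(uriString));
#pragma warning restore SYSLIB0014
        using (WebResponse response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling ReadViaWebRequest("file:///c:/test.xml") would then return the XML text. Passing a local path to new Uri(path) and using its AbsoluteUri is another way to get the forward-slash form without writing it by hand.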
