URL Rewriting from Winforms or console application - c#

I'm importing classic ASP pages into a new Sitefinity installation. Unfortunately, the existing site makes extensive use of URL rewriting via Helicon ISAPI Rewrite 3.
I'm generating the list of pages that need to be imported by crawling the navigation menus in the old site. These are, unfortunately, not dynamically generated from any sort of central repository, so the best way I've found to build the site hierarchy is to crawl the site.
When creating page nodes in the Sitefinity nav hierarchy to hold the content from the old pages, I need to be able to create the new pages at a location roughly equivalent to their location in the file system in the old site. However, the rewrite rules make this difficult to determine. For instance, parsing the old HTML may give me a link like:
http://www.mysite.com/product_name
which is rewritten (not redirected) to
http://www.mysite.com/products/product_name/product_root.asp
I need a way to get the second url from the first. The first thing that comes to mind is to somehow use the .htaccess file to parse the URLs, get the result and use that for the rest of the import process.
Is there a way to do this from a Winforms app without having to involve a web server? I realize that I could modify one of the ASP includes, such as the page footer, to emit a comment containing the rewritten URL of each page, but I'd rather not make unnecessary changes to the existing code if it can be avoided.
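One way to do this without a web server is to apply the rules yourself from the import tool: ISAPI Rewrite 3 uses mod_rewrite-style RewriteRule lines, which are just regular expressions plus substitutions. A rough sketch of that idea, assuming a simple rules file and ignoring RewriteCond lines, flags and the rest of the syntax:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

class RewriteRule
{
    public Regex Pattern;        // the pattern part of a RewriteRule line
    public string Substitution;  // the substitution part, with $1-style back-references

    // Reads "RewriteRule <pattern> <substitution> [flags]" lines from an
    // ISAPI Rewrite / mod_rewrite style config. Everything else (RewriteCond,
    // RewriteMap, flags) is ignored in this sketch.
    public static List<RewriteRule> Load(string configPath)
    {
        return File.ReadAllLines(configPath)
            .Select(line => line.Trim())
            .Where(line => line.StartsWith("RewriteRule ", StringComparison.OrdinalIgnoreCase))
            .Select(line => line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
            .Where(parts => parts.Length >= 3)
            .Select(parts => new RewriteRule
            {
                Pattern = new Regex(parts[1], RegexOptions.IgnoreCase),
                Substitution = parts[2]
            })
            .ToList();
    }

    // Applies the first matching rule to a path such as "/arm/" and returns
    // the rewritten path, e.g. "/products/arm/mdk.asp".
    public static string Apply(List<RewriteRule> rules, string path)
    {
        foreach (var rule in rules)
        {
            if (rule.Pattern.IsMatch(path))
                return rule.Pattern.Replace(path, rule.Substitution, 1);
        }
        return path; // no rule matched; the URL is not rewritten
    }
}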
Update
For example,
http://www.keil.com/arm/
rewrites to
http://www.keil.com/products/arm/mdk.asp

Serve canned offline web content using the Web Browser control

I'm developing a C# replacement for a legacy VB app for my company. The front end is basically a Web Browser control inside of a Windows form, serving offline content which is sometimes altered to include the user's data. Because there are 100 or more web files in the legacy app, we are going to reuse the web UI from the old application with a new C# wrapper around it, modifying them as needed.
My questions are about how to store and deliver the web content.
Does it make sense to copy the web files to a temporary folder and point the Web Browser control to the file:// address of the temporary folder?
Is there some kind of pre-built offline-friendly server framework that makes more sense than copying the files to a temporary folder?
I have the web source files in my project as resources, but I'm not sure if that is appropriate for my uses. Is it?
The legacy VB implementation alters the web files to inject data using Substring methods; it searches for magic strings and replaces them with the appropriate data. That code smells pretty bad. Is there a better, more native data injection strategy I should look at?
Some background:
The data is presented using HTML/CSS/JS and also sometimes XSL.
The browser delivers content that is available at compile time.
I'm going to have to handle some events using C# code when users click buttons on the page.
I'm free to choose whatever approach is necessary to implement the application.
Hosting
I would probably avoid using a temporary location for the web content; it just seems a little crude. If there is no internal linking between your HTML pages and all the CSS/JS is embedded in one file, it may be easier to just use the WebBrowser.DocumentText property.
Another option I have used successfully as a lightweight embedded web server is logv-http; it has a pretty easy-to-configure syntax. Listening on anything other than localhost requires administrator privileges, but it sounds like everything here will be local.
var server = new Server("localhost", 13337);
server.Get("http://localhost:13337" ,(req, res) => res.Write("Hello World!"));
server.Start();
Templating
I don't think the string replaces are necessarily bad; it depends how many there are and how complicated they are trying to be, but a simple find-and-replace shouldn't be too hard to manage. If there are a lot of replaces, wrapping them into a single Regex should help performance.
Storing the web content as embedded resources is probably how I would go; that way you can read the files out at run time, do your pre-processing, and then return them either via the web server method or directly into DocumentText.
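A rough sketch of that combination, assuming the HTML files are embedded resources and the data markers look like {{UserName}} (the resource name and marker convention are only placeholders):
using System.Collections.Generic;
using System.IO;
using System.Reflection;

static class OfflineContent
{
    // Reads an HTML file that was compiled in as an embedded resource.
    // "MyApp.Web.report.html" is a placeholder name; the real name depends
    // on your default namespace and folder structure.
    public static string ReadResource(string resourceName)
    {
        var assembly = Assembly.GetExecutingAssembly();
        using (var stream = assembly.GetManifestResourceStream(resourceName))
        using (var reader = new StreamReader(stream))
            return reader.ReadToEnd();
    }

    // Very simple token templating: replaces {{Key}}-style markers with values.
    public static string FillTemplate(string html, IDictionary<string, string> values)
    {
        foreach (var pair in values)
            html = html.Replace("{{" + pair.Key + "}}", pair.Value);
        return html;
    }
}

// In the form:
// var html = OfflineContent.ReadResource("MyApp.Web.report.html");
// html = OfflineContent.FillTemplate(html, new Dictionary<string, string> { { "UserName", userName } });
// webBrowser1.DocumentText = html;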

Migrating Public Website Slowly

I am currently working on a website that was built in C# around 2003, using server controls, JavaScript without any modern libraries, no data access layer, and plenty of spaghetti code.
We have decided that, due to the sheer size of the web site, we will have to migrate the web pages a piece at a time.
The problem is that we have links, navigation and menus that need to point from the old domain, where the legacy pages live, to the new domain, where our clean, greenfield MVC 4 and Bootstrap rewrites of those pages are being created. The new pages will also have links, navigation and menus that have to point back to the old site.
I know I can create 302 redirects, and I can even use URL rewriting.
My concern is that all developers will need to keep track of links in both the massive legacy website and the new website, and update the URLs manually.
Is there a simple way of migrating a website slowly?
Is there an approach I should research to handling this?
Should I stop sniveling and just tell everyone on my team to keep track of the links as they go along, and use something like wget on the legacy site to find all the links?
I would create a central repository for all the links (an XML file would do nicely) that both the new and legacy sites refer to for their URLs.
Yes, you would need to change all the links in both the new and legacy sites to use this repository, but the upside is that once a page has been migrated you can just change its URL in the repository and all the links on both sites pick up the change.
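A minimal sketch of that idea; the file name, keys and URLs below are just placeholders:
using System;
using System.Linq;
using System.Xml.Linq;

// links.xml, shared by both sites, might look like:
// <links>
//   <link key="Checkout"      url="http://legacy.example.com/checkout.aspx" />
//   <link key="ProductSearch" url="http://new.example.com/products/search" />
// </links>
public static class LinkRepository
{
    // In a web app you would resolve the path properly, e.g. with
    // HostingEnvironment.MapPath("~/App_Data/links.xml").
    private static readonly XDocument Doc = XDocument.Load("links.xml");

    public static string Url(string key)
    {
        var link = Doc.Root.Elements("link")
            .FirstOrDefault(e => (string)e.Attribute("key") == key);
        if (link == null)
            throw new ArgumentException("Unknown link key: " + key);
        return (string)link.Attribute("url");
    }
}

// Legacy WebForms page:  <a href="<%= LinkRepository.Url("Checkout") %>">Checkout</a>
// New MVC 4 Razor view:  <a href="@LinkRepository.Url("Checkout")">Checkout</a>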

Clone intranet site, but replace content pane with own content

I'm working at a small company within a rather large company, where I don't really have control over our intranet. I have built a little site/page, and I want it to style exactly like the intranet pages.
I know I can download the stylesheets and start hacking away, but I need the links and the menus to stay up to date.
I'm working with asp.net mvc 2 here, but I've no idea how to go further from here. Thoughts?
You will need to copy the CSS etc.
About the menu, you will need to do the following: use WebRequest to fetch the page, use the Html Agility Pack to parse it, and use XPath to pull out the relevant markup. I would recommend caching the result.
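A minimal sketch of that, using WebClient (a thin wrapper over WebRequest) for the fetch, the Html Agility Pack for parsing, and the ASP.NET cache; the intranet URL and the //div[@id='menu'] XPath are placeholders you would adjust after inspecting the real markup:
using System;
using System.Net;
using System.Web;
using System.Web.Caching;
using HtmlAgilityPack;

public static class IntranetMenu
{
    // Fetches the intranet home page, pulls the menu markup out with XPath,
    // and caches it for an hour so the intranet isn't hit on every request.
    public static string GetMenuHtml()
    {
        var cached = HttpRuntime.Cache["intranet-menu"] as string;
        if (cached != null)
            return cached;

        using (var client = new WebClient())
        {
            string page = client.DownloadString("http://intranet.example.com/");

            var doc = new HtmlDocument();
            doc.LoadHtml(page);

            var menuNode = doc.DocumentNode.SelectSingleNode("//div[@id='menu']");
            string menuHtml = menuNode != null ? menuNode.OuterHtml : string.Empty;

            HttpRuntime.Cache.Insert("intranet-menu", menuHtml, null,
                DateTime.UtcNow.AddHours(1), Cache.NoSlidingExpiration);
            return menuHtml;
        }
    }
}
In an MVC 2 aspx view you could then render it with <%= IntranetMenu.GetMenuHtml() %>.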

Home/Landing screen design for a website in asp.net

I have a web based application. The content for the Home page is currently written directly into the page's HTML markup, so to change the content at any time in the future, the HTML code itself has to be changed. :(
Is there a way we can pick the content up from some external place and have it reflected on the website? That way, any change required can be made at the external location without touching the application's code.
Please advise if there is any solution for it.
Thanks.
You can:
Use a database
Include external files using Server Side Includes
Read external files and write out their contents at run time
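For the last option, a minimal sketch assuming a WebForms page with an asp:Literal control (called HomeContent here) and an assumed location for the editable file:
using System;
using System.IO;
using System.Web.UI;

public partial class Home : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // ~/Content/home.html is the externally editable file;
        // HomeContent is an <asp:Literal> placed where the copy should appear.
        string path = Server.MapPath("~/Content/home.html");
        HomeContent.Text = File.ReadAllText(path);
    }
}
Anyone can then edit home.html without touching the application's code.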
Sounds like you're looking for a Content Management System (CMS), which will allow your content editors access to modify only specific blocks of a page that you specify.
There are a ton out there to do what you want, so you don't have to start from scratch. Just Google 'CMS'.
Although I haven't used it myself, DotNetNuke is a popular one these days and has a free version.

C# code for saving an entire web page? (with images/formatting)

I've been struggling to find an example of some C# code (I'm using C# Visual Studio 2008 Express) that can programmatically save an entire web page (given a URL) including the images and formatting (e.g. CSS). The intention is that in a subsequent phase I'd ship this off (not sure how yet) so it could be viewed later via a browser.
Is there an example of the most simple approach (leveraging the .NET Framework methods) to save an entire web page? Saving as one page with a subdirectory for images, or otherwise. Basically the same as what you get with browsers when you say "save entire web page".
The simplest way is probably to add a WebBrowser Control to your application and point it at the page you want to save using the Navigate() method.
Then, when the document has loaded, call the ShowSaveAsDialog method. The user can then save the page as a single file, or a file with images in a subdirectory.
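A minimal sketch, assuming a form with a WebBrowser control named webBrowser1:
private void SavePage(string url)
{
    // DocumentCompleted fires once the page has finished loading
    // (and once per frame on framed pages).
    webBrowser1.DocumentCompleted += (s, e) => webBrowser1.ShowSaveAsDialog();
    webBrowser1.Navigate(url);
}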
[Update]
Having now noticed "programmatically" in your question, the above approach is not ideal as it requires either user involvement or delving into the Windows API to send input using SendKeys or similar.
There is nothing built-in to the .NET Framework that does all of what you ask.
So my approach revised would be:
Use System.Net.HttpWebRequest to get the main HTML document as a string or stream (easy).
Load this into an HtmlAgilityPack document, where you can now easily query the document to get lists of all image elements, stylesheet links, etc.
Then make a separate web request for each of these files and save them to a subdirectory.
Finally update all relevant links in the main page to point to the items in the subdirectory.
In effect you would be implementing a very simple web browser. You may run into issues with pages that use JavaScript to dynamically alter or request page content, but for most pages this should give acceptable results.
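A rough sketch of those steps, using WebClient for brevity rather than HttpWebRequest, handling only <img> elements, and with no error handling; stylesheets and scripts would be handled the same way:
using System;
using System.IO;
using System.Net;
using HtmlAgilityPack;

class PageSaver
{
    // Downloads a page and its images into outputDir, rewriting the <img>
    // src attributes to point at the local copies.
    public static void Save(string url, string outputDir)
    {
        Directory.CreateDirectory(Path.Combine(outputDir, "assets"));

        using (var client = new WebClient())
        {
            string html = client.DownloadString(url);

            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            var images = doc.DocumentNode.SelectNodes("//img[@src]");
            if (images != null)
            {
                foreach (var img in images)
                {
                    string src = img.GetAttributeValue("src", "");
                    var absolute = new Uri(new Uri(url), src);   // resolve relative URLs
                    string localName = Path.Combine("assets", Path.GetFileName(absolute.LocalPath));

                    client.DownloadFile(absolute, Path.Combine(outputDir, localName));
                    img.SetAttributeValue("src", localName.Replace('\\', '/'));
                }
            }

            doc.Save(Path.Combine(outputDir, "page.html"));
        }
    }
}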
From CodeProject: ZetaWebSpider
It's definitely not elegant, but you could navigate a System.Windows.Forms.WebBrowser to the URL and then call its ShowSaveAsDialog() method to save the page.
