I would like to create multiple cache profiles in my Web.config for different types of pages in my web application. For example, data forms do not change often, so I would increase the Cache-Control max-age. For other pages, requests should reach the proxy server or origin server more frequently, so the Cache-Control max-age and s-maxage values will be set differently.
I am able to set the max-age with no problem, but I am having difficulty finding solutions online for setting the s-maxage in the web.config.
<caching>
<outputCacheSettings>
<outputCacheProfiles>
<add name="Cache1Hour" duration="30" varyByParam="none" />
</outputCacheProfiles>
</outputCacheSettings>
</caching>
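For context, a profile like this is applied per action via the OutputCache attribute; note that OutputCache exposes Duration and Location but nothing that maps to s-maxage, which is exactly the gap described above (the action name below is hypothetical):

// Applying the profile above to an MVC action. The profile's duration
// becomes Cache-Control: max-age, but no profile setting emits s-maxage.
[OutputCache(CacheProfile = "Cache1Hour")]
public ActionResult DataForm()
{
    return View();
}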
My last resort is to manually add max-age and s-maxage in the Action methods:
Response.Cache.SetMaxAge(maxAge);
Response.Cache.SetProxyMaxAge(proxyMaxAge);
Any advice on the best way to do this? Ideally, I would like to use an attribute to decorate my Action methods.
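For what it's worth, here is a minimal sketch of such an attribute, assuming ASP.NET MVC; the name ProxyCacheAttribute and its properties are my own invention, not a framework feature:

using System;
using System.Web;
using System.Web.Mvc;

// Hypothetical attribute: emits both max-age and s-maxage so different
// page types can be decorated with different cache lifetimes.
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public class ProxyCacheAttribute : ActionFilterAttribute
{
    public int MaxAgeSeconds { get; set; }
    public int SharedMaxAgeSeconds { get; set; }

    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        var cache = filterContext.HttpContext.Response.Cache;
        cache.SetCacheability(HttpCacheability.Public);
        cache.SetMaxAge(TimeSpan.FromSeconds(MaxAgeSeconds));            // Cache-Control: max-age
        cache.SetProxyMaxAge(TimeSpan.FromSeconds(SharedMaxAgeSeconds)); // Cache-Control: s-maxage
        base.OnActionExecuting(filterContext);
    }
}

Decorating an action with [ProxyCache(MaxAgeSeconds = 3600, SharedMaxAgeSeconds = 600)] should then send Cache-Control: public, max-age=3600, s-maxage=600.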
Related
I have a web application for the company's internal needs. However, customers should have access only via a specific URL sent to them by the company.
So, to make things short: the site can be accessed by everyone, but only by typing a concrete URL. No search engine should be allowed to index any page of the site.
How can I achieve this?
I found an approach based on robots.txt, but could it be implemented in the web.config file?
There is a web.config solution as well, based on the User-Agent header. It uses the IIS URL Rewrite module. It is not foolproof, though, since User-Agent strings can be easily spoofed. Note that I did not create the list myself; I found it on the web recently but cannot seem to find the article anymore.
<rewrite>
<rules>
<rule name="Abuse User Agents Blocking" stopProcessing="true">
<match url=".*" ignoreCase="false" />
<conditions logicalGrouping="MatchAny">
<add input="{HTTP_USER_AGENT}" pattern="^.*(1Noonbot|1on1searchBot|3D_SEARCH|3DE_SEARCH2|3GSE|50.nu|192.comAgent|360Spider|A6-Indexer|AASP|ABACHOBot|Abonti|abot|AbotEmailSearch|Aboundex|AboutUsBot|AccMonitor\ Compliance|accoona|AChulkov.NET\ page\ walker|Acme.Spider|AcoonBot|acquia-crawler|ActiveTouristBot|Acunetix|Ad\ Muncher|AdamM|adbeat_bot|adminshop.com|Advanced\ Email|AESOP_com_SpiderMan|AESpider|AF\ Knowledge\ Now\ Verity|aggregator:Vocus|ah-ha.com|AhrefsBot|AIBOT|aiHitBot|aipbot|AISIID|AITCSRobot|Akamai-SiteSnapshot|AlexaWebSearchPlatform|AlexfDownload|Alexibot|AlkalineBOT|All\ Acronyms|Amfibibot|AmPmPPC.com|AMZNKAssocBot|Anemone|Anonymous|Anonymouse.org|AnotherBot|AnswerBot|AnswerBus|AnswerChase\ PROve|AntBot|antibot-|AntiSantyWorm|Antro.Net|AONDE-Spider|Aport|Aqua_Products|AraBot|Arachmo|Arachnophilia|archive.org_bot|aria\ eQualizer|arianna.libero.it|Arikus_Spider|Art-Online.com|ArtavisBot|Artera|ASpider|ASPSeek|asterias|AstroFind|athenusbot|AtlocalBot|Atomic_Email_Hunter|attach|attrakt|attributor|Attributor.comBot|augurfind|AURESYS|AutoBaron|autoemailspider|autowebdir|AVSearch-|axfeedsbot|Axonize-bot|Ayna|b2w|BackDoorBot|BackRub|BackStreet\ Browser|BackWeb|Baiduspider-video|Bandit|BatchFTP|baypup|BDFetch|BecomeBot|BecomeJPBot|BeetleBot|Bender|besserscheitern-crawl|betaBot|Big\ Brother|Big\ Data|Bigado.com|BigCliqueBot|Bigfoot|BIGLOTRON|Bilbo|BilgiBetaBot|BilgiBot|binlar|bintellibot|bitlybot|BitvoUserAgent|Bizbot003|BizBot04|BizBot04\ kirk.overleaf.com|Black.Hole|Black\ Hole|Blackbird|BlackWidow|bladder\ fusion|Blaiz-Bee|BLEXBot|Blinkx|BlitzBOT|Blog\ Conversation\ Project|BlogMyWay|BlogPulseLive|BlogRefsBot|BlogScope|Blogslive|BloobyBot|BlowFish|BLT|bnf.fr_bot|BoaConstrictor|BoardReader-Image-Fetcher|BOI_crawl_00|BOIA-Scan-Agent|BOIA.ORG-Scan-Agent|boitho.com-dc|Bookmark\ Buddy|bosug|Bot\ Apoena|BotALot|BotRightHere|Botswana|bottybot|BpBot|BRAINTIME_SEARCH|BrokenLinkCheck.com|BrowserEmulator|BrowserMob|BruinBot|BSearchR&D|BSpider|btbot|Btsearch|Buddy|Buibui|BuildCMS|BuiltBotTough|Bullseye|bumblebee|BunnySlippers|BuscadorClarin|Butterfly|BuyHawaiiBot|BuzzBot|byindia|BySpider|byteserver|bzBot|c\ r\ a\ w\ l\ 3\ r|CacheBlaster|CACTVS\ Chemistry|Caddbot|Cafi|Camcrawler|CamelStampede|Canon-WebRecord|Canon-WebRecordPro|CareerBot|casper|cataguru|CatchBot|CazoodleBot|CCBot|CCGCrawl|ccubee|CD-Preload|CE-Preload|Cegbfeieh|Cerberian\ Drtrs|CERT\ FigleafBot|cfetch|CFNetwork|Chameleon|ChangeDetection|Charlotte|Check&Get|Checkbot|Checklinks|checkprivacy|CheeseBot|ChemieDE-NodeBot|CherryPicker|CherryPickerElite|CherryPickerSE|Chilkat|ChinaClaw|CipinetBot|cis455crawler|citeseerxbot|cizilla.com|ClariaBot|clshttp|Clushbot|cmsworldmap|coccoc|CollapsarWEB|Collector|combine|comodo|conceptbot|ConnectSearch|conpilot|ContentSmartz|ContextAd|contype|cookieNET|CoolBott|CoolCheck|Copernic|Copier|CopyRightCheck|core-project|cosmos|Covario-IDS|Cowbot-|Cowdog|crabbyBot|crawl|Crawl_Application|crawl.UserAgent|CrawlConvera|crawler|crawler_for_infomine|CRAWLER-ALTSE.VUNET.ORG-Lynx|crawler-upgrade-config|crawler.kpricorn.org|crawler#|crawler4j|crawler43.ejupiter.com|Crawly|CreativeCommons|Crescent|Crescent\ Internet\ ToolPak\ HTTP\ OLE\ Control|cs-crawler|CSE\ HTML\ Validator|CSHttpClient|Cuasarbot|culsearch|Curl|Custo|Cutbot|cvaulev|Cyberdog|CyberNavi_WebGet|CyberSpyder|CydralSpider).*$" />
<add input="{HTTP_USER_AGENT}" pattern="^.*(D1GArabicEngine|DA|DataCha0s|DataFountains|DataparkSearch|DataSpearSpiderBot|DataSpider|Dattatec.com|Dattatec.com-Sitios-Top|Daumoa|DAUMOA-video|DAUMOA-web|Declumbot|Deepindex|deepnet|DeepTrawl|dejan|del.icio.us-thumbnails|DelvuBot|Deweb|DiaGem|Diamond|DiamondBot|diavol|DiBot|didaxusbot|DigExt|Digger|DiGi-RSSBot|DigitalArchivesBot|DigOut4U|DIIbot|Dillo|Dir_Snatch.exe|DISCo|DISCo\ Pump|discobot|DISCoFinder|Distilled-Reputation-Monitor|Dit|DittoSpyder|DjangoTraineeBot|DKIMRepBot|DoCoMo|DOF-Verify|domaincrawler|DomainScan|DomainWatcher|dotbot|DotSpotsBot|Dow\ Jonesbot|Download|Download\ Demon|Downloader|DOY|dragonfly|Drip|drone|DTAAgent|dtSearchSpider|dumbot|Dwaar|Dwaarbot|DXSeeker|EAH|EasouSpider|EasyDL|ebingbong|EC2LinkFinder|eCairn-Grabber|eCatch|eChooseBot|ecxi|EdisterBot|EduGovSearch|egothor|eidetica.com|EirGrabber|ElisaBot|EllerdaleBot|EMail\ Exractor|EmailCollector|EmailLeach|EmailSiphon|EmailWolf|EMPAS_ROBOT|EnaBot|endeca|EnigmaBot|Enswer\ Neuro|EntityCubeBot|EroCrawler|eStyleSearch|eSyndiCat|Eurosoft-Bot|Evaal|Eventware|Everest-Vulcan|Exabot|Exabot-Images|Exabot-Test|Exabot-XXX|ExaBotTest|ExactSearch|exactseek.com|exooba|Exploder|explorersearch|extract|Extractor|ExtractorPro|EyeNetIE|ez-robot|Ezooms|factbot|FairAd\ Client|falcon|Falconsbot|fast-search-engine|FAST\ Data\ Document|FAST\ ESP|fastbot|fastbot.de|FatBot|Favcollector|Faviconizer|FDM|FedContractorBot|feedfinder|FelixIDE|fembot|fetch_ici|Fetch\ API\ Request|fgcrawler|FHscan|fido|Filangy|FileHound|FindAnISP.com_ISP_Finder|findlinks|FindWeb|Firebat|Fish-Search-Robot|Flaming\ AttackBot|Flamingo_SearchEngine|FlashCapture|FlashGet|flicky|FlickySearchBot|flunky|focused_crawler|FollowSite|Foobot|Fooooo_Web_Video_Crawl|Fopper|FormulaFinderBot|Forschungsportal|fr_crawler|Francis|Freecrawl|FreshDownload|freshlinks.exe|FriendFeedBot|frodo.at|froGgle|FrontPage|Froola|FU-NBI|full_breadth_crawler|FunnelBack|FunWebProducts|FurlBot|g00g1e|G10-Bot|Gaisbot|GalaxyBot|gazz|gcreep|generate_infomine_category_classifiers|genevabot|genieBot|GenieBotRD_SmallCrawl|Genieo|Geomaxenginebot|geometabot|GeonaBot|GeoVisu|GermCrawler|GetHTMLContents|Getleft|GetRight|GetSmart|GetURL.rexx|GetWeb!|Giant|GigablastOpenSource|Gigabot|Girafabot|GleameBot|gnome-vfs|Go-Ahead-Got-It|Go!Zilla|GoForIt.com|GOFORITBOT|gold|Golem|GoodJelly|Gordon-College-Google-Mini|goroam|GoSeebot|gotit|Govbot|GPU\ p2p|grab|Grabber|GrabNet|Grafula|grapeFX|grapeshot|GrapeshotCrawler|grbot|GreenYogi\ [ZSEBOT]|Gromit|GroupMe|grub|grub-client|Grubclient-|GrubNG|GruBot|gsa|GSLFbot|GT::WWW|Gulliver|GulperBot|GurujiBot|GVC|GVC\ BUSINESS|gvcbot.com|HappyFunBot|harvest|HarvestMan|Hatena\ Antenna|Hawler|Hazel's\ Ferret\ hopper|hcat|hclsreport-crawler|HD\ nutch\ agent|Header_Test_Client|healia|Helix|heritrix|hijbul-heritrix-crawler|HiScan|HiSoftware\ AccMonitor|HiSoftware\ AccVerify|hitcrawler_|hivaBot|hloader|HMSEbot|HMView|hoge|holmes|HomePageSearch|Hooblybot-Image|HooWWWer|Hostcrawler|HSFT\ -\ Link|HSFT\ -\ LVU|HSlide|ht:|htdig|Html\ Link\ Validator|HTMLParser|HTTP::Lite|httplib|HTTrack|Huaweisymantecspider|hul-wax|humanlinks|HyperEstraier|Hyperix).*$" />
<add input="{HTTP_USER_AGENT}" pattern="^.*(ia_archiver|IAArchiver-|ibuena|iCab|ICDS-Ingestion|ichiro|iCopyright\ Conductor|id-search|IDBot|IEAutoDiscovery|IECheck|iHWebChecker|IIITBOT|iim_405|IlseBot|IlTrovatore|Iltrovatore-Setaccio|ImageBot|imagefortress|ImagesHereImagesThereImagesEverywhere|ImageVisu|imds_monitor|imo-google-robot-intelink|IncyWincy|Industry\ Cortexcrawler|Indy\ Library|indylabs_marius|InelaBot|Inet32\ Ctrl|inetbot|InfoLink|INFOMINE|infomine.ucr.edu|InfoNaviRobot|Informant|Infoseek|InfoTekies|InfoUSABot|INGRID|Inktomi|InsightsCollector|InsightsWorksBot|InspireBot|InsumaScout|Intelix|InterGET|Internet\ Ninja|InternetLinkAgent|Interseek|IOI|ip-web-crawler.com|Ipselonbot|Iria|IRLbot|Iron33|Isara|iSearch|iSiloX|IsraeliSearch|IstellaBot|its-learning|IU_CSCI_B659_class_crawler|iVia|iVia\ Page\ Fetcher|JadynAve|JadynAveBot|jakarta|Jakarta\ Commons-HttpClient|Java|Jbot|JemmaTheTourist|JennyBot|Jetbot|JetBrains\ Omea\ Pro|JetCar|Jim|JoBo|JobSpider_BA|JOC|JoeDog|JoyScapeBot|JSpyda|JubiiRobot|jumpstation|Junut|JustView|Jyxobot|K.S.Bot|KakcleBot|kalooga|KaloogaBot|kanagawa|KATATUDO-Spider|Katipo|kbeta1|Kenjin.Spider|KeywenBot|Keyword.Density|Keyword\ Density|kinjabot|KIT-Fireball|Kitenga-crawler-bot|KiwiStatus|kmbot-|kmccrew|Knight|KnowItAll|Knowledge.com|Knowledge\ Engine|KoepaBot|Koninklijke|KrOWLer|KSbot|kuloko-bot|kulturarw3|KummHttp|Kurzor|Kyluka|L.webis|LabelGrab|Labhoo|labourunions411|lachesis|Lament|LamerExterminator|LapozzBot|larbin|LARBIN-EXPERIMENTAL|LBot|LeapTag|LeechFTP|LeechGet|LetsCrawl.com|LexiBot|LexxeBot|lftp|libcrawl|libiViaCore|libWeb|libwww|libwww-perl|likse|Linguee|Link|link_checker|LinkAlarm|linkbot|LinkCheck\ by\ Siteimprove.com|LinkChecker|linkdex.com|LinkextractorPro|LinkLint|linklooker|Linkman|LinkScan|LinksCrawler|LinksManager.com_bot|LinkSweeper|linkwalker|LiteFinder|LitlrBot|Little\ Grabber\ at\ Skanktale.com|Livelapbot|LM\ Harvester|LMQueueBot|LNSpiderguy|LoadTimeBot|LocalcomBot|locust|LolongBot|LookBot|Lsearch|lssbot|LWP|lwp-request|lwp-trivial|LWP::Simple|Lycos_Spider|Lydia\ Entity|LynnBot|Lytranslate|Mag-Net|Magnet|magpie-crawler|Magus|Mail.Ru|Mail.Ru_Bot|MAINSEEK_BOT|Mammoth|MarkWatch|MaSagool|masidani_bot_|Mass|Mata.Hari|Mata\ Hari|matentzn\ at\ cs\ dot\ man\ dot\ ac\ dot\ uk|maxamine.com--robot|maxamine.com-robot|maxomobot|Maxthon$|McBot|MediaFox|medrabbit|Megite|MemacBot|Memo|MendeleyBot|Mercator-|mercuryboard_user_agent_sql_injection.nasl|MerzScope|metacarta|Metager2|metager2-verification-bot|MetaGloss|METAGOPHER|metal|metaquerier.cs.uiuc.edu|METASpider|Metaspinner|MetaURI|MetaURI\ API|MFC_Tear_Sample|MFcrawler|MFHttpScan|Microsoft.URL|MIIxpc|miner|mini-robot|minibot|miniRank|Mirror|Missigua\ Locator|Mister.PiX|Mister\ PiX|Miva|MJ12bot|mnoGoSearch|mod_accessibility|moduna.com|moget|MojeekBot|MOMspider|MonkeyCrawl|MOSES|Motor|mowserbot|MQbot|MSE360|MSFrontPage|MSIECrawler|MSIndianWebcrawl|MSMOBOT|Msnbot|msnbot-products|MSNPTC|MSRBOT|MT-Soft|MultiText|My_Little_SearchEngine_Project|my-heritrix-crawler|MyApp|MYCOMPANYBOT|mycrawler|MyEngines-US-Bot|MyFamilyBot|Myra|nabot|nabot_|Najdi.si|Nambu|NAMEPROTECT|NatchCVS|naver|naverbookmarkcrawler|NaverBot|Navroad|NearSite|NEC-MeshExplorer|NeoScioCrawler|NerdByNature.Bot|NerdyBot|Nerima-crawl-).*$" />
<add input="{HTTP_USER_AGENT}" pattern="^.*(T-H-U-N-D-E-R-S-T-O-N-E|Tailrank|tAkeOut|TAMU_CRAWLER|TapuzBot|Tarantula|targetblaster.com|TargetYourNews.com|TAUSDataBot|taxinomiabot|Tecomi|TeezirBot|Teleport|Teleport\ Pro|TeleportPro|Telesoft|Teradex\ Mapper|TERAGRAM_CRAWLER|TerrawizBot|testbot|testing\ of|TextBot|thatrobotsite.com|The.Intraformant|The\ Dyslexalizer|The\ Intraformant|TheNomad|Theophrastus|theusefulbot|TheUsefulbot_|ThumbBot|thumbshots-de-bot|tigerbot|TightTwatBot|TinEye|Titan|to-dress_ru_bot_|to-night-Bot|toCrawl|Topicalizer|topicblogs|Toplistbot|TopServer\ PHP|topyx-crawler|Touche|TourlentaScanner|TPSystem|TRAAZI|TranSGeniKBot|travel-search|TravelBot|TravelLazerBot|Treezy|TREX|TridentSpider|Trovator|True_Robot|tScholarsBot|TsWebBot|TulipChain|turingos|turnit|TurnitinBot|TutorGigBot|TweetedTimes|TweetmemeBot|TwengaBot|TwengaBot-Discover|Twiceler|Twikle|twinuffbot|Twisted\ PageGetter|Twitturls|Twitturly|TygoBot|TygoProwler|Typhoeus|U.S.\ Government\ Printing\ Office|uberbot|ucb-nutch|UCSD-Crawler|UdmSearch|UFAM-crawler-|Ultraseek|UnChaos|unchaos_crawler_|UnisterBot|UniversalSearch|UnwindFetchor|UofTDB_experiment|updated|URI::Fetch|url_gather|URL-Checker|URL\ Control|URLAppendBot|URLBlaze|urlchecker|urlck|UrlDispatcher|urllib|URLSpiderPro|URLy.Warning|USAF\ AFKN\|usasearch|USS-Cosmix|USyd-NLP-Spider|Vacobot|Vacuum|VadixBot|Vagabondo|Validator|Valkyrie|vBSEO|VCI|VerbstarBot|VeriCiteCrawler|Verifactrola|Verity-URL-Gateway|vermut|versus|versus.integis.ch|viasarchivinginformation.html|vikspider|VIP|VIPr|virus-detector|VisBot|Vishal\ For\ CLIA|VisWeb|vlad|vlsearch|VMBot|VocusBot|VoidEYE|VoilaBot|Vortex|voyager|voyager-hc|voyager-partner-deep|VSE|vspider).*$" />
<add input="{HTTP_USER_AGENT}" pattern="^.*(W3C_Unicorn|W3C-WebCon|w3m|w3search|wacbot|wastrix|Water\ Conserve|Water\ Conserve\ Portal|WatzBot|wauuu\ engine|Wavefire|Waypath|Wazzup|Wazzup1.0.4800|wbdbot|web-agent|Web-Sniffer|Web.Image.Collector|Web\ CEO\ Online|Web\ Image\ Collector|Web\ Link\ Validator|Web\ Magnet|webalta|WebaltBot|WebAuto|webbandit|webbot|webbul-bot|WebCapture|webcheck|Webclipping.com|webcollage|WebCopier|WebCopy|WebCorp|webcrawl.net|webcrawler|WebDownloader\ for|Webdup|WebEMailExtrac|WebEMailExtrac.*|WebEnhancer|WebFerret|webfetch|WebFetcher|WebGather|WebGo\ IS|webGobbler|WebImages|Webinator-search2.fasthealth.com|Webinator-WBI|WebIndex|WebIndexer|weblayers|WebLeacher|WeblexBot|WebLinker|webLyzard|WebmasterCoffee|WebmasterWorld|WebmasterWorldForumBot|WebMiner|WebMoose|WeBot|WebPix|WebReaper|WebRipper|WebSauger|Webscan|websearchbench|WebSite|websitemirror|WebSpear|websphinx.test|WebSpider|Webster|Webster.Pro|Webster\ Pro|WebStripper|WebTrafficExpress|WebTrends\ Link\ Analyzer|webvac|webwalk|WebWalker|Webwasher|WebWatch|WebWhacker|WebXM|WebZip|Weddings.info|wenbin|WEPA|WeRelateBot|Whacker|Widow|WikiaBot|Wikio|wikiwix-bot-|WinHttp.WinHttpRequest|WinHTTP\ Example|WIRE|wired-digital-newsbot|WISEbot|WISENutbot|wish-la|wish-project|wisponbot|WMCAI-robot|wminer|WMSBot|woriobot|worldshop|WorQmada|Wotbox|WPScan|wume_crawler|WWW-Mechanize|www.freeloader.com.|WWW\ Collector|WWWOFFLE|wwwrobot|wwwster|WWWWanderer|wwwxref|Wysigot|X-clawler|Xaldon|Xenu|Xenu's|Xerka\ MetaBot|XGET|xirq|XmarksFetch|XoviBot|xqrobot|Y!J|Y!TunnelPro|yacy.net|yacybot|yarienavoir.net|Yasaklibot|yBot|YebolBot|yellowJacket|yes|YesupBot|Yeti|YioopBot|YisouSpider|yolinkBot|yoogliFetchAgent|yoono|Yoriwa|YottaCars_Bot|you-dir|Z-Add\ Link|zagrebin|Zao|zedzo.digest|zedzo.validate|zermelo|Zeus|Zeus\ Link\ Scout|zibber-v|zimeno|Zing-BottaBot|ZipppBot|zmeu|ZoomSpider|ZuiBot|ZumBot|Zyborg|Zyte).*$" />
<add input="{HTTP_USER_AGENT}" pattern="^.*(Nessus|NESSUS::SOAP|nestReader|Net::Trackback|NetAnts|NetCarta\ CyberPilot\ Pro|Netcraft|NetID.com|NetMechanic|Netprospector|NetResearchServer|NetScoop|NetSeer|NetShift=|NetSongBot|Netsparker|NetSpider|NetSrcherP|NetZip|NetZip-Downloader|NewMedhunt|news|News_Search_App|NewsGatherer|Newsgroupreporter|NewsTroveBot|NextGenSearchBot|nextthing.org|NHSEWalker|nicebot|NICErsPRO|niki-bot|NimbleCrawler|nimbus-1|ninetowns|Ninja|NjuiceBot|NLese|Nogate|Nomad-V2.x|NoteworthyBot|NPbot|NPBot-|NRCan\ intranet|NSDL_Search_Bot|nu_tch-princeton|nuggetize.com|nutch|nutch1|NutchCVS|NutchOrg|NWSpider|Nymesis|nys-crawler|ObjectsSearch|oBot|Obvius\ external\ linkcheck|Occam|Ocelli|Octopus|ODP\ entries|Offline.Explorer|Offline\ Explorer|Offline\ Navigator|OGspider|OmiExplorer_Bot|OmniExplorer_Bot|omnifind|OmniWeb|OnetSzukaj|online\ link\ validator|OOZBOT|Openbot|Openfind|Openfind\ data|OpenHoseBot|OpenIntelligenceData|OpenISearch|OpenSearchServer_Bot|OpiDig|optidiscover|OrangeBot|ORISBot|ornl_crawler_1|ORNL_Mercury|osis-project.jp|OsO|OutfoxBot|OutfoxMelonBot|OWLER-BOT|owsBot|ozelot|P3P\ Client|page_verifier|PageBitesHyperBot|Pagebull|PageDown|PageFetcher|PageGrabber|PagePeeker|PageRank\ Monitor|pamsnbot.htm|Panopy|panscient.com|Pansophica|Papa\ Foto|PaperLiBot|parasite|parsijoo|Pathtraq|Pattern|Patwebbot|pavuk|PaxleFramework|PBBOT|pcBrowser|pd-crawler|PECL::HTTP|penthesila|PeoplePal|perform_crawl|PerMan|PGP-KA|PHPCrawl|PhpDig|PicoSearch|pipBot|pipeLiner|Pita|pixfinder|PiyushBot|planetwork|PleaseCrawl|Plucker|Plukkie|Plumtree|Pockey|Pockey-GetHTML|PoCoHTTP|pogodak.ba|Pogodak.co.yu|Poirot|polybot|Pompos|Poodle\ predictor|PopScreenBot|PostPost|PrivacyFinder|ProjectWF-java-test-crawler|ProPowerBot|ProWebWalker|PROXY|psbot|psbot-page|PSS-Bot|psycheclone|pub-crawler|pucl|pulseBot\ \(pulse|Pump|purebot|PWeBot|pycurl|Python-urllib|pythonic-crawler|PythonWikipediaBot|q1|QEAVis\ agent|QFKBot|qualidade|Qualidator.com|QuepasaCreep|QueryN.Metasearch|QueryN\ Metasearch|quest.durato|Quintura-Crw|QunarBot|Qweery_robot.txt_CheckBot|QweeryBot|r2iBot|R6_CommentReader|R6_FeedFetcher|R6_VoteReader|RaBot|Radian6|radian6_linkcheck|RAMPyBot|RankurBot|RcStartBot|RealDownload|Reaper|REBI-shoveler|Recorder|RedBot|RedCarpet|ReGet|RepoMonkey|RepoMonkey\ Bait|Riddler|RIIGHTBOT|RiseNetBot|RiverglassScanner|RMA|RoboPal|Robosourcer|robot|robotek|robots|Robozilla|rogerBot|Rome\ Client|Rondello|Rotondo|Roverbot|RPT-HTTPClient|rtgibot|RufusBot|Runnk\ online\ rss\ reader|s~stremor-crawler|S2Bot|SafariBookmarkChecker|SaladSpoon|Sapienti|SBIder|SBL-BOT|SCFCrawler|Scich|ScientificCommons.org|ScollSpider|ScooperBot|Scooter|ScoutJet|ScrapeBox|Scrapy|SCrawlTest|Scrubby|scSpider|Scumbot|SeaMonkey$|Search-Channel|Search-Engine-Studio|search.KumKie.com|search.msn.com|search.updated.com|search.usgs.gov|Search\ 
Publisher|Searcharoo.NET|SearchBlox|searchbot|searchengine|searchhippo.com|SearchIt-Bot|searchmarking|searchmarks|searchmee_v|SearchmetricsBot|searchmining|SearchnowBot_v1|searchpreview|SearchSpider.com|SearQuBot|Seekbot|Seeker.lookseek.com|SeeqBot|seeqpod-vertical-crawler|Selflinkchecker|Semager|semanticdiscovery|Semantifire1|semisearch|SemrushBot|Senrigan|SEOENGWorldBot|SeznamBot|ShablastBot|ShadowWebAnalyzer|Shareaza|Shelob|sherlock|ShopWiki|ShowLinks|ShowyouBot|siclab|silk|Siphon|SiteArchive|SiteCheck-sitecrawl|sitecheck.internetseer.com|SiteFinder|SiteGuardBot|SiteOrbiter|SiteSnagger|SiteSucker|SiteSweeper|SiteXpert|SkimBot|SkimWordsBot|SkreemRBot|skygrid|Skywalker|Sleipnir|slow-crawler|SlySearch|smart-crawler|SmartDownload|Smarte|smartwit.com|Snake|Snapbot|SnapPreviewBot|Snappy|snookit|Snooper|Snoopy|SocialSearcher|SocSciBot|SOFT411\ Directory|sogou|sohu-search|sohu\ agent|Sokitomi|Solbot|sootle|Sosospider|Space\ Bison|Space\ Fung|SpaceBison|SpankBot|spanner|Spatineo\ Monitor\ Controller|special_archiver|SpeedySpider|Sphider|Sphider2|spider|Spider.TerraNautic.net|SpiderEngine|SpiderKU|SpiderMan|Spinn3r|Spinne|sportcrew-Bot|spyder3.microsys.com|sqlmap|Squid-Prefetch|SquidClamAV_Redirector|Sqworm|SrevBot|sslbot|SSM\ Agent|StackRambler|StarDownloader|statbot|statcrawler|statedept-crawler|Steeler|STEGMANN-Bot|stero|Stripper|Stumbler|suchclip|sucker|SumeetBot|SumitBot|SummizeBot|SummizeFeedReader|SuperBot|superbot.com|SuperHTTP|SuperLumin|SuperPagesBot|Supybot|SURF|Surfbot|SurfControl|SurveyBot|suzuran|SWEBot|swish-e|SygolBot|SynapticWalker|Syntryx\ ANT\ Scout\ Chassis\ Pheromone|SystemSearch-robot|Szukacz).*$" />
</conditions>
<action type="CustomResponse" statusCode="403" statusReason="Forbidden" statusDescription="Forbidden" />
</rule>
</rules>
</rewrite>
More info:
http://www.blogtips.org/web-crawlers-love-the-good-but-kill-the-bad-and-the-ugly/
https://perishablepress.com/eight-ways-to-blacklist-with-apaches-mod_rewrite/
No, it can't be achieved with the web.config file.
Dealing with search bots is done with robots.txt; it's a pretty simple solution. As written above, crawlers cannot see your web.config file.
But in code you can determine that the user agent is the Google bot, and once you have determined that, don't write content to the page; just send response code 404, or, which I think is better, code 410 (Gone).
I'm not 100% behind the above solution because I have never needed it, but logically I think it would have the same effect. And you would just waste Google's energy crawling your page :)
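A rough sketch of that idea in Global.asax (untested; the user-agent check is deliberately naive):

// In Global.asax.cs:
protected void Application_BeginRequest(object sender, EventArgs e)
{
    string ua = Request.UserAgent ?? string.Empty;
    if (ua.IndexOf("Googlebot", StringComparison.OrdinalIgnoreCase) >= 0)
    {
        Response.StatusCode = 410; // Gone: tells the crawler to drop the URL
        Response.End();
    }
}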
Web.config is obviously not accessible to users or crawlers. If your application resides in the root folder of the website, you should be able to create a route for "/robots.txt" and return the text that would normally be inside that file from a controller, OWIN middleware, or an HTTP handler. But why bother?
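For illustration, a minimal MVC sketch of such a route and controller (names are made up; depending on your IIS settings you may also need runAllManagedModulesForAllRequests or a handler mapping so the .txt request reaches routing):

using System.Web.Mvc;

// In RouteConfig: routes.MapRoute("Robots", "robots.txt",
//     new { controller = "Seo", action = "Robots" });
public class SeoController : Controller
{
    public ContentResult Robots()
    {
        // Disallow everything; crawlers fetch /robots.txt before indexing.
        return Content("User-agent: *\nDisallow: /", "text/plain");
    }
}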
As explained in other answers, the way to prevent bots from crawling a page is by using robots.txt.
However, based on your description, this might not even be needed.
Here is why:
In order for a page to be crawled it needs to be linked to. There are two ways for this to happen:
The page is linked from your root/front web page (i.e. the page shown by typing httpx://mysite.domain/). My understanding is that this does not happen: the information you want kept out of search engines is only available from specific deep (secret?) URLs not linked anywhere in the web site.
The page is linked from another web site, i.e. someone posts the URL to the information on a different web site. In this case, if you want to prevent the content of the page from being crawled you need to use robots.txt; however, the URL itself will still be indexed by the search engine, as that is information on a different web page.
The question then is how likely option 2 is. For an example of public URLs that are not indexed unless published by a third party, see the Dropbox/Google Drive/OneDrive URLs used for sharing with everyone. The URL usually includes a GUID, which is practically impossible to guess or brute-force.
I am trying to add some [WebMethod]-annotated endpoint functions to a WebForms-style web app (.aspx and .asmx).
I'd like to annotate those endpoints with [EnableCors] and thereby get all the good ajax-preflight functionality.
VS2013 accepts the annotation, but the endpoints still don't play nice with CORS. (They work fine when used same-origin, but not cross-origin.)
I can't even get them to function cross-origin with the down and dirty
HttpContext.Current.Response.AppendHeader("Access-Control-Allow-Origin", "*");
approach -- my browsers reject the responses, and the cross-origin response headers don't appear.
How can I get CORS functionality in these [WebMethod] endpoints?
I recommend double-checking you have performed all steps on this page: CORS on ASP.NET
In addition to:
Response.AppendHeader("Access-Control-Allow-Origin", "*");
Also try:
Response.AppendHeader("Access-Control-Allow-Methods","*");
Try adding this directly in web.config:
<system.webServer>
<httpProtocol>
<customHeaders>
<add name="Access-Control-Allow-Methods" value="*" />
<add name="Access-Control-Allow-Headers" value="Content-Type" />
</customHeaders>
</httpProtocol>
</system.webServer>
Failing that, you need to ensure you have control over both domains.
FYI, to enable CORS in a classic Web Forms app, in Global.asax:
void Application_Start(object sender, EventArgs e)
{
    // Requires the Microsoft.AspNet.WebApi.Cors package.
    GlobalConfiguration.Configuration.EnableCors();
    RouteTable.Routes.MapHttpRoute(
        name: "DefaultApi",
        routeTemplate: "api/{controller}/{action}/{id}",
        defaults: new { id = System.Web.Http.RouteParameter.Optional }
    );
}
If you need the preflight request, e.g. so you can send authenticated requests, you are not able to set Access-Control-Allow-Origin: *. It must be a specific Origin domain.
Also you must set the Access-Control-Allow-Methods and Access-Control-Allow-Headers response headers, if you are using anything besides the defaults.
(Note these constraints are just how CORS itself works - this is how it is defined.)
So it's not enough to just throw on the [EnableCors] attribute; you have to set values for its parameters:
[EnableCors(origins: "https://www.olliejones.com", headers: "X-Custom-Header", methods: "PUT", SupportsCredentials = true)]
Or if you want to do things manually and explicitly:
HttpContext.Current.Response.AppendHeader("Access-Control-Allow-Origin", "https://www.olliejones.com");
HttpContext.Current.Response.AppendHeader("Access-Control-Allow-Headers", "X-Custom-Header");
HttpContext.Current.Response.AppendHeader("Access-Control-Allow-Methods", "PUT");
HttpContext.Current.Response.AppendHeader("Access-Control-Allow-Credentials", "true");
One last thing - you do have to call .EnableCors() at initialization. In e.g. MVC or Web API you would call this on HttpConfiguration when registering the config; however, I have no idea how it works with WebForms.
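For WebForms, one hedged workaround is to answer the preflight yourself in Global.asax before the request reaches the .asmx handler (a sketch, mirroring the header values above):

// In Global.asax.cs:
protected void Application_BeginRequest(object sender, EventArgs e)
{
    // Answer the CORS preflight before it reaches the .asmx/.aspx handler.
    if (Request.HttpMethod == "OPTIONS")
    {
        Response.AppendHeader("Access-Control-Allow-Origin", "https://www.olliejones.com");
        Response.AppendHeader("Access-Control-Allow-Methods", "POST, OPTIONS");
        Response.AppendHeader("Access-Control-Allow-Headers", "Content-Type, X-Custom-Header");
        Response.AppendHeader("Access-Control-Allow-Credentials", "true");
        Response.StatusCode = 200;
        Response.End(); // short-circuit; the preflight needs no body
    }
}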
If you use the AppendHeader method to send cache-specific headers and at the same time use the cache object model (Cache) to set cache policy, HTTP response headers that pertain to caching might be deleted when the cache object model is used. This behavior enables ASP.NET to maintain the most restrictive settings. For example, consider a page that includes user controls. If those controls have conflicting cache policies, the most restrictive cache policy will be used. If one user control sets the header "Cache-Control: Public" and another user control sets the more restrictive header "Cache-Control: Private" via calls to SetCacheability, then the "Cache-Control: Private" header will be sent with the response.
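As a small illustration of that precedence rule (hypothetical page code):

// Two conflicting cacheability calls on the same response.
protected void Page_Load(object sender, EventArgs e)
{
    // e.g. set by one user control:
    Response.Cache.SetCacheability(HttpCacheability.Public);
    // e.g. set by another user control; more restrictive, so it wins:
    Response.Cache.SetCacheability(HttpCacheability.Private);
    // The response is sent with "Cache-Control: private".
}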
You can add custom headers under httpProtocol in web.config:
<httpProtocol>
<customHeaders>
<add name="Access-Control-Allow-Methods" values="*" />
</customHeaders>
</httpProtocol>
I think your code looks good, but IIS is not sending the header with the response as expected. Please check whether IIS is configured properly.
Configuring IIS6
Configuring IIS7
If CORS doesn't work for your particular problem, maybe JSONP is another possible way.
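If you try JSONP, the idea is to wrap the JSON in a caller-supplied callback so the response loads as a plain <script>; here is a rough sketch using a generic handler (the handler name is made up):

using System.Web;

// Hypothetical generic handler (JsonpHandler.ashx): responds with
// callback({"ok":true}) so a cross-origin <script> tag can consume it.
public class JsonpHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        string callback = context.Request.QueryString["callback"] ?? "callback";
        context.Response.ContentType = "application/javascript";
        context.Response.Write(callback + "({\"ok\":true});");
    }

    public bool IsReusable { get { return true; } }
}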
For Web Forms, you can use
Response.AddHeader("Access-Control-Allow-Origin", "*");
instead of
Response.AppendHeader("Access-Control-Allow-Origin", "*");
The first one works for older versions of ASP.NET Web Forms.
You can do it like this in MVC:
[EnableCors(origins: "*", headers: "*", methods: "*")]
public ActionResult test()
{
Response.AppendHeader("Access-Control-Allow-Origin", "*");
return View();
}
I'm wondering how Application Insights works with cookies, because I'd like to understand user and session tracking, so I've been researching and...
Here is a brief introduction to the theory:
1. Whenever the Application Insights SDK gets a request that doesn't have the Application Insights user tracking cookie (set by the Application Insights JS snippet), it will set the cookie and start a new session.
(from apmtips)
2. UserTelemetryInitializer updates the Id and AcquisitionDate properties of the User context for all telemetry items, with values extracted from the ai_user cookie generated by the Application Insights JavaScript instrumentation code running in the user's browser.
SessionTelemetryInitializer updates the Id property of the Session context for all telemetry items, with the value extracted from the ai_session cookie generated by the Application Insights JavaScript instrumentation code running in the user's browser.
(from the Azure documentation: Configuring the Application Insights SDK with ApplicationInsights.config)
So there are two cookies: ai_session and ai_user.
And here comes my questions:
When are they initialized?
What is doing it?
How can I stop using them?
If I wanted to keep them, how could I change their expiration time?
Trying to remove them, I made a project using the default ASP.NET Web Application template for Web API, which includes MVC and Web API.
Doing some research, I found this approach to disable them, but I don't have any WebSessionTrackingTelemetryModule. So I commented out UserTelemetryInitializer and SessionTelemetryInitializer, and this is what I have:
<TelemetryInitializers>
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.SyntheticTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.ClientIpHeaderTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.UserAgentTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.OperationNameTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.OperationIdTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
<!--<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.UserTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />-->
<!--<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.SessionTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />-->
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.AzureRoleEnvironmentTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.DomainNameRoleInstanceTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.BuildInfoConfigComponentVersionTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.DeviceTelemetryInitializer, Microsoft.ApplicationInsights.Extensibility.Web" />
</TelemetryInitializers>
And:
<TelemetryModules>
<Add Type="Microsoft.ApplicationInsights.Extensibility.DependencyCollector.DependencyTrackingTelemetryModule, Microsoft.ApplicationInsights.Extensibility.DependencyCollector" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.PerfCounterCollector.PerformanceCollectorModule, Microsoft.ApplicationInsights.Extensibility.PerfCounterCollector"/>
<Add Type="Microsoft.ApplicationInsights.Extensibility.Implementation.Tracing.DiagnosticsTelemetryModule, Microsoft.ApplicationInsights" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.RequestTrackingTelemetryModule, Microsoft.ApplicationInsights.Extensibility.Web"/>
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.ExceptionTrackingTelemetryModule, Microsoft.ApplicationInsights.Extensibility.Web" />
<Add Type="Microsoft.ApplicationInsights.Extensibility.Web.DeveloperModeWithDebuggerAttachedTelemetryModule, Microsoft.ApplicationInsights.Extensibility.Web" />
</TelemetryModules>
But it doesn't make a difference: whether I leave those initializers commented out or not, the cookies are still being generated.
Trying to remove the cookies, I commented out the steps done in Startup and excluded all the .js files from my project, but the cookies keep appearing after every request.
So at this point I don't understand where the "Application Insights JavaScript" comes into play, and I guess what I'm missing is something in the backend. Am I wrong?
Finally, my commented-out Startup.cs looks like this:
[assembly: OwinStartupAttribute(typeof(Try001.Startup))]
namespace Try001
{
public partial class Startup
{
public void Configuration(IAppBuilder app)
{
//ConfigureAuth(app);
}
}
}
And my Global.asax.cs looks like:
public class MvcApplication : System.Web.HttpApplication
{
protected void Application_Start()
{
//AreaRegistration.RegisterAllAreas();
//FilterConfig.RegisterGlobalFilters(GlobalFilters.Filters);
RouteConfig.RegisterRoutes(RouteTable.Routes);
//BundleConfig.RegisterBundles(BundleTable.Bundles);
}
}
Where RegisterRoutes just does the default routing. So I aimed to do just the very basic stuff to get it working, but I don't have a clue where to keep digging.
Could someone enlighten me?
Thanks for reading so far.
Cookie initialization logic happens in the Application Insights JavaScript SDK. If you look at the source of your page, you will notice JS loaded from //az416426.vo.msecnd.net/scripts/a/ai.0.js. You can also read/contribute to the source code of the JavaScript SDK on GitHub: https://github.com/Microsoft/ApplicationInsights-JS
Replying to your questions:
When are they initialized and what is doing it?
They are initialized by the JavaScript SDK: when it attempts to send any telemetry item, it checks whether the cookies are present and creates them if not. For details see https://github.com/Microsoft/ApplicationInsights-JS/blob/master/JavaScript/JavaScriptSDK/Context/User.ts; there is similar logic for the session cookie.
How can I stop using them?
In more recent versions of the JavaScript SDK, you can control the cookies, as well as the local storage for both user information and the session buffer (used to rate-limit the requests to AI), through the config object:
...snippet...
}({
instrumentationKey: "<your key>",
isCookieUseDisabled: true,
isStorageUseDisabled: true,
enableSessionStorageBuffer: true
});
If I wanted to keep them, how could I change their expiration time? There are two settings you can control:
session renewal time - how much time elapses before the session is reset with no activity (default is 30 minutes)
session expiration time - how much time elapses before the session is reset even with activity (default is 24 hours)
To change them, set the following values in the snippet, next to where the instrumentation key is set:
..snippet..
}({
instrumentationKey: "<your key>",
sessionRenewalMs:<your custom value in ms>,
sessionExpirationMs:<your custom value in ms>
});
First post; I'm a complete .NET/C# novice thrown in at the deep end!
I've inherited a C# web application due to someone leaving at work and me being the only one with bandwidth - but not the .NET/C# knowledge!
The app is used by people on different sites all over the world. They log in using their corporate login details, and as such they log into different servers depending on where they are located (Europe, America or India).
The guy who wrote the app couldn't work out how to switch the ConnectionString in web.config depending on location, so he duplicated the whole app for each domain, with the only variation being a single IP address in web.config for each duplicated version! He then made a simple front page that took users to "their" version of the app depending on where they said they were in the world.
The first thing I want to do is move to a single version to maintain, so I need to be able to switch the connection string (or the way users log in).
I've spent several days trying to work out how to get at the ConnectionString (defined in web.config) from my Login class, only to discover the values set in web.config seem to be read-only, so I can't alter them.
So I guess the first question is: am I barking up the wrong tree? Can I just set all the information that AspNetActiveDirectoryMembershipProvider (see code later) requires and call it from my Login class? Or is the ConnectionString route the de facto way to set up connections in .NET/C#, so that I do need to find out how to change/specify/add the value at runtime?
Three possibilities I can think of (the first is the one I've ground to a halt with):
Change the ConnectionString for ADService in my web.config from my Login class?
Change what AspNetActiveDirectoryMembershipProvider uses, so that from my Login class I can magically get it to use EMEA_ADService or PACIFIC_ADService as defined in web.config?
Is it possible to define a new connection string and call AspNetActiveDirectoryMembershipProvider entirely from my Login class, not using web.config for this connection at all?
Here's a bit of my/his web.config file and my Login class
Cuttings from Web.config
<connectionStrings>
<add name="ADService" connectionString="LDAP://12.345.67.8" /> *---- Original ConnectionString (IP address changed)----*
<add name="EMEA_ADService" connectionString="LDAP://12.345.67.8" /> *---- Added by me playing around unsuccessfully! ----*
<add name="PACIFIC_ADService" connectionString="LDAP://12.345.67.9" /> *---- Added by me playing around unsuccessfully! ----*
~
</connectionStrings>
<authentication mode="Forms">
<forms loginUrl="~/Login.aspx" timeout="2880" /> *---- The background class for this popup (Login.aspx.cs) is where I'm currently trying to affect ConnectionString----*
</authentication>
*---- Pretty sure this is the bit that actually does the login verification----*
<membership defaultProvider="AspNetActiveDirectoryMembershipProvider">
<providers>
<clear />
<add name="AspNetActiveDirectoryMembershipProvider" type="System.Web.Security.ActiveDirectoryMembershipProvider, System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=12345678" connectionStringName="ADService" applicationName="/." description="ADService" />
</providers>
</membership>
This is as far as I've got in my class before finding out that I don't appear to be able to alter ConnectionString!
Cuttings from Login.aspx.cs
public partial class Login : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        ConnectionStringSettingsCollection connections = ConfigurationManager.ConnectionStrings; // this is now working :)
        string userDomain = Environment.UserDomainName; // Probably don't need this; it gives the login domain on the machine. Don't know yet if that will be the user's machine or the server the app runs on?
        if (connections.Count != 0)
        {
            foreach (ConnectionStringSettings connection in connections)
            {
                string testname = connections["ADService"].Name;
                string testConnectionString = connections["ADService"].ConnectionString;
                // This assignment throws at runtime: the collection returned by
                // ConfigurationManager.ConnectionStrings is read-only.
                connections["ADService"].ConnectionString = "LDAP://12.345.67.9";
                testConnectionString = connections["ADService"].ConnectionString;
Any hint would be very welcome!
P.S. I have requested a .Net/C# course at work! ;)
You wouldn't want to alter the existing connection string. Rather, you'd want to alter which connection string your data access layer uses to call the different service stacks. You could then choose a connection string at runtime based on whatever input parameters you want, which in your case might be an IP range.
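One way to apply that idea to the membership setup in the question, without touching the read-only connection strings, is to declare one ActiveDirectoryMembershipProvider per region in web.config (each with its own connectionStringName) and pick between them at login time. A sketch, with hypothetical provider names:

// EMEA_ADProvider / PACIFIC_ADProvider are assumed to be declared under
// <membership><providers>, pointing their connectionStringName at
// EMEA_ADService and PACIFIC_ADService respectively.
protected bool ValidateUserForRegion(string username, string password, bool userIsInEurope)
{
    string providerName = userIsInEurope ? "EMEA_ADProvider" : "PACIFIC_ADProvider";
    var provider = System.Web.Security.Membership.Providers[providerName];
    return provider != null && provider.ValidateUser(username, password);
}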
asp.net mvc multiple connection strings
Handling multiple connection strings in ONE DataAccess Layer
http://msdn.microsoft.com/en-us/library/aa479086.aspx
The Microsoft article is particularly interesting, since it takes an architectural look at proper patterns for resolving dilemmas like yours. I think you got stuck with the short end of the stick! Best of luck!
Web.config cannot be modified at runtime. I would suggest setting some kind of flag via a login link or combo box on the website, for people to choose where they want to log in. It is not the job of the server to figure out what a user wants to do.
I am working on designing an enterprise web application that will have a single codebase and a single database (I don't need any per-tenant flexibility in the database), but different presentations based on clients. We might have 3 to 4 different clients (websites) utilizing the same core logic and skeleton but client-specific headers, footers, images, CSS, etc. I need a multi-presentation solution rather than full-fledged multi-tenancy. Most of the samples I saw online are geared towards full-fledged multi-tenancy, and I don't think I need that complicated stuff. I found some information here which is very useful in my case:
http://jasonjano.wordpress.com/2010/02/22/multi-presentation-websites-for-c/
As suggested in the above link, I am able to identify and grab a unique ID based on the requested domain, using the configuration below in my web.config file:
<configuration>
<appSettings>
<add key="MySite1.MyDomain.com" value="1"/>
<add key="www.MySite1.MyDomain.com" value="1"/>
<add key="MySite2.MyDomain.com" value="2"/>
<add key="localhost" value="1"/>
</appSettings>
</configuration>
After this, how do I dynamically select my Master page, images and CSS files based on the ID? Also, I will be populating the CustomAppSettings class (as suggested in the article) from the database. Is it advisable to make it static so it can be accessed in different layers? Otherwise, what is the recommended way?
Your suggestions would be very much appreciated.
This might help you with detecting the "tenant" from the incoming request.
I would not dynamically select a different MasterPage file, but rather render different content out to the MasterPage via HTML helpers / Partial Views (or both).
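For example, a small helper in the spirit of the article's appSettings lookup (a sketch; the fallback to "1" mirrors the localhost entry above):

using System.Configuration;
using System.Web;

// Resolves the tenant id for the current request from appSettings;
// the lookup is case-insensitive and unknown hosts fall back to "1".
public static class TenantHelper
{
    public static string CurrentTenantId(HttpContext context)
    {
        string host = context.Request.Url.Host;
        return ConfigurationManager.AppSettings[host] ?? "1";
    }
}

The returned id can then drive which partial view is rendered or which CSS folder (e.g. /Content/tenant-1/) is linked.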
Glad to see you are getting some use out of that article. Regarding the answer: I usually use a custom page class that inherits from System.Web.UI.Page. In the PreInit stage of the custom page class you can set the master page, etc. (the MasterPageFile property can only be set up to the PreInit event).
Something like (pseudo-code):
class MyCustomPage : System.Web.UI.Page
{
    protected override void OnPreInit(EventArgs e)
    {
        base.OnPreInit(e);
        // Or however you are getting your master page file.
        this.MasterPageFile = CurrentSettings.MasterPageFile;
    }
}
Then, in your pages, inherit from the MyCustomPage class instead of System.Web.UI.Page.
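So a page would simply change its base class, e.g.:

public partial class Home : MyCustomPage
{
    // Nothing tenant-specific needed here: OnPreInit in the base class
    // assigns the tenant's master page before this page initializes.
}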
Good Luck