I am trying to set up a very basic search index to index all items in a specific folder. I haven't really used searching much, but I'm trying to use out-of-the-box features because it's a very simple search; I just want to index all the fields. The Sitecore documentation doesn't provide much information. I've read a few blogs, and they all seem to suggest that I need the Advanced Database Crawler (http://trac.sitecore.net/AdvancedDatabaseCrawler) - basically, something to the effect of 'it won't work without a custom crawler'.
Is this right? I just want to create a simple index and then start using it. What is the simplest way to do this, without any shared modules or otherwise? I went through the Sitecore documentation, but it's not very clear (at least to me). It defines the different elements of the index configuration in web.config, but doesn't really explain what they do or what values are available. Maybe I'm not looking in the right place.
A simple way of creating a new Lucene index in Sitecore containing all the items below a specific node, in just 3 steps:
1: Add the configuration below under configuration/sitecore/search/configuration/indexes in the Sitecore configuration:
<!-- id must be unique -->
<index id="my-custom-index" type="Sitecore.Search.Index, Sitecore.Kernel">
  <!-- name - not sure if necessary, but use the id and forget about it -->
  <param desc="name">$(id)</param>
  <!-- folder - name of the directory on the hard drive -->
  <param desc="folder">__my-custom-index</param>
  <!-- analyzer - reference to the analyzer defined in Sitecore.config -->
  <Analyzer ref="search/analyzer" />
  <!-- list of locations to index - each of them with a unique xml tag -->
  <locations hint="list:AddCrawler">
    <!-- first location (and the only one in this case) - the specific folder from your question -->
    <!-- the type attribute is the crawler type - use the default one in this scenario -->
    <specificfolder type="Sitecore.Search.Crawlers.DatabaseCrawler,Sitecore.Kernel">
      <!-- index items from the master database -->
      <Database>master</Database>
      <!-- your folder path -->
      <Root>/sitecore/content/home/my/specific/folder</Root>
    </specificfolder>
  </locations>
</index>
2: Rebuild the new index (only one time, all further changes will be detected automatically):
SearchManager.GetIndex("my-custom-index").Rebuild();
3: Use the new index:
// use the id from the index configuration
using (IndexSearchContext indexSearchContext = SearchManager.GetIndex("my-custom-index").CreateSearchContext())
{
    // MatchAllDocsQuery will return everything. Use a proper query from the links below
    SearchHits hits = indexSearchContext.Search(new MatchAllDocsQuery(), int.MaxValue);

    // Get Sitecore items from the results of the query
    List<Item> items = hits.FetchResults(0, int.MaxValue).Select(result => result.GetObject<Item>()).Where(item => item != null).ToList();
}
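If you need something narrower than everything under the root, you can swap MatchAllDocsQuery for a regular Lucene query. A minimal sketch, assuming the index stores the item name in the standard _name field (verify the actual field names with an index viewer before relying on them):

using (IndexSearchContext context = SearchManager.GetIndex("my-custom-index").CreateSearchContext())
{
    // Term and TermQuery come from Lucene.Net.Index / Lucene.Net.Search;
    // "_name" and the lower-cased value are assumptions - check your own index.
    SearchHits hits = context.Search(new TermQuery(new Term("_name", "home")), int.MaxValue);
    List<Item> items = hits.FetchResults(0, int.MaxValue)
        .Select(r => r.GetObject<Item>())
        .Where(i => i != null)
        .ToList();
}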
Here is a PDF describing Sitecore Search and Indexing.
And here is a blog post about Troubleshooting Sitecore Lucene search and indexing.
Here is a Lucene query syntax tutorial, and Introducing Lucene.Net.
Sitecore Search Contrib (the new name for the Advanced Database Crawler) is the best option; you just configure its config file in the App_Config folder to tell it the start path, database, etc.
You can then use its API to search within folders, by template type, or where a certain field has a certain value. Here is a code example.
MultiFieldSearchParam parameters = new MultiFieldSearchParam();
parameters.Database = "web";
parameters.InnerCondition = QueryOccurance.Should;
parameters.FullTextQuery = searchTerm;
parameters.TemplateIds = templateIds; // pipe-separated template IDs, e.g. "{guid}|{guid}" (placeholder variable)
var refinements = Filters.Select(item => new MultiFieldSearchParam.Refinement(item.Value, item.Key.ToString())).ToList();
parameters.Refinements = refinements;

// The actual search
var returnItems = new List<Item>();
var runner = new QueryRunner(IndexName);
var skinnyItems = runner.GetItems(new[] { parameters });
skinnyItems.ForEach(x => returnItems.Add(Database.GetItem(new ItemUri(x.ItemID))));
return returnItems;
Otherwise you can just configure the web.config for standard Lucene search and use the code below to search (database to use, e.g. "web", start item, etc.).
public Item[] Search(string searchterms)
{
    var children = new List<Item>();
    var searchIndx = SearchManager.GetIndex(IndexName);

    using (var searchContext = searchIndx.CreateSearchContext())
    {
        var ftQuery = new FullTextQuery(searchterms);
        var hits = searchContext.Search(ftQuery);
        var results = hits.FetchResults(0, hits.Length);

        foreach (SearchResult result in results)
        {
            if (result.GetObject<Item>() != null)
            {
                // Regular Sitecore item returned
                var resultItem = result.GetObject<Item>();
                if (ParentItem == null)
                {
                    children.Add(resultItem);
                }
                else if (resultItem.Publishing.IsPublishable(DateTime.Now, false) &&
                         ItemUtilities.IsDecendantOfItem(ParentItem, resultItem))
                {
                    children.Add(resultItem);
                }
            }
        }
    }
    return children.ToArray();
}
Brian Pedersen has a nice post on it. You would start with a simple crawler: download the Advanced Database Crawler, build it, and add the reference to your project.
Then you have to create the config files mentioned in Brian's blog; you can copy them as-is (except for the template IDs and the like). You basically get the point from there.
Then you can download the Lucene Index Viewer extension for Sitecore, or the Lucene tool, to view the indexes. Check whether the documents (the entries in your indexes) are populated. These are called 'documents' in Lucene, and technically they are the content items present under the node you specified.
Hope this helps!
Let me google that for you.
I am trying to retrieve all of the items in a list from a SharePoint site. The fields are titled "Review Level Title", "Reviewer IDs", and "Review Level Priority". What I'm trying to do is get the information from all three fields separately, put it into the object I created, and then return the list with all of the objects I have created for each SharePoint item.
I have researched a lot on how to access this information from the SharePoint site, but I cannot get it to work. Here is what I have created so far:
public List<OperationsReviewLevel> Get()
{
    var operationsReviewLevels = new List<OperationsReviewLevel>();

    ClientContext context = new ClientContext(ConfigurationManager.AppSettings["SharePointEngineeringChangeRequest"]);
    var SPList = context.Web.Lists.GetByTitle("Review Levels");
    CamlQuery query = new CamlQuery();
    ListItemCollection entries = SPList.GetItems(query);
    context.Load(entries);
    context.ExecuteQuery();

    foreach (ListItem currentEntry in entries)
    {
        operationsReviewLevels.Add(new OperationsReviewLevel(currentEntry["Review Level Title"].ToString(), currentEntry["Reviewer IDs"].ToString(), (int)currentEntry["Review Level Priority"]));
    }

    return operationsReviewLevels;
}
Whenever I try this code, I receive an error saying:
Microsoft.SharePoint.Client.PropertyOrFieldNotInitializedException: The property or field has not been initialized. It has not been requested or the request has not been executed. It may need to be explicitly requested.
I cannot find any solutions to this error for my scenario online, and was wondering if anyone could see what I am doing wrong.
Thanks everyone!
After reading the comment from Alessandra Amosso under my question, I ended up debugging entries. It took a lot of digging in the debugger, but I was able to find out what the field names were being retrieved as: debugging your ListItemCollection, if you go into Data, then any entry there, and then into FieldValues, you can see the key each field value is stored under.
In my case, all spaces were replaced with _x0020_, and the word 'priority' was cut to just 'priorit' due to the length of the field name.
With this, I was able to change my foreach loop to:
foreach (ListItem currentEntry in entries)
{
    operationsReviewLevels.Add(new OperationsReviewLevel(
        currentEntry["Review_x0020_Level_x0020_Title"].ToString(),
        currentEntry["Reviewer_x0020_IDs"].ToString(),
        Convert.ToInt32(currentEntry["Review_x0020_Level_x0020_Priorit"].ToString())));
}
And it now works properly.
Hope this helps anyone in the future!
I guess you're using SharePoint Online. SharePoint Online strips special characters from a field's StaticName when the field is created; for example, Review Level Title becomes ReviewLevelTitle.
Here is my test code.
foreach (ListItem currentEntry in entries)
{
    Console.WriteLine(currentEntry["ReviewLevelTitle"].ToString() + '-' + currentEntry["ReviewerIDs"].ToString() + '-' + currentEntry["ReviewLevelPriority"]);
    //operationsReviewLevels.Add(new OperationsReviewLevel(currentEntry["Review Level Title"].ToString(), currentEntry["Reviewer IDs"].ToString(), (int)currentEntry["Review Level Priority"]));
}
If you're not using SharePoint Online, make sure the field names match as well.
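In either case, rather than guessing at the internal names, you can ask the server for them. A minimal CSOM sketch (the list title comes from the question; Field.Title and Field.InternalName are standard client object model properties):

// Load the list's field collection and print each field's display name and internal name.
var list = context.Web.Lists.GetByTitle("Review Levels");
context.Load(list.Fields);
context.ExecuteQuery();

foreach (Field field in list.Fields)
{
    // e.g. "Review Level Priority" -> "Review_x0020_Level_x0020_Priorit" (as seen in the accepted answer)
    Console.WriteLine(field.Title + " -> " + field.InternalName);
}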
The Host property of a FamilyInstance returns a RevitLinkInstance when the host is placed within a linked document. Is there a way to get the real element (or its ID) instead of the RevitLinkInstance?
I was hoping that the stable representation could give me more information, but unfortunately it doesn't.
Reference hostFaceReference = instance.HostFace;
string stableRepresentation = hostFaceReference.ConvertToStableRepresentation(instance.Document);
This would give "ac669fa6-4686-4f47-b1d0-5d7de6a40550-000a6a4a:0:RVTLINK:234297:0:218", where 234297 is the ID of the referenced element - in this case, still the RevitLinkInstance.
Have you tried this?
ElementId hostFaceReferenceId = instance.HostFace.LinkedElementId;
You could then try getting the Element via the linkedDocument.
Document LinkedDoc = RevitLinkInstance01.GetLinkDocument();
Element linkedEl = LinkedDoc.GetElement(hostFaceReferenceId);
Depending on the host you may have to go about it a few ways. For example, with a wall you could try the following (this is using LINQ by the way):
// filter the host's document's items
FilteredElementCollector linkdocfec = new FilteredElementCollector(elem_inst.Host.Document);

// establish the host's type
linkdocfec.OfClass(elem_inst.Host.GetType());

// find the host in the list by comparing the UniqueIds
Element hostwallinlinkedfile = (from posshost in linkdocfec
                                where posshost.UniqueId.ToString().Equals(elem_inst.Host.UniqueId.ToString())
                                select posshost).First();

// check the different faces of the host (wall in this case) and select the exterior one
Reference linkrefface = HostObjectUtils.GetSideFaces((hostwallinlinkedfile as HostObject), ShellLayerType.Exterior).First<Reference>();

// create a reference to the linked face in the current document (not the linked document)
Reference linkref = linkrefface.CreateLinkReference(rvtlink_other);
Ultimately, according to the docs anyway, you're supposed to utilize the CreateReferenceInLink method to get your item.
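Putting the snippets above together, the shortest path looks roughly like this. A sketch with illustrative variable names; it assumes instance.Host is the RevitLinkInstance and instance.HostFace is a valid linked reference:

// Resolve the real host element inside the linked document.
RevitLinkInstance link = instance.Host as RevitLinkInstance;
if (link != null && instance.HostFace != null)
{
    Document linkedDoc = link.GetLinkDocument();               // the document loaded inside the link
    ElementId realHostId = instance.HostFace.LinkedElementId;  // id of the element in the linked document
    Element realHost = linkedDoc.GetElement(realHostId);       // the actual host (e.g. a wall)
}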
I'm querying Wikipedia using LinqToWiki library for c#.
In particular I want to retrieve the full image url that points to wiki page File:Cinnamomum_verum.jpg
Using the official Media Wiki API the request is: http://it.wikipedia.org/w/api.php?action=query&prop=imageinfo&iiprop=url&titles=File:Cinnamomum_verum.jpg
As you can see just by entering it in a browser, the XML response contains the imageinfo structure, and in particular the url.
I cannot retrieve this information using LinqToWiki.
I use the following code:
var pages = wiki.CreateTitlesSource("File:Cinnamomum_verum.jpg");

var source = pages
    .Select(
        p =>
        PageResult.Create(
            p.info,
            p.imageinfo()
                .Select(i => new { i.comment }).ToEnumerable())
    ).ToEnumerable();

foreach (var item in source)
{
    foreach (var item2 in item.Data)
    {
        // retrieve all urls detected
    }
}
The first foreach statement correctly retrieves one element (the page), but the inner one returns none.
Has anybody encountered the same problem? Am I missing anything?
You're not missing anything, I just didn't expect that pages that don't actually exist (on the wiki you're using) could have useful data. I'll try to fix this soon, but as a temporary workaround, you could query http://commons.wikimedia.org directly for images from there.
EDIT: I have updated LinqToWiki, the new version should handle imageinfo correctly.
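For reference, once the fix is in, a query shaped like the one in the question should be able to surface the URL. This is a hedged sketch only; it assumes the generated imageinfo type exposes a url member corresponding to iiprop=url:

// Select the image URL instead of the comment ('url' is an assumed member name).
var pages = wiki.CreateTitlesSource("File:Cinnamomum_verum.jpg");

var source = pages
    .Select(p => PageResult.Create(
        p.info,
        p.imageinfo()
            .Select(i => new { i.url })   // assumed to mirror iiprop=url
            .ToEnumerable()))
    .ToEnumerable();

foreach (var page in source)
{
    foreach (var info in page.Data)
    {
        Console.WriteLine(info.url);      // full image URL
    }
}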
I am implementing Sitecore Item Buckets into my project for Sitecore 6.4, .NET 2.0.
I have everything installed and all appears to be working correctly. I created an item bucket and a bucketable template. The Search UI works perfectly in Content Editor queries. I can also query the index for my items in both Sitecore Index Viewer as well as the Sitecore Rocks Index utility.
However, when I try to implement the UI in code, I can never get any results no matter what method I try; I keep getting null back for every type of query I run. I've tried BucketManager, the Search extension method, and BucketQuery, all to no avail.
I also have debug logging turned on. When I search in the Content Editor via the Search tab I see logging for the query, but when my code executes no log entries are generated whatsoever. Does anyone have an idea of why this may be happening? I'm more than happy to provide more information to help track this down.
Item root = MasterDatabase.GetItem(Constants.ARI_BUCKET_LOCATION_ID);

var items = root.Search(out hitCount,
    text: "*",
    indexName: "itembuckets_buckets",
    location: root.ID.ToString(),
    language: "en",
    startDate: "01/01/2013",
    endDate: "12/31/2013",
    numberOfItemsToReturn: 100,
    pageNumber: 1,
    templates: tmpInventoryRate.ID.ToString());

var itemresults = root.Search(out hitCount, numberOfItemsToReturn: 100, language: "en");

var results = BucketManager.Search(root, out hitCount, templates: tmpInventoryRate.ID.ToString());

var textresults = BucketManager.Search(MasterDatabase.GetItem(Constants.ARI_BUCKET_LOCATION_PATH), out hitCount, text: "OEH", location: root.ID.ToString());

var pathresults = BucketManager.Search(MasterDatabase.GetItem(Constants.ARI_BUCKET_LOCATION_PATH), out hitCount, templates: tmpInventoryRate.ID.ToString());

var queryresults = new BucketQuery().WhereTemplateIs("*").Run(root, 100);
One thing I'm curious about: the above code executes in a DAL module that does not have access to Sitecore.Context, yet the MasterDatabase.GetItem() call does in fact retrieve the item from the master database. Could the Bucket API be referencing Context somewhere internally?
OK, so after posting I followed the path from my last comment about Context. Using Reflector to dig into the Sitecore.ItemBucket.Kernel.Util.IndexSearcher and Sitecore.ItemBucket.Kernel.Managers.BucketManager classes, I saw that both indeed reference the Context item.
Part of my problem was that I started using Item Buckets from a server-side processing script that was not part of the Sitecore content tree and therefore doesn't go through the processing pipeline, which is why I did not have a Context available during execution.
Buried in the answers here I found the way to set the current context programmatically from the web.config <site> setting: Set Active Site as Context.
Using that, I wrapped my code in the following using block to set the desired site as the current context:
int hitCount;
Item root = MasterDatabase.GetItem(Constants.ARI_BUCKET_LOCATION_ID);

using (new SecurityDisabler())
{
    Sitecore.Context.SetActiveSite("website"); // set the current context site

    var items = root.Search(out hitCount,
        text: "*",
        indexName: "itembuckets_buckets",
        location: root.ID.ToString(),
        language: "en",
        startDate: "01/01/2013",
        endDate: "12/31/2013",
        numberOfItemsToReturn: 100,
        pageNumber: 1,
        templates: "{3B0476F4-C3C4-43DD-8490-2B3FF67C368B}");
}
After making this change, I received my bucket items as expected!!
SIDE NOTE: Unrelated to this issue, but something I came across while following the code samples in the developer guide: make sure your <site name='website' ... content='master' > setting is configured properly in your web.config in order to use the Sitecore.Context.ContentDatabase.GetItem() methods used there; I also saw some references to Context.ContentDatabase in the Searcher class.
Hope this saves someone else a little more time than it took me!
On a site powered by Sitecore 6.2, I need to give the user the ability to selectively exclude items from search results.
To accomplish this, I have added a checkbox field entitled "Include in Search Results", and I created a custom database crawler to check that field's value:
~\App_Config\Include\Search Indexes\Website.config:
<search>
  <configuration type="Sitecore.Search.SearchConfiguration, Sitecore.Kernel" singleInstance="true">
    <indexes hint="list:AddIndex">
      <index id="website" singleInstance="true" type="Sitecore.Search.Index, Sitecore.Kernel">
        ...
        <locations hint="list:AddCrawler">
          <master type="MyProject.Lib.Search.Indexing.CustomCrawler, MyProject">
            ...
          </master>
          <!-- Similar entry for the web database. -->
        </locations>
      </index>
    </indexes>
  </configuration>
</search>
~\Lib\Search\Indexing\CustomCrawler.cs:
using Lucene.Net.Documents;
using Sitecore.Search.Crawlers;
using Sitecore.Data.Items;

namespace MyProject.Lib.Search.Indexing
{
    public class CustomCrawler : DatabaseCrawler
    {
        /// <summary>
        /// Determines if the item should be included in the index.
        /// </summary>
        /// <param name="item"></param>
        /// <returns></returns>
        protected override bool IsMatch(Item item)
        {
            if (item["include in search results"] != "1")
            {
                return false;
            }
            return base.IsMatch(item);
        }
    }
}
What's interesting is that if I rebuild the index using the Index Viewer application, everything behaves as expected: items whose "Include in Search Results" checkbox is not checked are not included in the search index.
However, when I use the search index rebuilder in the Sitecore Control Panel application, or when the IndexingManager auto-updates the search index, all items are included, regardless of the state of their "Include in Search Results" checkbox.
I've also set numerous breakpoints in my custom crawler class, and the application never hits any of them when I rebuild the search index using the built-in indexer; when I use Index Viewer, it does hit all the breakpoints I've set.
How do I get Sitecore's built-in indexing processes to respect my "Include in Search Results" checkbox?
I spoke with Alex Shyba yesterday, and we were able to figure out what was going on. There were a couple of problems with my configuration that were preventing everything from working correctly:
As Seth noted, there are two distinct search APIs in Sitecore. My configuration file was using both of them. To use the newer API, only the sitecore/search/configuration section needs to be set up (In addition to what I posted in my OP, I was also adding indexes in sitecore/indexes and sitecore/databases/database/indexes, which is not correct).
Instead of overriding IsMatch(), I should have been overriding AddItem(). Because of the way Lucene works, you can't update a document in place; instead, you have to first delete it and then add the updated version.
When Sitecore.Search.Crawlers.DatabaseCrawler.UpdateItem() runs, it checks IsMatch() to see if it should delete and re-add the item. If IsMatch() returns false, the item won't be removed from the index even if it shouldn't be there in the first place.
By overriding AddItem(), I was able to instruct the crawler whether the item should be added to the index after its existing documents had already been removed. Here is what the updated class looks like:
~\Lib\Search\Indexing\CustomCrawler.cs:
using Sitecore.Data.Items;
using Sitecore.Search;
using Sitecore.Search.Crawlers;

namespace MyProject.Lib.Search.Indexing
{
    public class CustomCrawler : DatabaseCrawler
    {
        protected override void AddItem(Item item, IndexUpdateContext context)
        {
            if (item["include in search results"] == "1")
            {
                base.AddItem(item, context);
            }
        }
    }
}
Alex also pointed out that some of my scalability settings were incorrect. Specifically:
The InstanceName setting was empty, which can cause problems on ephemeral (cloud) instances where the machine name might change between executions. We changed this setting on each instance to have a constant and distinct value (e.g., CMS and CD).
The Indexing.ServerSpecificProperties setting needs to be true so that each instance maintains its own record of when it last updated its search index.
The EnableEventQueues setting needs to be true to prevent race conditions between the search indexing and cache flush processes.
When in development, the Indexing.UpdateInterval should be set to a relatively small value (e.g., 00:00:15). This is not great for production environments, but it cuts down on the amount of waiting you have to do when troubleshooting search indexing problems.
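These map to ordinary entries in the <settings> section of web.config (or an include file); a sketch with illustrative values only:

<settings>
  <!-- values shown here are examples, not recommendations -->
  <setting name="InstanceName" value="CMS" />
  <setting name="Indexing.ServerSpecificProperties" value="true" />
  <setting name="EnableEventQueues" value="true" />
  <setting name="Indexing.UpdateInterval" value="00:00:15" />
</settings>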
Make sure the history engine is turned on for each web database, including remote publishing targets:
<database id="production">
  <Engines.HistoryEngine.Storage>
    <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
      <param connectionStringName="$(id)" />
      <EntryLifeTime>30.00:00:00</EntryLifeTime>
    </obj>
  </Engines.HistoryEngine.Storage>
  <Engines.HistoryEngine.SaveDotNetCallStack>false</Engines.HistoryEngine.SaveDotNetCallStack>
</database>
To manually rebuild the search indexes on CD instances, since there is no access to the Sitecore backend, I also installed RebuildDatabaseCrawlers.aspx (from this article).
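For reference, the core of such a rebuild page is tiny. A minimal sketch of an admin page code-behind (not the code from that article), assuming the index id "website" from the configuration above:

// RebuildWebsiteIndex.aspx.cs - rebuilds the Lucene index on demand from a CD instance.
using System;
using Sitecore.Search;

public partial class RebuildWebsiteIndex : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        Index index = SearchManager.GetIndex("website"); // id from the <indexes> configuration
        if (index != null)
        {
            index.Rebuild();
            Response.Write("Index 'website' rebuilt.");
        }
    }
}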
I think I've figured out a halfway solution.
Here's an interesting snippet from Sitecore.Shell.Applications.Search.RebuildSearchIndex.RebuildSearchIndexForm.Builder.Build(), which is invoked by the search index rebuilder in the Control Panel application:
for (int i = 0; i < database.Indexes.Count; i++)
{
    database.Indexes[i].Rebuild(database);
    ...
}
database.Indexes contains a set of Sitecore.Data.Indexing.Index, which do not use a database crawler to rebuild the index!
In other words, the built-in search indexer uses a completely different class when rebuilding the search index that ignores the search configuration settings in web.config entirely.
To work around this, I changed the following files:
~\App_Config\Include\Search Indexes\Website.config:
<indexes>
  <index id="website" ... type="MyProject.Lib.Search.Indexing.CustomIndex, MyProject">
    ...
  </index>
  ...
</indexes>
~\Lib\Search\Indexing\CustomIndex.cs:
using Sitecore.Data;
using Sitecore.Data.Indexing;
using Sitecore.Diagnostics;

namespace MyProject.Lib.Search.Indexing
{
    public class CustomIndex : Index
    {
        public CustomIndex(string name)
            : base(name)
        {
        }

        public override void Rebuild(Database database)
        {
            Sitecore.Search.Index index = Sitecore.Search.SearchManager.GetIndex(Name);
            if (index != null)
            {
                index.Rebuild();
            }
        }
    }
}
The only caveat to this method is that it will rebuild the index for every database, not just the selected one (which I'm guessing is why Sitecore has two completely separate methods for rebuilding indexes).
Sitecore 6.2 uses both the old and the newer search API, hence the differences in how the index gets built, I believe. CMS 6.5 (soon to be released) uses only the newer API, i.e., Sitecore.Search.