Xamarin.Forms: highlight URLs in text - C#

I have a Xamarin.Forms app that identifies URLs in a label's text, makes each URL tappable using span gesture recognizers, and opens the tapped URL in a WebView.
The sample text that I bind to the label from the API result looks like this:
test
Test description
Test description2
Files :
https://notification-assets.com/1586370078029
https://notification-assets.com/1586370078037
https://notification-assets.com/1586370078063
How I am fetching the URLs from this text:
urlStr is my text, which contains the URLs.
string[] words = urlStr.Split(' ').ToArray();
formattedString = new FormattedString();
foreach (string str in words)
{
if (IsUrl(str))
{
Span span = new Span() { Text = str + " ", TextColor = Color.FromHex("#261fee"), TextDecorations = TextDecorations.Underline, FontAttributes = FontAttributes.Italic };
span.GestureRecognizers.Add(new TapGestureRecognizer()
{
NumberOfTapsRequired = 1,
Command = new Command(async () => {
await PopupNavigation.Instance.PushAsync(new WebviewPopup(span.Text));
})
});
formattedString.Spans.Add(span);
}
else
{
Span span = new Span() { Text = str + " ", TextColor = Color.Gray };
formattedString.Spans.Add(span);
}
}
private static bool IsUrl(string url)
{
string pattern = @"((https?|ftp|file)\://|www.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\&\=;\+!'\(\)\*\-\._~%]*)*";
Regex reg = new Regex(pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
return reg.IsMatch(url);
}
This highlights the three URLs, but in my WebView I receive the URL like this:
":\nhttps://notification-assets.com/1586370078029\nhttps://notification-assets.com/1586370078037\nhttps://notification-assets.com/1586370078063\n "
First of all, I don't want to get all three URLs when I tap just one of them.
Secondly, how do I remove the \n and : from the formatted URL? How can I solve this? Any help is appreciated.
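One likely cause, offered as a sketch rather than a confirmed fix: Split(' ') splits only on the space character, so the newline-separated URLs stay together in a single token, and IsMatch returns true for that whole token. The example below uses Regex.Matches to pull out each URL individually instead. UrlTokenizer and Tokenize are hypothetical names, and the pattern is the one from IsUrl with the dot in "www." escaped, which is my assumption about the intent.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Xamarin.Forms.
public static class UrlTokenizer
{
    // Pattern taken from IsUrl in the question; "www\." is escaped here (assumption).
    private static readonly Regex UrlRegex = new Regex(
        @"((https?|ftp|file)\://|www\.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\&\=;\+!'\(\)\*\-\._~%]*)*",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Splits the text into consecutive segments and flags the ones that are URLs.
    public static List<(string Text, bool IsUrl)> Tokenize(string text)
    {
        var segments = new List<(string Text, bool IsUrl)>();
        int position = 0;

        foreach (Match match in UrlRegex.Matches(text))
        {
            // Plain text between the previous URL and this one (may contain newlines).
            if (match.Index > position)
                segments.Add((text.Substring(position, match.Index - position), false));

            // The URL itself, without surrounding whitespace or punctuation.
            segments.Add((match.Value, true));
            position = match.Index + match.Length;
        }

        // Trailing plain text after the last URL.
        if (position < text.Length)
            segments.Add((text.Substring(position), false));

        return segments;
    }
}
```

Each segment with IsUrl set to true would then become its own Span, and the tap command would pass that segment's Text to WebviewPopup rather than span.Text with the appended space.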

Related

How to highlight only results of PrefixQuery in Lucene and not whole words?

I'm fairly new to Lucene and perhaps doing something really wrong, so please correct me if that is the case. I've been searching for the answer for a few days now and am not sure where to go from here.
The goal is to use Lucene.NET to search for user names with a partial search (like StartsWith) and highlight only the matched parts. For instance, if I search for abc in a list of ['a', 'ab', 'abc', 'abcd', 'abcde'], it should return just the last three in the form of ['<b>abc</b>', '<b>abc</b>d', '<b>abc</b>de']
Here is how I approached this.
First the index creation:
using var indexDir = FSDirectory.Open(Path.Combine(IndexDirectory, IndexName));
using var standardAnalyzer = new StandardAnalyzer(CurrentVersion);
var indexConfig = new IndexWriterConfig(CurrentVersion, standardAnalyzer);
indexConfig.OpenMode = OpenMode.CREATE_OR_APPEND;
using var indexWriter = new IndexWriter(indexDir, indexConfig);
if (indexWriter.NumDocs == 0)
{
//fill the index with Documents
}
The documents are created like this:
static Document BuildClientDocument(int id, string surname, string name)
{
var document = new Document()
{
new StringField("Id", id.ToString(), Field.Store.YES),
new TextField("Surname", surname, Field.Store.YES),
new TextField("Surname_sort", surname.ToLower(), Field.Store.NO),
new TextField("Name", name, Field.Store.YES),
new TextField("Name_sort", name.ToLower(), Field.Store.NO),
};
return document;
}
The search is done like this:
using var multiReader = new MultiReader(indexWriter.GetReader(true)); //the plan was to use multiple indexes per entity types
var indexSearcher = new IndexSearcher(multiReader);
var queryString = "abc"; //just as a sample
var queryWords = queryString.SplitWords();
var query = new BooleanQuery();
queryWords
.Process((word, index) =>
{
var boolean = new BooleanQuery()
{
{ new PrefixQuery(new Term("Surname", word)) { Boost = 100 }, Occur.SHOULD }, //surnames are most important to match
{ new PrefixQuery(new Term("Name", word)) { Boost = 50 }, Occur.SHOULD }, //names are less important
};
boolean.Boost = (queryWords.Count() - index); //first words in a search query are more important than others
query.Add(boolean, Occur.MUST);
})
;
var topDocs = indexSearcher.Search(query, 50, new Sort( //sort by relevance and then in lexicographical order
SortField.FIELD_SCORE,
new SortField("Surname_sort", SortFieldType.STRING),
new SortField("Name_sort", SortFieldType.STRING)
));
And highlighting:
var htmlFormatter = new SimpleHTMLFormatter();
var queryScorer = new QueryScorer(query);
var highlighter = new Highlighter(htmlFormatter, queryScorer);
foreach (var found in topDocs.ScoreDocs)
{
var document = indexSearcher.Doc(found.Doc);
var surname = document.Get("Surname"); //just for simplicity
var surnameFragment = highlighter.GetBestFragment(standardAnalyzer, "Surname", surname);
Console.WriteLine(surnameFragment);
}
The problem is that the highlighter returns results like this:
<b>abc</b>
<b>abcd</b>
<b>abcde</b>
<b>abcdef</b>
So it "highlights" entire words even though I was searching for partials.
Explain returned NON-MATCH for every result, so I'm not sure it's helpful here.
Is it possible to highlight only the parts that were searched for, as in my example?
After searching a bit more, I came to the conclusion that to make such highlighting work, one needs to tweak how the index is generated and split the indexed terms into parts so that offsets are calculated properly. Otherwise the highlighter will only highlight the surrounding words (fragments) in their entirety.
So based on this I've managed to build a simple highlighter of my own.
public class Highlighter
{
private const string TempStartToken = "\x02";
private const string TempEndToken = "\x03";
private const string SearchPatternTemplate = $"[{TempStartToken}{TempEndToken}]*{{0}}";
private const string ReplacePattern = $"{TempStartToken}$&{TempEndToken}";
private readonly ConcurrentDictionary<HighlightKey, Regex> _regexPatternsCache = new();
private static string GetHighlightTypeTemplate(HighlightType highlightType) =>
highlightType switch
{
HighlightType.Starts => "^{0}",
HighlightType.Contains => "{0}",
HighlightType.Ends => "{0}$",
HighlightType.Equals => "^{0}$",
_ => throw new ArgumentException($"Unsupported {nameof(HighlightType)}: '{highlightType}'", nameof(highlightType)),
};
public string Highlight(string text, IReadOnlySet<string> words, string startToken, string endToken, HighlightType highlightType)
{
foreach (var word in words)
{
var key = new HighlightKey
{
Word = word,
HighlightType = highlightType,
};
var regex = _regexPatternsCache.GetOrAdd(key, _ =>
{
var parts = word.Select(w => string.Format(SearchPatternTemplate, Regex.Escape(w.ToString())));
var pattern = string.Concat(parts);
var highlightPattern = string.Format(GetHighlightTypeTemplate(highlightType), pattern);
return new Regex(highlightPattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);
});
text = regex.Replace(text, ReplacePattern);
}
return text
.Replace(TempStartToken, startToken)
.Replace(TempEndToken, endToken)
;
}
private record HighlightKey
{
public string Word { get; init; }
public HighlightType HighlightType { get; init; }
}
}
public enum HighlightType
{
Starts,
Contains,
Ends,
Equals,
}
Use it like this:
var queries = new[] { "abc" }.ToHashSet();
var search = "a ab abc abcd abcde";
var highlighter = new Highlighter();
var outputs = search
.Split((string[])null, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.Select(w => highlighter.Highlight(w, queries, "<b>", "</b>", HighlightType.Starts))
;
var result = string.Join(" ", outputs).Dump();
Util.RawHtml(result).Dump();
Output looks like this (the raw string first, then the same string rendered as HTML, where abc is shown in bold):
a ab <b>abc</b> <b>abc</b>d <b>abc</b>de
a ab abc abcd abcde
I'm open to any other better solutions.
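For the Starts case alone, a regex may not be needed. The following is a minimal sketch, assuming each word is highlighted separately as in the usage above; PrefixHighlighter and HighlightPrefix are hypothetical names and are not part of Lucene.NET.

```csharp
using System;

// Hypothetical helper covering only the "starts with" case.
public static class PrefixHighlighter
{
    // Wraps the leading part of word in the given tokens when it starts with prefix
    // (case-insensitive, ordinal comparison); otherwise returns word unchanged.
    public static string HighlightPrefix(string word, string prefix, string startToken, string endToken)
    {
        if (string.IsNullOrEmpty(prefix) ||
            !word.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        {
            return word;
        }

        // Keep the casing of the original word rather than that of the query.
        return startToken + word.Substring(0, prefix.Length) + endToken
             + word.Substring(prefix.Length);
    }
}
```

Unlike the regex version, this does not handle the Contains, Ends, or Equals modes, and it does not tolerate highlight tokens already present inside the word.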

Set Bold to a Paragraph in GemBox Document ASP.Net c#

I am using the GemBox.Document library in my ASP.NET page. I have a paragraph that contains line breaks, and I also need to make the paragraph bold.
In my code below, the variable str contains line break characters.
TRY 1
Line breaks work well with the code below:
var p3 = new Paragraph(wDoc, str);
How do I make this paragraph bold?
TRY 2
Bold works well with the code below:
var p3 = new Paragraph(wDoc,
new Run(wDoc, str) { CharacterFormat = { Bold = true } }
);
But this doesn't allow line breaks.
Please help me find a solution.
Probably the easiest way to do this is something like this:
var paragraph = new Paragraph(wDoc);
paragraph.Content.LoadText(str, new CharacterFormat() { Bold = true });
Or this:
var paragraph = new Paragraph(wDoc);
paragraph.CharacterFormatForParagraphMark.Bold = true;
paragraph.Content.LoadText(str);
But just in case you're interested, the thing to note here is that line breaks are represented with SpecialCharacter objects, not with Run objects.
So the following would be the "manual" way, in which you handle those breaks yourself by adding the correct elements to the Paragraph.Inlines collection:
string str = "Sample 1\nSample 2\nSample 3";
string[] strLines = str.Split('\n');
var paragraph = new Paragraph(wDoc);
for (int i = 0; i < strLines.Length; i++)
{
paragraph.Inlines.Add(
new Run(wDoc, strLines[i]) { CharacterFormat = { Bold = true } });
if (i != strLines.Length - 1)
paragraph.Inlines.Add(
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak));
}
That is the same as if you were using this Paragraph constructor:
var paragraph = new Paragraph(wDoc,
new Run(wDoc, "Sample 1") { CharacterFormat = { Bold = true } },
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak),
new Run(wDoc, "Sample 2") { CharacterFormat = { Bold = true } },
new SpecialCharacter(wDoc, SpecialCharacterType.LineBreak),
new Run(wDoc, "Sample 3") { CharacterFormat = { Bold = true } });

I want to load a text file into a data grid and apply a regular expression

string[] s = File.ReadAllLines(ofdl.FileName);
List<code> codes = new List<code>();
string textfile = ofdl.FileName;
var textvalues = s;
foreach (var item in textvalues)
{
codes.Add(new code() { Value = RemoveEmptyLines(item) });
}
dataGrid.ItemsSource = codes;
under_label.Content = textfile;
under_label1.Content = codes.Count();
private string RemoveEmptyLines(string lines)
{
return lines = Regex.Replace(lines, @"\n\s.+", "");
}
I want to load a text file into a data grid and apply a regular expression,
but this code doesn't work for me.
You don't need Regex to filter out empty strings. The String.IsNullOrWhiteSpace() method will do.
string[] lines = File.ReadAllLines(ofdl.FileName);
var codes = lines.Where(s => !String.IsNullOrWhiteSpace(s)).ToList();
dataGrid.ItemsSource = codes;
under_label.Content = ofdl.FileName;
under_label1.Content = codes.Count;
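Note that the snippet above binds the grid to plain strings, while the question binds it to objects with a Value property. A minimal sketch that keeps that shape is shown below; the Code class is a hypothetical stand-in for the asker's code class, and LineFilter.ToCodes is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's "code" class.
public class Code
{
    public string Value { get; set; }
}

public static class LineFilter
{
    // Drops empty and whitespace-only lines, then wraps the rest in Code objects.
    public static List<Code> ToCodes(IEnumerable<string> lines)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => new Code { Value = line })
            .ToList();
    }
}
```

The result of LineFilter.ToCodes(File.ReadAllLines(ofdl.FileName)) could then be assigned to dataGrid.ItemsSource as in the question.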
[Screenshot: datagrid]
I want to exclude the circled part separately.
List<code> codes = new List<code>();
string[] s = File.ReadAllLines(ofdl.FileName);
string textfile = ofdl.FileName;
var textvalues = s;
foreach (var item in textvalues)
{
codes.Add(new code() { Value = item});
}
dataGrid.ItemsSource = codes;
}
}
}
private void streams()
{
}
private string RemoveEmptyLines(string lines)
{
return lines = Regex.Replace(lines, @"\n\s.+", "");
}

How can I skip replacing text between two specific symbols

I used the following code snippet to replace text:
private void textBox1_TextChanged(object sender, EventArgs e)
{
string A = textBox1.Text.Trim();
string B = textBox1.Text.Trim();
A = A.Replace("AB", "CD");
A = A.Replace("GF", "HI");
A = A.Replace("AC", "QW");
A = A.Replace("VB", "GG");
textBox2.Text = (A);
}
But I want to skip this replacement for text between | symbols. For example, my code currently does this:
when I type AB GF in textBox1, textBox2 shows CD HI.
Now I need textBox2 to show AB GF when I type |AB GF| in textBox1.
I used this code to do that:
textBox2.Text = ((B.Contains("|")) ? B.Replace("|", "") : A);
But this doesn't work: once a | is typed, nothing in textBox1 is replaced any more. How can I do this?
Per your comments, you will want to split your string on the spaces prior to doing the replacement. Afterwards you will join it all back together. This is pretty easy with Linq.
public Main()
{
var strings = new string[]{ "AB GF", "|AB| GF" };
foreach (var s in strings)
Console.WriteLine(String.Join(" ", s.Split(' ').Select(x => ReplaceText(x))));
}
string ReplaceText(string text)
{
if (text.Contains("|"))
return text.Replace("|", String.Empty);
else
{
text = text.Replace("AB", "CD");
text = text.Replace("GF", "HI");
text = text.Replace("AC", "QW");
return text.Replace("VB", "GG");
}
}
Prints:
CD HI
AB HI
Looking at your code, if you need to avoid a separate ReplaceText method, something like this would work:
string A = textBox1.Text.Trim();
var subStrings = A.Split(' ');
for (int i = 0; i < subStrings.Count(); i++)
{
if (subStrings[i].Contains("|"))
subStrings[i] = subStrings[i].Replace("|", String.Empty);
else
{
subStrings[i] = subStrings[i].Replace("AB", "CD");
subStrings[i] = subStrings[i].Replace("GF", "HI");
subStrings[i] = subStrings[i].Replace("AC", "QW");
subStrings[i] = subStrings[i].Replace("VB", "GG");
}
}
textBox2.Text = String.Join(" ", subStrings);
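Both versions above decide per space-separated token, so |AB GF| would still have its second word replaced. As an alternative sketch, assuming the pipes always come in pairs, the input can be split on | so that every odd-numbered segment is the protected text between two pipes. PipeAwareReplacer and ReplaceOutsidePipes are hypothetical names.

```csharp
using System;

// Hypothetical helper illustrating the split-on-pipe approach.
public static class PipeAwareReplacer
{
    // Applies the replacements only to text outside |...| and removes the pipes.
    public static string ReplaceOutsidePipes(string input)
    {
        string[] parts = input.Split('|');

        // Even indexes are outside the pipes, odd indexes are between a pair of pipes.
        for (int i = 0; i < parts.Length; i += 2)
        {
            parts[i] = parts[i]
                .Replace("AB", "CD")
                .Replace("GF", "HI")
                .Replace("AC", "QW")
                .Replace("VB", "GG");
        }

        return string.Concat(parts);
    }
}
```

With this sketch, textBox2.Text = PipeAwareReplacer.ReplaceOutsidePipes(textBox1.Text.Trim()); would leave |AB GF| as AB GF. If a pipe has no closing partner, everything after it is treated as protected.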

highlighting text in Docx using c#

I need to highlight a sentence in a docx file. I have the code below, and it works fine for many documents, but I noticed that in some documents the text is stored word by word rather than as a whole sentence, that is, each word has its own Run. When I search for the sentence in such a document, it is not found because no single Run contains it.
NOTE: I am working with Arabic text.
private void HighLightText_userSentence(Paragraph paragraph, string text, string title, string author, decimal percentage, string _color)
{
string textOfRun = string.Empty;
var runCollection = paragraph.Descendants<Run>();
Run runAfter = null;
//find the run part which contains the characters
foreach (Run run in runCollection)
{
if (run.GetFirstChild<Text>() != null)
{
textOfRun = run.GetFirstChild<Text>().Text.Trim();
if (textOfRun.Contains(text))
{
//remove the character from this run part
run.GetFirstChild<Text>().Text = textOfRun.Replace(text, "");
runAfter = run;
break;
}
}
}
// create a new run with your customization font and the character as its text
Run HighLightRun = new Run();
RunProperties runPro = new RunProperties();
RunFonts runFont = new RunFonts() { Ascii = "Curlz MT", HighAnsi = "Curlz MT" };
Bold bold = new Bold();
DocumentFormat.OpenXml.Wordprocessing.Color color = new DocumentFormat.OpenXml.Wordprocessing.Color() { Val = _color };
DocumentFormat.OpenXml.Wordprocessing.FontSize fontSize = new DocumentFormat.OpenXml.Wordprocessing.FontSize() { Val = "22" };
FontSizeComplexScript fontSizeComplex = new FontSizeComplexScript() { Val = "24" };
Text runText = new Text() { Text = text };
//runPro.Append(runFont);
runPro.Append(bold);
runPro.Append(color);
//runPro.Append(fontSize);
// runPro.Append(fontSizeComplex);
HighLightRun.Append(runPro);
HighLightRun.Append(runText);
//HighLightRun.AppendChild(new Break());
//HighLightRun.PrependChild(new Break());
//insert the new created run part
paragraph.InsertBefore(HighLightRun, runAfter);
}
I recently used DocX and faced the same problems with searching and highlighting text. I tried an indirect way, which is simple and works in most situations: I do it with the ReplaceText method.
Here searchText is the text you want to highlight:
using (DocX doc = DocX.Load("d:\\Sample.docx"))
{
for (int i = 0; i < doc.Paragraphs.Count; i++)
{
foreach (var item in doc.Paragraphs[i])
{
if (doc.Paragraphs[i] is Paragraph)
{
Paragraph sen = doc.Paragraphs[i] as Paragraph;
Formatting form = new Formatting();
form.Highlight = Highlight.yellow;
form.Bold = true;
sen.ReplaceText(searchText, searchText, false,
System.Text.RegularExpressions.RegexOptions.IgnoreCase,
form, null, MatchFormattingOptions.ExactMatch);
}
}
}
doc.Save();
}
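For the Open XML code in the question, one way to cope with a sentence spread over several Runs is to search the concatenated text of all runs and then map the match back to the runs it overlaps. The following is only a sketch of that mapping step on plain strings; RunMatcher and FindRunsCovering are hypothetical names, and the list of strings stands in for the Text of each Run in the paragraph.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; it works on the run texts only, not on Open XML objects.
public static class RunMatcher
{
    // Returns the indexes of the runs that overlap the first occurrence of sentence
    // in the concatenated run texts, or an empty list when there is no match.
    public static List<int> FindRunsCovering(IReadOnlyList<string> runTexts, string sentence)
    {
        var covered = new List<int>();
        if (string.IsNullOrEmpty(sentence))
            return covered;

        string joined = string.Concat(runTexts);
        int start = joined.IndexOf(sentence, StringComparison.Ordinal);
        if (start < 0)
            return covered;

        int end = start + sentence.Length; // exclusive end offset of the match
        int offset = 0;

        for (int i = 0; i < runTexts.Count; i++)
        {
            int runStart = offset;
            int runEnd = offset + runTexts[i].Length;

            // A run is covered when its character range intersects the match range.
            if (runEnd > start && runStart < end)
                covered.Add(i);

            offset = runEnd;
        }

        return covered;
    }
}
```

The returned indexes identify the runs whose RunProperties would receive the bold and color settings. Runs that are only partly covered by the match would need to be split first if the highlight has to stop exactly at the sentence boundaries.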
