How to do a content-level search in Amazon S3 - C#

I have some files (.txt, .doc, .xlsx, etc.) inside a bucket in Amazon S3. Is it possible to perform a content-level search across them from my C# application? That is, when the user types a string and presses a key, the application should list every file whose content contains that string.
Is there any way to achieve this, either with an SDK method or through the web APIs?
Thanks in advance

Amazon S3 is purely a storage service. There is no search capability built into S3.
You could use a service such as Amazon CloudSearch or Amazon Elasticsearch Service (since renamed Amazon OpenSearch Service), which can index documents, but note that this involves additional configuration and additional cost.
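To make "index documents" concrete, here is a minimal sketch of the idea behind such services: an in-memory inverted index that maps words to object keys. It is not how CloudSearch or Elasticsearch is implemented or called. The class name ContentIndex and its methods are hypothetical, and the sketch assumes you have already downloaded each S3 object and extracted its plain text (formats such as .doc and .xlsx need a separate text-extraction step that is not shown).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory index: maps each word (case-insensitively) to the
// set of object keys whose extracted text contains that word.
public sealed class ContentIndex
{
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '(', ')', '"' };

    private readonly Dictionary<string, HashSet<string>> _postings =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Call once per object, after downloading it and extracting plain text.
    public void Add(string objectKey, string text)
    {
        foreach (var word in Tokenize(text))
        {
            if (!_postings.TryGetValue(word, out var keys))
            {
                keys = new HashSet<string>();
                _postings[word] = keys;
            }
            keys.Add(objectKey);
        }
    }

    // Returns the sorted keys of objects that contain every word in the query.
    public IReadOnlyCollection<string> Search(string query)
    {
        HashSet<string> result = null;
        foreach (var word in Tokenize(query))
        {
            if (!_postings.TryGetValue(word, out var keys))
            {
                return new List<string>(); // one word is missing everywhere
            }
            if (result == null)
            {
                result = new HashSet<string>(keys);
            }
            else
            {
                result.IntersectWith(keys);
            }
        }
        if (result == null)
        {
            return new List<string>(); // empty query
        }
        return result.OrderBy(k => k, StringComparer.Ordinal).ToList();
    }

    private static IEnumerable<string> Tokenize(string text)
    {
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

In use, you would call Add(key, text) for each object in the bucket and then Search("revenue report") to get the keys of objects containing both words. A managed search service adds persistence, ranking, and scaling on top of this basic idea, which is where the extra configuration and cost come from.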

You won't be able to search all of the file types you listed, but for any of your files that are structured or semi-structured, you could consider the newly released Amazon Athena, which allows querying S3 files using an SQL-like language:
https://aws.amazon.com/athena/faqs/
Amazon Athena is an interactive query service that makes it easy to
analyze data in Amazon S3 using standard SQL. Athena is serverless, so
there is no infrastructure to setup or manage, and you can start
analyzing data immediately. You don’t even need to load your data into
Athena, it works directly with data stored in S3. To get started, just
log into the Athena Management Console, define your schema, and start
querying. Amazon Athena uses Presto with full standard SQL support and
works with a variety of standard data formats, including CSV, JSON,
ORC, Apache Parquet and Avro. While Amazon Athena is ideal for quick,
ad-hoc querying and integrates with Amazon QuickSight for easy
visualization, it can also handle complex analysis, including large
joins, window functions, and arrays.

Related

Copy .csv file from Azure Blob Storage to SharePoint site

I have a CSV file stored in blob storage. The goal is to move this file into a SharePoint site and set some metadata. What would be the best way to do this? The client does not want us to use Power Automate or Logic Apps.
I tried using Azure Data Factory, but there seems to be an issue with writing data to SharePoint. I used the copy activity, but the 'sink' to SharePoint failed. Does Data Factory support writing to SharePoint?
The client does not want us to use Power Automate or Logic Apps.
Why not? This is the simplest way to achieve this, and it is also easier to maintain than, for instance, C# code.
Does data factory support writing to Sharepoint?
Yes, it does. However, using Data Factory only to copy a file to SharePoint is quite a bit of overkill.
If Logic Apps are not an option, consider an Azure Function that triggers automatically when the file is created in Azure Storage. For a C# way of uploading a file to SharePoint, see, for instance, "Upload File To SharePoint Office 365 Programmatically Using C# CSOM – PNP".

Upload a file to AWS S3 using C# without using AWS SDK

I need to create a C# SQL CLR stored procedure to upload files (data exports) to AWS S3 buckets. These files will generally be very small.
The AWS SDK cannot be installed on the SQL Servers and I am finding it difficult to find any information about how to accomplish this.
I am looking for some examples or documentation on how to accomplish uploading files without using the SDK.
My experience is mainly with SQL, with only a limited amount of C#.
You can use Amazon S3 via a REST API: Amazon S3 REST API Introduction
However, it can get a little complex, especially when it comes to computing the authentication signatures.
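As a rough illustration of the signature part, the sketch below derives a signing key and signs a string using AWS Signature Version 4, relying only on System.Security.Cryptography so that no SDK is needed. The class and method names (SigV4, DeriveSigningKey, Sign) are made up for this example. It deliberately covers only the HMAC chain: building the canonical request, the string to sign, and the Authorization header is not shown and should follow the AWS documentation.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for the HMAC steps of AWS Signature Version 4.
public static class SigV4
{
    private static byte[] HmacSha256(byte[] key, string data)
    {
        using (var hmac = new HMACSHA256(key))
        {
            return hmac.ComputeHash(Encoding.UTF8.GetBytes(data));
        }
    }

    // Derives the signing key by chaining HMACs over the date (yyyyMMdd),
    // the region, the service name, and the literal "aws4_request".
    public static byte[] DeriveSigningKey(
        string secretKey, string dateStamp, string region, string service)
    {
        byte[] kDate = HmacSha256(Encoding.UTF8.GetBytes("AWS4" + secretKey), dateStamp);
        byte[] kRegion = HmacSha256(kDate, region);
        byte[] kService = HmacSha256(kRegion, service);
        return HmacSha256(kService, "aws4_request");
    }

    // Signs an already-built "string to sign" and returns lowercase hex,
    // which is the form used in the Authorization header.
    public static string Sign(byte[] signingKey, string stringToSign)
    {
        return ToHex(HmacSha256(signingKey, stringToSign));
    }

    public static string ToHex(byte[] bytes)
    {
        var sb = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes)
        {
            sb.Append(b.ToString("x2"));
        }
        return sb.ToString();
    }
}
```

For S3 you would call DeriveSigningKey(secretKey, dateStamp, region, "s3") and pass the result to Sign together with the string to sign for your PUT request. Whether these classes are permitted inside your SQL CLR assembly depends on its permission set, so treat that as something to verify.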

Cosmos DB Attachment Limits and Alternate Attachment Locations

We're moving the data storage for our core product to Cosmos DB. For documents, it works very well but I'm having trouble finding the information I need for attachments.
I can successfully do everything I want with attachments from my C# code using the Microsoft.Azure.DocumentDB NuGet package v1.19.1.
According to information I can find, attachments are limited to 2GB total for all attachments in an account. This is hugely limiting. Info found here:
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-resources#attachments-and-media
It states:
Azure Cosmos DB allows you to store binary blobs/media either with Azure Cosmos DB (maximum of 2 GB per account) or to your own remote media store.
There seems to be some implication that you can create attachments that point to resources stored elsewhere, perhaps on a CDN. But I can't find any documentation on how to actually do this from C#.
Does anyone know if Cosmos DB can, in fact, attach to BLOB payloads stored outside of itself? If so, can the .NET NuGet package do it or is it only available for pure REST calls?
Many thanks in advance.
There's nothing inherently built-in to manage externally-stored attachments. Rather, it's up to you to store them and then reference them.
The most common pattern is to store, in the document, a URL to the specific attachment (e.g. a blob in Azure Storage). Reading an attachment then takes two operations:
A query to retrieve the document from Cosmos DB
A read from storage, based on the URL found in the returned Cosmos DB document.
Note: all responsibility is on you to manage referenced content: updating it, deleting it, etc. And if you're using blob storage, you'll need to deal with things such as private vs public access (and generating SAS for private URLs where necessary, when returning URLs to your clients, vs streaming content).
One more thing: CDN isn't a storage mechanism on its own. You cannot store something directly to CDN; that's more of a layer on top of something like Azure Storage (for public-accessible content).
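The two-step pattern described above can be sketched as follows. Everything here is hypothetical: InvoiceDocument, AttachmentReader, and the two delegates are stand-ins for your own document type, your Cosmos DB query, and your storage read. No real Cosmos DB or Azure Storage calls are shown.

```csharp
using System;

// Hypothetical document shape: the binary payload lives in external storage
// and the document only records where to find it.
public sealed class InvoiceDocument
{
    public string Id { get; set; }
    public string Customer { get; set; }
    public Uri AttachmentUrl { get; set; }
}

public sealed class AttachmentReader
{
    // Stand-in for a Cosmos DB lookup by document id.
    private readonly Func<string, InvoiceDocument> _loadDocument;

    // Stand-in for a read from external storage (e.g. a blob download).
    private readonly Func<Uri, byte[]> _downloadBlob;

    public AttachmentReader(
        Func<string, InvoiceDocument> loadDocument,
        Func<Uri, byte[]> downloadBlob)
    {
        _loadDocument = loadDocument;
        _downloadBlob = downloadBlob;
    }

    // Returns the attachment bytes, or null when the document does not
    // exist or has no attachment reference.
    public byte[] ReadAttachment(string documentId)
    {
        // Operation 1: retrieve the document that holds the reference.
        InvoiceDocument doc = _loadDocument(documentId);
        if (doc == null || doc.AttachmentUrl == null)
        {
            return null;
        }

        // Operation 2: read the payload from storage using the stored URL.
        return _downloadBlob(doc.AttachmentUrl);
    }
}
```

Keeping the two operations behind delegates is only a convenience for the sketch. The point is that the document carries a reference, and your code, not Cosmos DB, is responsible for resolving it and for keeping the referenced content in sync.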

Retrieving the words from WordNet database

I’m looking for a website that offers an API for retrieving words from the English WordNet database.
I do not want to download the WordNet database and host it on my server.
I simply want to call an API and get back results in XML format from that website.
I have a web application in ASP.NET that is written in C#.
Here is a sample from WordNet; I want to do something like this in my web application:
WordNet Online
It seems that there is no such API publicly available.
According to the Related Projects page, part of the WordNet data is available as an API via abbreviations.com:
Abbreviations.com has created free APIs based on REST calls which return a well-formatted XML result, providing both synonyms and definitions APIs based on the WordNet database.
However, in the .NET/C# section of the same page you can find some publicly available local APIs, so you don't have to implement one yourself, but you do have to download the data files.
WordNet does not seem to expose a REST or similar API that can be used. That said, you might be able to derive the URL pattern of the online search, use it in your application, and parse the response HTML.
You might want to check their website to make sure this is permitted.

Turnkey CDN solutions?

I have a C# application that generates simple JPEG images. I need to be able to store these images and recall them at various times in the future. So, I'm looking for a turnkey, secure CDN system. I have hacked my own together with a Windows server and IIS (I upload via FTP and request images over HTTP), but (1) there is effectively no need for it to be Windows and (2) it's not very cost-effective. I'll be generating approximately 1-2 GB of images each month and I need to hold the images in perpetuity.
What are some of the turnkey options for storing this many images?
I suggest storing the images on Amazon S3. It's stable, widely supported, and can plug into a variety of workflows and security models. As of August 2011, pricing starts at $0.09/GB/month for storage, and $0.12/GB for transfer (with the first GB per month free).
While many people use S3 as a cheap and good-enough CDN, Amazon also offers Amazon CloudFront, a "real" CDN that integrates neatly with S3.
Amazon maintains an official C# library that can talk to S3 and CloudFront, the AWS SDK for .NET.
I'm a fan of NetDNA http://www.netdna.com/ . I currently use them: good customer service, and inexpensive. Also, it is easy to plug into WordPress.
Check out Amazon CloudFront http://aws.amazon.com/cloudfront/
It's their CDN product, and it can serve content stored in S3. You can use the available C# libraries; examples are here: http://aws.amazon.com/code/Amazon-S3/129
