Streaming directly to a database - C#

I'm using C# and have an open TCP/IP connection receiving data. Is it possible to save the stream to an MS SQL Server database as I'm receiving it, instead of receiving all the data and then saving it all at once? If the stream could be sent to the database as it's being received, I wouldn't have to keep the entire chunk of data in memory. Is this at all possible?

Are you writing to the DB as a BLOB, or translating the data in some form, then executing inserts for each row?
Your answer in the comments has me confused. Writing a stream to a BLOB column is vastly different than getting the data and then translating it into inserts for separate rows.
Regardless, streaming into a BLOB column is possible by first creating the row containing the BLOB column you need to insert into, then repeatedly calling an update statement:
update myTable set myColumn.Write(@data, @offset, @length) where someid = @someId
for chunks of bytes from the stream.
Perfect example located here.
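A minimal sketch of that chunked-update pattern in C#, assuming the table, column and key names from the statement above (myTable, myColumn, someid) and that the row already exists with myColumn initialized to 0x so .WRITE has something to append to; the connection string and source stream are placeholders:

using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class BlobStreamer
{
    static void StreamToBlob(Stream source, string connectionString, int someId)
    {
        const int chunkSize = 8040; // multiples of 8040 bytes line up with SQL Server's page usage
        var buffer = new byte[chunkSize];

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "update myTable set myColumn.Write(@data, NULL, NULL) where someid = @someId", conn))
        {
            var data = cmd.Parameters.Add("@data", SqlDbType.VarBinary, -1);
            cmd.Parameters.Add("@someId", SqlDbType.Int).Value = someId;
            conn.Open();

            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Passing NULL as the offset makes .WRITE append the chunk to the end of the value.
                var chunk = new byte[read];
                Array.Copy(buffer, chunk, read);
                data.Value = chunk;
                cmd.ExecuteNonQuery();
            }
        }
    }
}

Each loop iteration only ever holds one buffer's worth of data in memory, which is the point of streaming it in.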

SQL Server 2005 supports HTTP endpoints (without requiring IIS) so one solution would be to create a web service on SQL Server to directly receive the streamed data.
These links explain setting one up:
http://codebetter.com/blogs/raymond.lewallen/archive/2005/06/23/65089.aspx
http://msdn.microsoft.com/en-us/library/ms345123.aspx

See here and here for examples of working with streams and databases. Essentially, you need to pass repeated buffers (ideally in multiples of 8040 bytes for SQL Server). Note that the examples are based on SQL Server 2000; with SQL Server 2005, varbinary(max) would be easier. Very similar, though.
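Going the other direction (reading a large value back without materializing it all at once) uses the same chunked idea; a sketch with the same assumed table and column names, relying on CommandBehavior.SequentialAccess so the reader doesn't buffer the whole row:

using System.Data;
using System.Data.SqlClient;
using System.IO;

class BlobReader
{
    static void ReadBlob(string connectionString, int someId, Stream destination)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "select myColumn from myTable where someid = @someId", conn))
        {
            cmd.Parameters.Add("@someId", SqlDbType.Int).Value = someId;
            conn.Open();

            using (var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
            {
                if (!reader.Read()) return;

                var buffer = new byte[8040];
                long offset = 0;
                long read;
                // GetBytes copies the next chunk of the column into the buffer until it runs out.
                while ((read = reader.GetBytes(0, offset, buffer, 0, buffer.Length)) > 0)
                {
                    destination.Write(buffer, 0, (int)read);
                    offset += read;
                }
            }
        }
    }
}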

Why not just write the buffer to a file as you receive packets, then insert to the database when the transfer is complete?
If you stream directly to the database, you'll be holding a database connection open for a long time, especially if the client's network connection isn't the best.
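A rough sketch of that approach, reusing the hypothetical myTable from above; the incoming stream is spooled to a temp file first, so the database connection is only open for the final insert:

using System.Data;
using System.Data.SqlClient;
using System.IO;

class FileThenInsert
{
    static void Receive(Stream network, string connectionString, int someId)
    {
        var tempPath = Path.GetTempFileName();

        // 1. Write the payload to disk as it arrives; no database connection is open yet.
        using (var file = File.Create(tempPath))
            network.CopyTo(file);

        // 2. Once the transfer is complete, open a short-lived connection and insert the whole payload.
        //    (For very large payloads you could combine this with the chunked .WRITE update shown earlier.)
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "insert into myTable (someid, myColumn) values (@someId, @data)", conn))
        {
            cmd.Parameters.Add("@someId", SqlDbType.Int).Value = someId;
            cmd.Parameters.Add("@data", SqlDbType.VarBinary, -1).Value = File.ReadAllBytes(tempPath);
            conn.Open();
            cmd.ExecuteNonQuery();
        }

        File.Delete(tempPath);
    }
}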

Related

Receive multipart response and treat each part as soon as received

Current situation: an existing SQL Server stored procedure I have no control upon returns 10 large strings in separate resultsets in about 30 seconds (~3 seconds per dataset). The existing ASP.NET Web API controller method that collects these strings only returns a response once all strings are obtained from the stored procedure. When the client receives the response, it takes another 30 seconds to process the strings and display the results, for a total of 1 minute from request initiation to operation completion.
Contemplated improvement: somehow transmit the strings to the client as soon as each is obtained from the SqlDataReader, so the client can work on interpreting each string while receiving the subsequent ones. The total time from request initiation to completion would thus roughly be halved.
I have considered the WebClient events at my disposal, such as DownloadStringCompleted and DownloadProgressChanged, but feel none is viable and generally think I am on the wrong track, hence this question. I have all kinds of ideas, such as saving strings to temporary files on the server and sending each file name to the client through a parallel SignalR channel for the client to request in parallel, etc., but feel I would both waste my time and miss the opportunity to be enlightened.
I would not resort to inverting the standard client / server relationship using a "server push" approach. All you need is some kind of intermediary dataset. It could be a singleton object (or multiple objects, one per client) on your server, or another table in an actual database (perhaps NoSql).
The point is that the client will not directly access the slow data flow you're dealing with. Instead the client will only access the intermediary dataset. On the first request, you will start off the process of migrating data from the slow dataset to the intermediary database and the client will have to wait until the first batch is ready.
The client will then make additional requests as he processes each result on his end. If more intermediary results are already available he will get them immediately, otherwise he will have to wait like he did on the first request.
But the server is continuously waiting on the slow data set and adding more data to the intermediate data set. You will have to have a way of marking the intermediate data as having already been sent to the client or not. You will probably want to spawn a separate thread for the code that moves data from the slow data source to the intermediate one.
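A very rough sketch of that intermediary idea; every name here (ResultStore, SlowDataController, dbo.SlowProcedure, the routes) is hypothetical. A background task drains the slow SqlDataReader into a per-job queue, and the client simply polls for whatever has arrived so far:

using System;
using System.Collections.Concurrent;
using System.Data.SqlClient;
using System.Threading.Tasks;
using System.Web.Http;

// Intermediary dataset: one queue of finished strings per job, shared across requests.
public static class ResultStore
{
    public static readonly ConcurrentDictionary<Guid, ConcurrentQueue<string>> Jobs =
        new ConcurrentDictionary<Guid, ConcurrentQueue<string>>();
    public static readonly ConcurrentDictionary<Guid, bool> Completed =
        new ConcurrentDictionary<Guid, bool>();
}

public class SlowDataController : ApiController
{
    // Kicks off the slow stored procedure and returns a job id immediately.
    [HttpPost]
    public Guid Start()
    {
        var jobId = Guid.NewGuid();
        ResultStore.Jobs[jobId] = new ConcurrentQueue<string>();
        ResultStore.Completed[jobId] = false;

        Task.Run(() =>
        {
            using (var conn = new SqlConnection("<connection string>"))
            using (var cmd = new SqlCommand("exec dbo.SlowProcedure", conn))
            {
                conn.Open();
                using (var reader = cmd.ExecuteReader())
                {
                    do
                    {
                        while (reader.Read())
                            ResultStore.Jobs[jobId].Enqueue(reader.GetString(0));
                    } while (reader.NextResult());   // one result set per large string
                }
            }
            ResultStore.Completed[jobId] = true;
        });

        return jobId;
    }

    // The client polls this while it processes earlier strings; null means "nothing new yet".
    [HttpGet]
    public string Next(Guid jobId)
    {
        string value;
        if (ResultStore.Jobs[jobId].TryDequeue(out value))
            return value;
        return ResultStore.Completed[jobId] ? "<done>" : null;
    }
}

The client starts the job, then loops on Next while interpreting each string it already has, which is exactly the overlap of transfer and processing described in the question.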

U-SQL data source as SQL Server inside a UDF

I need to extract rows from a SQL table where some columns are encrypted using SQL Server's new 'Always Encrypted' feature. I see that I cannot use the 'AZURESQLDB' DataSource feature, and decryption needs to happen before the data can be read in plain text. Are there plans to add this capability? Meanwhile, I tried to write a user-defined function that performs the same operation (connect, decrypt data and return an object) in a registered assembly, but when it runs, I get the following error:
Inner exception from user expression: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
I have checked the code and everything seems correct. The connection string is used by the SqlConnection object and works fine in all other applications. I am guessing that the connectivity to external data sources from within a UDF is blocked. Is there any way around this?
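For context, outside of U-SQL user code the decryption the question describes is normally handled by the ADO.NET driver itself once Column Encryption Setting=Enabled is in the connection string (.NET Framework 4.6.1 or later, with access to the column master key); the table and column names below are made up:

using System.Data.SqlClient;

class AlwaysEncryptedRead
{
    static void Dump(string baseConnectionString)
    {
        // The extra keyword is what makes the driver decrypt Always Encrypted columns transparently.
        var connectionString = baseConnectionString + ";Column Encryption Setting=Enabled";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("select EncryptedColumn from dbo.MyTable", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Arrives as plain text; the encryption round-trip happens inside the provider.
                    var plaintext = reader.GetString(0);
                    System.Console.WriteLine(plaintext);
                }
            }
        }
    }
}

That is presumably what the assembly registered as a UDF is attempting, and it fails because, as the answers below note, U-SQL user code cannot reach out to network resources at all.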
Are you using a DATA SOURCE in U-SQL to represent your SQL Server instance, and you cannot get it to read encrypted data? If so, please file a feature request at http://aka.ms/adlfeedback.
You cannot call out to network resources directly from within U-SQL user code for the reasons explained here.
One way around this might be to create a stored procedure which does the hard work: it decrypts the data and then renders it. Then use Azure Data Factory with a Stored Procedure task to access the decrypted data and move what you need to the Data Lake, not including the secure data. From there you could then access it using a U-SQL script. One idea? Let me know if you need me to work up more of an example.

OPCDA read time

I am polling an OPC DA server for data every second. I use the standard .NET DLLs from the OPC Foundation to achieve this.
My service is located on the same server as the OPC DA server. However, my read times are often around 900-1000 ms. Is this normal, or is something wrong in my code or server setup? I poll around 20 OPC DA tags. What is a "standard" response time for such an operation, or is it impossible to say?
It doesn't sound normal, but it's impossible to say for certain without knowing what the source of the data is.
Check the documentation of the OPC DA interface you use to fetch data from the server, and the parameters you pass to it.
If you use synchronous reads, then the problem is definitely on the server side or its backend (meaning it takes the server too much time to read the actual data).
If you use asynchronous reads (subscriptions), check the parameter named something like 'update rate'. It defines how often new data will be sent to the client. E.g. if it is 1 second, the client will receive new data no faster than once per second.
Subscriptions are supported by all OPC DA versions. If the server doesn't implement this interface, you will not be able to read asynchronously and will get an error code like 'not implemented'.
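A hedged sketch of the subscription route with the classic OPC Foundation .NET API (OpcNetApi / OpcNetApi.Com assemblies); the ProgID and tag name are placeholders, and exact member names may vary between versions of the DLLs:

using System;
using Opc.Da;

class OpcSubscriptionDemo
{
    static void Main()
    {
        // Connect to the local DA server (ProgID below is made up).
        var server = new Opc.Da.Server(new OpcCom.Factory(),
            new Opc.URL("opcda://localhost/Vendor.OpcDaServer.1"));
        server.Connect();

        // Ask the server to push changes every second instead of issuing synchronous reads.
        // The server may revise the requested update rate; check the subscription state if it matters.
        var state = new SubscriptionState { Name = "fastPoll", Active = true, UpdateRate = 1000 };
        var subscription = (Subscription)server.CreateSubscription(state);

        subscription.AddItems(new[] { new Item { ItemName = "Channel1.Device1.Tag1" } });
        subscription.DataChanged += (handle, request, values) =>
        {
            foreach (ItemValueResult v in values)
                Console.WriteLine(v.ItemName + " = " + v.Value + " @ " + v.Timestamp);
        };

        Console.ReadLine();   // keep receiving callbacks until Enter is pressed
        server.Disconnect();
    }
}

If the server rejects the subscription with a 'not implemented' style error, you are stuck with synchronous reads, and the latency you are seeing comes from the server itself.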
What OPC server are you using? There may be a setting to keep the update rate fixed or respect the client update rate.

Data processing at the client or server

I need your advice about data processing.
My server is a data server (using SQL Server 2005), and my client gets data from the server and displays it in a window.
The server and client communicate over the internet (not a LAN), so the time it takes to get data to the client depends on the data's size and the internet speed.
Assume the SQL Server has a table with 2 columns (Value and Change); the client gets data from this table (stored in a DataTable) and displays it in a DataGridView with 3 columns: Value, Change, and ChangePercent.
Note: ChangePercent = Change/Value;
I have a question: should the data in the ChangePercent field be calculated at the server or at the client?
If I do it at the server, the server will be overloaded if there are a lot of clients. Moreover, the data returned to the clients is larger (data for 3 fields).
If I do it on the client, the client will only get data with 2 fields (Value and Change), and the data in the ChangePercent column will be calculated at the client.
P/S: the connection between client and server goes through .NET Remoting. The client is a C# 2.0 WinForms application.
Thanks.
Go with calculation on the client.
Almost certainly the calculation will be faster than fetching the extra field over the wire, quite apart from the fact that business logic shouldn't be calculated on a database server anyway.
Assuming that all fields are of the same type, you needlessly increase your data transfer by 33% when calculating on the server. Obviously this only matters for large result sets.
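Since the client already holds the rows in a DataTable, the third column doesn't even need hand-written loop code; a computed DataColumn covers it (column names taken from the question):

using System.Data;

class ClientSideCalc
{
    static void AddChangePercent(DataTable table)
    {
        // Evaluated locally by ADO.NET; nothing extra crosses the wire from the server.
        var changePercent = new DataColumn("ChangePercent", typeof(decimal))
        {
            Expression = "Change / Value"   // guard upstream against Value = 0 if that can occur
        };
        table.Columns.Add(changePercent);
    }
}

Bind the DataTable to the DataGridView as usual and the computed column shows up like any other.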
I don't think it matters where you do it, a division operation won't be too much of an overhead for either the server or the client. But consider that you have to write code on the client to handle a very simple operation that can be handled on the server.
EDIT: you can make a test table with, say, 1,000,000 records and see the actual execution time with the division and without it.
I would suggest using Method #2: send 2 fields and let the third be calculated by the client.
The relative amount of calculation is very small for the client.

How many requests can SQL Server handle per second?

I am using JMeter to test our application's performance. When I send 20 requests from JMeter, the result should be 20 new records added to SQL Server, but I only find 5 new records, which means SQL Server discarded the other requests (I took a log and made sure the inserts for the new records were sent to SQL Server).
Does anyone have any ideas? What's the threshold number of requests SQL Server can handle per second? Or do I need to do some configuration?
Yeah, in my application I tried, but it seems that only 5 requests are accepted; I don't know how to configure it so it can accept more.
I'm not convinced the number of requests per second is directly related to SQL Server throwing away your inserts. Perhaps there's an application logic error that rolls back or fails to commit the inserts, or the application fails to handle concurrency and inserts data violating constraints. I'd check the server logs for deadlocks as well.
Use either SQL Profiler or the LINQ data context's logging to see what has actually been sent to the server, and then determine what the problem is.
Enable the data context log like this:
datacontext.Log = Console.Out;
As a side note, I've processed 10,000 transactions per second in SQL Server, so I don't think that is the problem.
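A minimal sketch of that logging suggestion; MyDataContext, Orders and Order are placeholders for whatever the LINQ to SQL designer generated in the application:

using System;
using System.Data.Linq;

class InsertLogging
{
    static void Run(string connectionString)
    {
        using (var db = new MyDataContext(connectionString))
        {
            // Echo every command (with parameters) that actually reaches SQL Server.
            db.Log = Console.Out;

            db.Orders.InsertOnSubmit(new Order { Amount = 42 });
            db.SubmitChanges();   // if an insert never shows up in the log, it never left the client
        }
    }
}

Twenty requests should produce twenty logged INSERTs; if only five appear, the problem is in the application, not a SQL Server limit.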
This is very dependent on what type of queries you are doing. You can have many queries requesting data which is already in a buffer, so that no disk access is required, or you can have reads which actually require disk access. If your database is small and you have enough memory, you might have all the data in memory at all times; access would be very fast then, and you might get 100+ queries/second. If you need to read from disk, you are dependent on your hardware. I have opted for an UltraSCSI-160 controller with UltraSCSI-160 drives, the fastest option you can get on a PC-type platform. I process about 75,000 records every night (they get downloaded from another server). For each record I process, the program makes about 4-10 queries to put the new record into the correct 'slot'. The entire process takes about 3 minutes. I'm running this on an 850 MHz AMD Athlon machine with 768 MB of RAM.
Hope this gives you a little indication of the speed.
This is an old case study; SQL Server 2017 is now out and 2019 is coming, and I am waiting to see what happens.
https://blogs.msdn.microsoft.com/sqlcat/2016/10/26/how-bwin-is-using-sql-server-2016-in-memory-oltp-to-achieve-unprecedented-performance-and-scale/
SQL Server 2016: 1,200,000 batch requests/sec, using memory-optimized tables with LOB support and natively compiled stored procedures.
To get benchmark tests for SQL Server and other RDBMSs, visit the Transaction Processing Performance Council (TPC) website.
You can also use SQL Server Profiler to check how your queries are executed.
