Kafka consumer `fetch.min.bytes` not working as expected - C#

I wanted to consume more than 1 message at a time, so I configured the property fetch.min.bytes, but I still end up consuming one message at a time.
There are more than 50k messages in the Kafka topic (6 partitions). Each message is approx. 10 KB, and fetch.min.bytes is set to 100 KB (100000 bytes).
I am expecting the returned count of records to be more than 1.
I am using the .NET library for the consumer, and this function to consume the data:
https://docs.confluent.io/5.5.0/clients/confluent-kafka-dotnet/api/Confluent.Kafka.IConsumer-2.html#Confluent_Kafka_IConsumer_2_Consume_CancellationToken_
I am also not able to understand: the return type of Consume(CancellationToken) is ConsumeResult<TKey, TValue>, so how can I get a List<ConsumeResult<TKey, TValue>>? There is no function available with List<ConsumeResult<TKey, TValue>> as its return type.
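One thing worth noting: fetch.min.bytes only controls how much data the broker accumulates before answering a fetch request; the Confluent.Kafka client still hands records out one at a time through Consume(). A common pattern is to build the batch yourself by looping with a deadline. A minimal sketch (the ConsumeBatch helper and its parameters are illustrative, not part of the library):

```csharp
using System;
using System.Collections.Generic;
using Confluent.Kafka;

static List<ConsumeResult<string, string>> ConsumeBatch(
    IConsumer<string, string> consumer, int maxBatchSize, TimeSpan maxWait)
{
    var batch = new List<ConsumeResult<string, string>>();
    var deadline = DateTime.UtcNow + maxWait;

    while (batch.Count < maxBatchSize)
    {
        var remaining = deadline - DateTime.UtcNow;
        if (remaining <= TimeSpan.Zero)
            break;

        // The Consume(TimeSpan) overload returns null when no record
        // arrives within the timeout.
        var result = consumer.Consume(remaining);
        if (result == null)
            break;

        batch.Add(result);
    }
    return batch;
}
```

The records are still pulled one at a time from the client's internal buffer, but because fetch.min.bytes has already filled that buffer, the loop typically completes without further network round trips.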

Related

How to get the number of elements in a Channel<T>?

I am planning a publisher/subscriber system where I plan to use Channels. I would like to log the number of elements in each Channel I use so I can adjust the number of publishers/subscribers based on the bottlenecks in my system.
I started to wrap Channel, ChannelReader and ChannelWriter in my own classes to count the number of writes and reads, but this feels like a hack. Is there a better way?
Use the source, Luke. The source tells you that (a) there is no public API to do that, (b) you can use reflection to get the value of the private property ItemsCountForDebugger on both bounded and unbounded channels, and (c) this is safe despite there being no locking in the getter. Of course this is a hack. Whether the reflection hack is better than the wrapper-class hack is a question of taste. A public API to get the approximate number of elements in a Channel<T> was requested back in 2018, and will be added in .NET 5.0 (slated for release in November).
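A sketch of the reflection approach described above. ItemsCountForDebugger is an internal implementation detail, so this can break between runtime versions; on .NET 5+ prefer the public ChannelReader<T>.Count:

```csharp
using System;
using System.Reflection;
using System.Threading.Channels;

static int GetApproximateCount<T>(Channel<T> channel)
{
    // Both BoundedChannel<T> and UnboundedChannel<T> expose this
    // private property for the debugger display.
    var prop = channel.GetType().GetProperty(
        "ItemsCountForDebugger",
        BindingFlags.Instance | BindingFlags.NonPublic);
    return (int)prop!.GetValue(channel)!;
}

var ch = Channel.CreateUnbounded<int>();
ch.Writer.TryWrite(1);
ch.Writer.TryWrite(2);
Console.WriteLine(GetApproximateCount(ch));

// On .NET 5+ the public API exists; Count is approximate by design.
if (ch.Reader.CanCount)
    Console.WriteLine(ch.Reader.Count);
```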

StackExchange.Redis HashScan returns all fields in one go

I'm using stackexchange.redis SDK in C#, and wish to scan my hash set.
I expected the SDK to behave like the redis client (when I execute "hscan myKey 0", it returns several key-value pairs and a cursor which I use for the next scan). But when I use the StackExchange.Redis SDK to call the HashScan method as follows:
redisCache.HashScan(myKey, pageSize:10, cursor: 0)
It returns all the fields in "myKey"; there are 2,000 key-value pairs in it.
How can I make it return only a few results at a time?
In the future there will be millions of fields in "myKey"; if they are all returned at once, it will cost lots of memory. And will it block the online service, since Redis is a single-threaded application?
Thanks!
It isn't doing quite what you think it is doing. The HashScan method here returns a custom iterator which maintains at most 2 pages of data; when you get near the end of one page, it fetches the next page automatically. Essentially, then, if you only want to read 20 items, just read 20 items. For example, LINQ's .Take(20) would work fine. If you call .ToList() on the iterator, then yes: it will walk from one end to the other, fetching data dynamically as it needs. So: don't do that :)
Things it does not do:
fetch all the data in a single huge call to redis
perform lots of small calls to redis before returning from the HashScan method
As a side note: the custom iterator implements a custom interface to allow you to pick up and resume cursors, if you need that.
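To make the lazy behaviour concrete, a sketch assuming an IDatabase instance named redisCache as in the question:

```csharp
using System;
using System.Linq;
using StackExchange.Redis;

// HashScan returns a lazy iterator: pages of `pageSize` entries are
// fetched from redis only as you enumerate. Take(20) therefore causes
// at most a couple of HSCAN round trips, never a full walk of the hash.
var firstTwenty = redisCache
    .HashScan("myKey", pageSize: 10)
    .Take(20)
    .ToList();

foreach (HashEntry entry in firstTwenty)
    Console.WriteLine($"{entry.Name} = {entry.Value}");
```

Calling .ToList() directly on the HashScan result, by contrast, enumerates every page, which is exactly the full-walk behaviour to avoid on a large hash.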

Identifying a property name with a low footprint

I wish to send packets to sync properties of constantly changing game objects in a game. I send notifications of when a property changes on the server side to an EntitySync object that is in charge of sending out updates for the client to consume.
Right now, I'm prefixing each packet with the property's string name. This is a lot of overhead when you're sending a lot of updates (position, HP, angle). I'd like a semi-unique way to identify these packets.
I thought about attributes (reflection... slow?), or using a suffix on the end and sending that as an ID (Position_A, HP_A), but I'm at a loss for a clean way to identify these properties quickly with a low footprint. It should consume as few bytes as possible.
Ideas?
Expanding on Charlie's explanation,
The protobuf-net library made by Marc Gravell is exactly what you are looking for in terms of serialization. To clarify, this is Marc Gravell's library, not Google's; it uses Google's protocol buffer encoding. It is one of the smallest-footprint serializers out there; in fact it will likely generate smaller packets than your manual serialization would (the way default Unity3D handles networking, yuck).
As for speed, Marc uses some very clever trickery (namely HyperDescriptors, http://www.codeproject.com/Articles/18450/HyperDescriptor-Accelerated-dynamic-property-acces) to all but remove the overhead of run-time reflection.
Food for thought on the network abstraction: take a look at Rx (http://msdn.microsoft.com/en-us/data/gg577609.aspx). Event streams are the most elegant way I have dealt with networking and multithreaded intra-subsystem communication to date:
// Sending an object:
m_eventStream.Push(objectInstance);

// 'Handling' an object when it arrives:
m_eventStream.Of(typeof(MyClass))
    .Subscribe(obj =>
    {
        MyClass thisInstance = (MyClass)obj;
        // Code here will be run when a packet arrives and is deserialized
    });
It sounds like you're trying to serialize your objects for sending over a network. I agree it's not efficient to send the full property name over the wire; this consumes way more bytes than you need.
Why not use a really fantastic library that Google invented just for this purpose?
This is the .NET port: http://code.google.com/p/protobuf-net/
In a nutshell, you define the messages you want to send such that each property has a unique id to make sending the properties more efficient:
SomeProperty = 12345
Then it just sends the id of the property and its value. It also optimizes the way it sends the values, so it might use only 1, 2, 3 bytes etc depending on how large the value is. Very clever, really.
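A minimal sketch of what such a contract looks like with protobuf-net; the EntityState type and its field tags are illustrative, not from the question:

```csharp
using System.IO;
using ProtoBuf;

[ProtoContract]
public class EntityState
{
    // The tag numbers (1, 2, 3) are the "unique id per property":
    // only the tag and the encoded value go over the wire,
    // never the property name.
    [ProtoMember(1)] public float X { get; set; }
    [ProtoMember(2)] public float Y { get; set; }
    [ProtoMember(3)] public int Hp { get; set; }
}

class Demo
{
    static byte[] Serialize()
    {
        using var ms = new MemoryStream();
        Serializer.Serialize(ms, new EntityState { X = 1f, Y = 2f, Hp = 100 });
        return ms.ToArray(); // a handful of bytes, no property names
    }
}
```

Integer fields use varint encoding, so small values like an HP of 100 really do take a single payload byte, which is the "1, 2, 3 bytes depending on how large the value is" behaviour described above.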

Custom message framing protocols for TCP server/client

I use the Net.Sockets.Socket class to write a TCP server. Since TCP operates on streams, one needs an approach to separate messages from each other. (For details, see the Message Framing post on Stephen Cleary's blog here.)
What I want to achieve is writing a TCP server class with the support for custom message framing protocols. An example initialization of this class is here:
var receiveDelimiter = Encoding.UTF8.GetBytes("[END]");
var sendDelimiter = Encoding.UTF8.GetBytes("\r\n");
var protocol = new DelimiterFramingProtocol(receiveDelimiter, sendDelimiter);
var server = new Server(protocol);
server.Start(port);
The protocol should be derived from the abstract class MessageFramingProtocol, and the server should be able to use it to separate messages. In the example above, the server should only fire its DataReceived event if the delimiter (which is "[END]") is received, and the arguments of DataReceived should only contain the part of the message that is before the delimiter. If there are more bytes received after the delimiter, the server should store them and fire DataReceived only when the delimiter is received again. The server should also send the sendDelimiter after every message that it sends.
What I need is not the whole server class or any of the protocol classes. What I need is a template, a design advice. Assuming I have a property of type FramingProtocol called Protocol in the server class, how can I use it in receiving and sending operations in the Server class? What abstract methods / properties should it have to provide the flexibility you see above? I should be able to write custom protocol classes that derive from FramingProtocol. They may use delimiters, length-prefixing, both of them, or other custom approaches to separate messages.
I wouldn't go with only one Protocol instance that is passed to the server - it will need lots of them. Provide the server with a factory class that either creates new Protocol instances or depools them from a pool created and filled at startup.
What I usually do is something like this:
RX:
Provide an 'int Protocol::addBytes(char *data, int len)' function. Fed with the address and length of raw rx data, the function returns either -1 (meaning it has consumed all the raw data without fully assembling a protocol unit), or a positive integer that is the index of data consumed at the point it assembled a valid PDU. If the instance manages to assemble a PDU, it can be further processed (e.g. fired into a 'DataReceived(Protocol *thisPDU)' event), and a new Protocol instance created (or depooled) and loaded up with the remaining raw data.
TX:
Provide (quite possibly overloaded) 'bool Protocol::loadFrom(SomeDataClass *outData, OutStreamClass *outStream)' methods that load data from whatever source into internal member vars so that a complete set of data exists to generate a serialized PDU (and return false, or raise an exception, if there is some issue - e.g. the provided data fails a sanity check). If no error is detected, the instance drives the serialized data out through the passed 'outStream' stream/socket/buffer+len.
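One way to translate the RX side of this design into C#; the class and member names below are illustrative, chosen to match the question's example:

```csharp
using System;
using System.Collections.Generic;

public abstract class FramingProtocol
{
    // Valid after AddBytes returns a non-negative index.
    public byte[]? Message { get; protected set; }

    // Returns -1 if all bytes were consumed without completing a message,
    // otherwise the index just past the last byte of the completed message
    // (the caller feeds the remaining bytes into a fresh/depooled instance).
    public abstract int AddBytes(byte[] data, int offset, int count);

    // Wraps an outgoing payload (e.g. appends the send delimiter).
    public abstract byte[] Frame(byte[] payload);
}

public sealed class DelimiterFramingProtocol : FramingProtocol
{
    private readonly byte[] _recvDelim;
    private readonly byte[] _sendDelim;
    private readonly List<byte> _buffer = new List<byte>();

    public DelimiterFramingProtocol(byte[] receiveDelimiter, byte[] sendDelimiter)
    {
        _recvDelim = receiveDelimiter;
        _sendDelim = sendDelimiter;
    }

    public override int AddBytes(byte[] data, int offset, int count)
    {
        for (int i = 0; i < count; i++)
        {
            _buffer.Add(data[offset + i]);
            if (EndsWithDelimiter())
            {
                // Expose the message without the delimiter.
                Message = _buffer
                    .GetRange(0, _buffer.Count - _recvDelim.Length)
                    .ToArray();
                _buffer.Clear();
                return offset + i + 1;
            }
        }
        return -1;
    }

    private bool EndsWithDelimiter()
    {
        if (_buffer.Count < _recvDelim.Length) return false;
        for (int j = 0; j < _recvDelim.Length; j++)
            if (_buffer[_buffer.Count - _recvDelim.Length + j] != _recvDelim[j])
                return false;
        return true;
    }

    public override byte[] Frame(byte[] payload)
    {
        var framed = new byte[payload.Length + _sendDelim.Length];
        payload.CopyTo(framed, 0);
        _sendDelim.CopyTo(framed, payload.Length);
        return framed;
    }
}
```

A length-prefixing protocol would implement the same two members, so the Server class only ever talks to FramingProtocol (plus the factory/pool described above for creating instances).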

TPL Dataflow message type allocation pattern possible?

I run an algorithm that receives out-of-process messages of different types. The incoming messages are actually byte arrays, and each byte array is prepended with a byte-array flag indicating the message type. I'd like to understand whether it is possible to set up an IPropagator<byte[], byte[]> that processes the incoming byte arrays, interprets the byte-array flags, and then streams each byte array to a specific corresponding linked ActionBlock.
For example, let's say I have 2 different message types and 2 different corresponding ActionBlocks that should only receive messages matching the message type they are supposed to receive. I believe if I just link the IPropagatorBlock to both ActionBlocks, then both ActionBlocks will receive the same message? How can I correctly route each message depending on its flag (do not worry about the flag; the identification is trivial, so let's assume I know at any time to which ActionBlock the IPropagatorBlock wants to stream the message)? I am struggling with correctly setting up the dataflow structure. I would love to be able to link the data blocks directly to each other rather than having to Post(). Is that possible?
Any help in that regard is much appreciated.
This depends on the IPropagatorBlock that you're using. If it's a custom one, you can do anything, including for example recognizing which target block to use based on the order they're linked (or something more reliable).
But assuming the block is actually a normal TransformBlock (or something like that), I think the best option is to use the overload of LinkTo() that takes a predicate, and adding the flag to the output type (which means changing the type of the block to IPropagatorBlock<byte[], Tuple<FlagType, byte[]>>, or a custom type instead of the Tuple). If you do this, then you can be sure that the target will receive the message only if the predicate for that target matches the message.
Also, what happens if you link one source block to more target blocks depends on the source block. In most cases, it sends each message to exactly one target: it first tries the first target, and only tries the second one if the first one declines or postpones the message. The exception to this rule is BroadcastBlock (and the similar WriteOnceBlock), that always tries to send each message to all targets. Again, a custom block can behave any way it wants.
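A sketch of the predicate-based LinkTo() routing described above; FlagType, the first-byte flag convention, and the block names are stand-ins for whatever the real header encodes:

```csharp
using System;
using System.Threading.Tasks.Dataflow;

enum FlagType { TypeA, TypeB }

class Router
{
    static void Build()
    {
        // The classifier adds the flag to the output type, as suggested.
        var classifier = new TransformBlock<byte[], Tuple<FlagType, byte[]>>(raw =>
        {
            // Assumption: the first byte encodes the message type.
            var flag = raw[0] == 0 ? FlagType.TypeA : FlagType.TypeB;
            return Tuple.Create(flag, raw);
        });

        var handlerA = new ActionBlock<Tuple<FlagType, byte[]>>(
            msg => Console.WriteLine($"A got {msg.Item2.Length} bytes"));
        var handlerB = new ActionBlock<Tuple<FlagType, byte[]>>(
            msg => Console.WriteLine($"B got {msg.Item2.Length} bytes"));

        // Each message goes to exactly one target: the first linked
        // target whose predicate matches.
        classifier.LinkTo(handlerA, msg => msg.Item1 == FlagType.TypeA);
        classifier.LinkTo(handlerB, msg => msg.Item1 == FlagType.TypeB);

        // Catch-all: discard anything neither predicate accepts, so an
        // unmatched message can never stall the pipeline.
        classifier.LinkTo(DataflowBlock.NullTarget<Tuple<FlagType, byte[]>>());
    }
}
```

The catch-all NullTarget link matters: with predicate links only, a message that matches no predicate stays stuck in the source block's output buffer and blocks everything behind it.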
