So for my computer architecture class I have to simulate a cache/memory relationship in C#, and I'm just not sure how a cache actually stores data. I get the concept of a cache tag, but I really don't get the offset.
So say I have a RAM that holds 256 32-bit integers. I want a caching system where I have a cache that holds eight 32-bit integers. But then into those 32-bit integers in the cache I need to fit the tag and valid bit, which sizes the data field down to roughly 26 bits or so. So how can I store the 32 bits of data in the remaining 26 bits?
Here is the code I have right now; it's still a work in progress:
class Memory
{
    public List<UInt32> instructions = new List<UInt32>();

    const int directCacheSize = 8;
    ulong[] directCache = new ulong[directCacheSize];

    private int[] stack = new int[256];

    public int this[int i]
    {
        get
        {
            int directMapIndex = i % directCacheSize;
            if (directCache[directMapIndex] == Convert.ToUInt64(stack[i]))
            {
                // 18446744069414584320 == 0xFFFFFFFF00000000
                return Convert.ToInt32(directCache[directMapIndex] - 18446744069414584320);
            }
            else
            {
                directCache[directMapIndex] = (Convert.ToUInt64(i) << 32) + Convert.ToUInt64(stack[i]);
                return stack[i];
            }
        }
        set
        {
            stack[i] = value;
        }
    }
}
I've been trying to understand this mostly incoherent question and I think I've got it.
I was thinking originally that your fundamental error is that the bits used to maintain the cache data structure are subtracted from the data size; that doesn't make any sense. They are added to the data size.
But then I realized that no, your fundamental error is that you have confused bits with bytes. You are subtracting six bits from 32 bytes to nonsensically get 26, but you should be adding six bits to 32 x 8 bits.
The error that is just plain confusing is that you also seem to have confused the offset into the data block with the data block itself. The data block stores the data. The offset identifies the location of the relevant data within the data block. The offset is part of an effective address, not a cache line!
You also seem to have forgotten about the dirty bit throughout.
So then a single block in my direct map cache would look like this: [tag 5bit][data 32bit]
No. The number of times that 32 appears in this problem has confused you deeply:
eight 32-bit words is 32 bytes
five bits can represent 32 possible tags
If you have 32 tags and 1024 bytes then each tag identifies 32 bytes
That's a lot of 32s and you've confused them terribly.
Start over.
Suppose you want a single line with eight 32-bit words in it, for a total of 32 bytes. What has to be in the single cache line?
the 32 bytes -- NOT BITS
a tag identifying where the 32 bytes came from
one validity bit
one dirty bit
If we assume that the 32 bytes can only be on 32-byte boundaries, and there are 1024 / 32 = 32 such boundaries, then for the tag we need log2(32) = 5 bits, so the total cache line size would be:
32 bytes NOT BITS of data
5 bits of tag
one validity bit
one dirty bit
Make sense?
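A minimal C# sketch of that single cache line for your simulator, with illustrative names of my own (the `Split` helper assumes the 1024-byte memory and 32-byte blocks from the example above):

```csharp
using System;

// Splitting a 10-bit address (0..1023): the low 5 bits select a byte
// inside the 32-byte block, the high 5 bits identify which of the
// 32 possible blocks it is (that's the tag).
static (int tag, int offset) Split(int address) => (address >> 5, address & 0x1F);

var (tag, offset) = Split(1023);
Console.WriteLine($"tag={tag} offset={offset}"); // tag=31 offset=31

// One cache line, with exactly the fields listed above:
struct CacheLine
{
    public byte[] Data;  // the 32-byte data block itself -- bytes, not bits
    public byte Tag;     // only the low 5 bits are meaningful (32 possible blocks)
    public bool Valid;   // has this line ever been filled?
    public bool Dirty;   // has it been written since it was fetched?
}
```

Note that the offset never appears in the cache line: it's part of the address, used to pick a byte out of `Data` after the tag matches.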
I've downloaded this library https://github.com/robertvazan/crc32c.net for my project I'm working on. I need to use CRC in a part of my project so I downloaded the library as it is obviously going to be much faster than anything I'm going to write in the near future.
I have some understanding of how CRC works; I once made a working software implementation of it (as part of learning), but I must be doing something incredibly stupid while trying to get this library to work without realizing it. No matter what I do, I can't seem to get crc = 0 even though the arrays were not changed.
Basically, my question is, how do I actually use this library to check for integrity of a byte array?
The way I understand it, I should call Crc32CAlgorithm.Compute(array) once to compute the CRC the first time, and then call it again on an array that has the previously returned value appended (I've tried appending it, as well as setting the last 4 bytes of the array to zeroes before putting the returned value there); if the second call returns 0, the array was unchanged.
Please help me, I don't know what I'm doing wrong.
EDIT: It doesn't work right when I do this (yes, I realize LINQ is very slow; this is just an example):
using (var hash = new Crc32CAlgorithm())
{
    var array = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8 };
    var crc = hash.ComputeHash(array);
    var arrayWithCrc = array.Concat(crc).ToArray();
    Console.WriteLine(string.Join(" ", hash.ComputeHash(arrayWithCrc)));
}
Console outputs: 199 75 103 72
You do not need to append a CRC to a message and compute the CRC of that in order to check a CRC. Just compute the CRC on the message on one end, send that CRC along with the message, compute CRC on just the message on the other end (not including the sent CRC), and then compare the CRC you computed to the one that was sent with the message.
They should be equal to each other. That's all there is to it. That works for any hash you might use, not just CRCs.
If you feel deeply compelled to make use of the lovely mathematical property of CRCs where computing the CRC on the message with its CRC appended gives a specific result, you can. You have to append the CRC bits in the correct order, and you need to look for the "residue" of the CRC, which may not be zero.
In your case, you are in fact appending the bits in the correct order (by appending the bytes in little-endian order), and the result you are getting is the correct residue for the CRC-32C. That residue is 0x48674bc7, which separated into bytes, in little-endian order, and then converted into decimal is your 199 75 103 72.
You will find that if you take any sequence of bytes, compute the CRC-32C of that, append that CRC to the sequence in little-endian order, and compute the CRC-32C of the sequence plus CRC, you will always get 0x48674bc7.
However, that's a smidge slower than just comparing the two CRCs, since now you have to compute a CRC on four more bytes than before. So, really, there's no need to do it this way.
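If you want to see the residue property in action without the library, here is a sketch using my own bit-at-a-time CRC-32C (the library computes the same CRC-32C function, just much faster):

```csharp
using System;

byte[] msg = { 1, 2, 3, 4, 5, 6, 7, 8 };
uint crc = Crc32C(msg);

// Append the CRC in little-endian byte order and recompute:
byte[] withCrc = new byte[msg.Length + 4];
msg.CopyTo(withCrc, 0);
BitConverter.GetBytes(crc).CopyTo(withCrc, msg.Length); // little-endian on common platforms
Console.WriteLine("Residue = 0x{0:X8}", Crc32C(withCrc)); // Residue = 0x48674BC7

// Minimal bit-at-a-time CRC-32C (Castagnoli, reflected polynomial 0x82F63B78,
// initial value and final XOR of 0xFFFFFFFF), for illustration only.
static uint Crc32C(byte[] data)
{
    uint crc = 0xFFFFFFFF;
    foreach (byte b in data)
    {
        crc ^= b;
        for (int k = 0; k < 8; k++)
            crc = (crc & 1) != 0 ? (crc >> 1) ^ 0x82F63B78u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFF;
}
```

Change `msg` to anything you like; the recomputed value over message-plus-CRC stays 0x48674BC7, never zero.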
Does C# have a 4-bit data type? I want to make a program with variables that waste the minimum amount of memory, because the program will consume a lot of it.
For example: I need to save a value that I know will go from 0 to 10; a 4-bit variable can go from 0 to 15, which is perfect. But the closest I found was the 8-bit (1 byte) data type Byte.
I have the idea of creating a C++ DLL with a custom data type, something like a nibble. But if that's the solution to my problem, I don't know where to start or what I have to do.
Limitations: Creating a Byte and splitting it in two is NOT an option.
No, there is no such thing as a four-bit data type in C#.
Incidentally, four bits will only store a number from 0 to 15. To compute the range of a variable given that it has N bits, the maximum is (2^N) - 1; for N = 4 that's 2^4 - 1 = 15, so it does cover your 0-to-10 range.
If you need to use a data type that is less than 8 bits in order to save space, you will need to use a packed binary format and special code to access it.
You could for example store two four-bit values in a byte using an AND mask plus a bit shift, e.g.
byte source = 0xAD;
var hiNybble = (source & 0xF0) >> 4; //Left hand nybble = A
var loNybble = (source & 0x0F); //Right hand nybble = D
Or using integer division and modulus, which works well too but maybe isn't quite as readable:
var hiNybble = source / 16;
var loNybble = source % 16;
And of course you can wrap these in extension methods (note they must live in a static class, and the arithmetic needs a cast back to byte):
static class NybbleExtensions
{
    public static byte GetLowNybble(this byte input)
    {
        return (byte)(input % 16);
    }

    public static byte GetHighNybble(this byte input)
    {
        return (byte)(input / 16);
    }
}
var hiNybble = source.GetHighNybble();
var loNybble = source.GetLowNybble();
Storing it is easier:
source = (byte)(hiNybble * 16 + loNybble);
Updating just one nybble is harder (note the parentheses: & binds more loosely than +, and | is the right operator for combining the halves):
source = (byte)((source & 0xF0) | loNybble); //Update only low four bits
source = (byte)((source & 0x0F) | (hiNybble << 4)); //Update only high bits
A 4-bit data type (AKA a nibble) only goes from 0 to 15; going from 0 to 127 would require 7 bits. Essentially you need a byte.
No, C# does not have a 4-bit numeric data type. If you wish to pack 2 4-bit values in a single 8-bit byte, you will need to write the packing and unpacking code yourself.
No; even a bool takes 8 bits of storage.
You can use the >> and << operators to store and read two 4-bit values from one byte.
https://msdn.microsoft.com/en-us/library/a1sway8w.aspx
https://msdn.microsoft.com/en-us/library/xt18et0d.aspx
Depending on how many of your nibbles you need to handle and how much of an issue performance is versus memory usage, you might want to have a look at the BitArray and BitVector32 classes. For passing values around, you'd still need bigger types, though.
Yet another option could also be StructLayout fiddling, ... beware of dragons though.
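A quick sketch of the BitVector32 route, packing two 0-15 values into one vector (the section names are made up):

```csharp
using System;
using System.Collections.Specialized;

// Two 4-bit "sections" carved out of one 32-bit vector.
var low = BitVector32.CreateSection(15);        // bits 0-3 (max value 15)
var high = BitVector32.CreateSection(15, low);  // bits 4-7, placed after `low`
var vector = new BitVector32(0);

vector[low] = 10;
vector[high] = 3;
Console.WriteLine($"{vector[high]} {vector[low]}"); // 3 10
Console.WriteLine(vector.Data);                     // 58 (0x3A = 3 << 4 | 10)
```

BitVector32 is a struct wrapping a single int, so this costs 4 bytes regardless of how many sections you define (up to 32 bits' worth).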
I am working on a piece of software that analyzes E01 bitstream images. Basically these are forensic data files that allow a user to compress all the data on a disk into a single file. The E01 format embeds data about the original data, including MD5 hash of the source and resulting data, etc. If you are interested in some light reading, the EWF/E01 specification is here. Onto my problem:
The e01 file contains a "table" section which is a series of 32 bit numbers that are offsets to other locations within the e01 file where the actual data chunks are located. I have successfully parsed this data out into a list doing the following:
this.ChunkLocations = new List<int>();
// hack: will this overflow? We are adding two integers to a long.
long currentReadLocation = TableSectionDescriptorRef.OffsetFromFileStart + c_SECTION_DESCRIPTOR_LENGTH + c_TABLE_HEADER_LENGTH;
byte[] currReadBytes;
using (var fs = new FileStream(E01File.FullName, FileMode.Open))
{
    fs.Seek(currentReadLocation, SeekOrigin.Begin);
    for (int i = 0; i < NumberOfEntries; i++)
    {
        currReadBytes = new byte[c_CHUNK_DATA_OFFSET_LENGTH];
        fs.Read(currReadBytes, 0, c_CHUNK_DATA_OFFSET_LENGTH);
        // note the cast: ToUInt32 returns uint, which won't implicitly convert to int
        this.ChunkLocations.Add((int)BitConverter.ToUInt32(currReadBytes, 0));
    }
}
c_CHUNK_DATA_OFFSET_LENGTH is 4, i.e. each entry is a 4-byte ("32-bit") number.
According to the EWF/E01 specification, "The most significant bit in the chunk data offset indicates if the chunk is compressed (1) or uncompressed (0)". This appears to be borne out: if I convert the offsets to ints, there are large negative numbers in the results (for compressed chunks, no doubt), and most of the other offsets appear to be correctly incremented, but every once in a while there is crazy data. The data in ChunkLocations looks something like this:
346256
379028
-2147071848
444556
477328
510100
Where with -2147071848 it appears the MSB was set to indicate compression.
QUESTIONS: So, if the MSB is used to flag the presence of compression, then really I'm dealing with a 31-bit number, right?
1. How do I ignore the MSB / compute a 31-bit number when figuring the offset value?
2. This seems like a strange standard, since it would significantly limit the size of the offsets you could have, so I'm questioning whether I'm missing something. The offsets do seem correct when I navigate to those locations within the e01 file.
Thanks for any help!
This sort of thing is typical when dealing with binary formats. As dtb pointed out, 31 bits is probably plenty large for this application, because it can address offsets up to 2 GiB. So they use that extra bit as a flag to save space.
You can just mask off the bit with a bitwise AND:
const UInt32 COMPRESSED = 0x80000000; // Only bit 31 on
UInt32 raw_value = 0x80004000; // test value
bool compressed = (raw_value & COMPRESSED) > 0;
UInt32 offset = raw_value & ~COMPRESSED;
Console.WriteLine("Compressed={0} Offset=0x{1:X}", compressed, offset);
Output:
Compressed=True Offset=0x4000
If you just want to strip off the leading bit, perform a bitwise and (&) of the value with 0x7FFFFFFF
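Applied to one of the values from the question's list (the arithmetic is mine; note that the recovered offset 411800 also continues the +32772 stride of the neighbouring entries):

```csharp
using System;

int raw = -2147071848;           // value from the question's ChunkLocations
bool compressed = raw < 0;       // negative as int means the MSB is set
int offset = raw & 0x7FFFFFFF;   // strip the flag bit, keep the 31-bit offset

Console.WriteLine($"{compressed} {offset}"); // True 411800
```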
I'm building a simulator for a 40-card deck game. The deck is divided into 4 suits, each one with 10 cards. Since only 1 suit is different from the others (let's say, hearts), I've thought of a quite convenient way to store a set of 4 cards with the same value in 3 bits: the first two indicate how many cards of a given value are left, and the last one is a marker that tells whether the heart card of that value is still in the deck.
So,
{7h 7c 7s} = 101
That allows me to store the whole deck in 30 bits of memory instead of 40. Now, when I was programming in C, I'd have allocated 4 chars (1 byte each = 32 bits) and played with the values using bit operations.
In C# I can't do that, since chars are 2 bytes each and playing with bits is much more of a pain, so the question is: what's the smallest amount of memory I'll have to use to store the data required?
PS: Keep in mind that I may have to allocate 100k+ of those decks in system memory, so saving 10 bits is quite a lot.
in C, I'd have allocated 3 chars ( 1 byte each = 32 bits)
3 bytes gives you 24 bits, not 32... you need 4 bytes to get 32 bits. (Okay, some platforms have non-8-bit bytes, but they're pretty rare these days.)
In C# I can't do that, since chars are 2 bytes each
Yes, so you use byte instead of char. You shouldn't be using char for non-textual information.
and playing with bits is much more of a pain
In what way?
But if you need to store 30 bits, just use an int or a uint. Or, better, create your own custom value type which backs the data with an int, but exposes appropriate properties and constructors to make it easier to work with.
PS: Keep in mind that i may have to allocate 100k+ of those decks in system's memory, so saving 10 bits is quite a lot
Is it a significant amount, though? If it turned out you needed to store 8 bytes per deck instead of 4, that would mean 800KB instead of 400KB for 100,000 of them. Still less than a megabyte of memory. That's not that much...
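As a sketch of that custom value type idea, following the 3-bits-per-card-value scheme from the question (all names here are mine):

```csharp
using System;

var deck = new Deck();
deck.Set(6, count: 2, heart: true);   // {7h 7c 7s} => group 101 for value index 6
Console.WriteLine(deck.GetCount(6));  // 2
Console.WriteLine(deck.HeartLeft(6)); // True

// Backs the whole deck with a single uint (30 of 32 bits used): ten 3-bit
// groups, each [2-bit count of non-heart cards left][1-bit heart-in-deck flag].
struct Deck
{
    private uint bits;

    public int GetCount(int value)    // value: 0..9
        => (int)((bits >> (value * 3 + 1)) & 0x3);

    public bool HeartLeft(int value)
        => ((bits >> (value * 3)) & 1) != 0;

    public void Set(int value, int count, bool heart)
    {
        uint group = ((uint)count << 1) | (heart ? 1u : 0u);
        bits = (bits & ~(7u << (value * 3))) | (group << (value * 3));
    }
}
```

Since Deck is a struct wrapping one uint, an array of 100,000 of them is a single contiguous 400KB allocation.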
In C#, unlike in C/C++, the concept of a byte is not overloaded with the concept of a character.
Check out the byte datatype, in particular a byte[], which many of the APIs in the .Net Framework have special support for.
C# (and modern versions of C) have a type that's exactly 8 bits: byte (or uint8_t in C), so you should use that. C char usually is 8 bits, but it's not guaranteed and so you shouldn't rely on that.
In C#, you should use char and string only when dealing with actual characters and strings of characters, don't treat them as numbers.
I have come across a problem I cannot seem to solve.
I have a type of file "ASDF", and in their header I can get the necessary information to read them. The problem is that one of the "fields" is only 4 bits long.
So, let's say it's like this:
From bit 0 to 8 it's the index of the current node (I've read this already)
From 8 to 16 it's the index for the next node (Read this as well)
From bit 16 to 20 Length of the content (string, etc..)
So my problem is that if I try to read the "length" with a byte reader I will be losing 4 bits of information, or will be "4 bits off". Is there any way to read only 4 bits?
You should read this byte as you read the others, then apply a bitmask of 0x0F.
For example
byte result = (byte)(byteRead & 0x0F);
this will preserve the lower four bits in the result.
if the needed bits are the high four, then you should mask with 0xF0 and shift right instead:
byte result = (byte)((byteRead & 0xF0) >> 4);
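Putting it together for the layout described in the question, a sketch (the field names, and the assumption that the length sits in the low four bits of the third byte, are mine):

```csharp
using System;

// Extracts the three fields from the 20-bit header described in the question:
// byte 0 = current node index (bits 0-7), byte 1 = next node index (bits 8-15),
// low nibble of byte 2 = content length (bits 16-19). Bit order is an assumption.
static (int current, int next, int length) ParseHeader(byte[] header) =>
    (header[0], header[1], header[2] & 0x0F);

var (current, next, length) = ParseHeader(new byte[] { 5, 9, 0xAC });
Console.WriteLine($"current={current} next={next} length={length}");
// current=5 next=9 length=12
```

The remaining high four bits of the third byte stay available via `header[2] >> 4` if the format uses them for something else.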