[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Mechanism to request physical memory with certain properties
From: |
Michael D. Adams |
Subject: |
Re: Mechanism to request physical memory with certain properties |
Date: |
Sat, 11 Jun 2005 15:06:20 -0500 |
You've got me very curious. What is this type of mechanism intended
to be used for?
Michael D. Adams
address@hidden
(Sorry this is a month after the origonal e-mail. I've been working
though a large back log. If an e-mail send after mid-May already has
the answer to my question I won't have seen it yet.)
On 5/16/05, Matthieu Lemerre <address@hidden> wrote:
>
> Hi,
>
> Neal asked me to find a mechanism to request physical memory with
> constraints on location, so here's my proposal.
>
> I'm asking for your advises before writing code and precising some
> things. Maybe some of my explanations are also unclear. Maybe
> there's faster ways to achieve this. Please tell me!
>
>
> Mechanism to request physical memory with certain properties
> ============================================================
>
> Description:
> ------------
>
> We need a mechanism to request physical memory with constraints on location.
> Constraints are requirements on the bits on the address:
>
> 01X0X10X11X00X means that the bytes you require on the allocation must have 1
> and 0 where specified, and can be either 1 or 0 when there is a 'X'. To do
> this,
> we will pass a pair of two words like this:
>
> *bits in the first word will indicate bits which have to be on in
> the address of the physical memory eventually choosen
>
> *bits on in the second word will indicate bits which have to be off
> in the address of the physical memory eventually choosen
>
> So with the example above, 01X0X10X11X00X is represented by
> (01000100110000, 10010010000110).
>
> The allocation unit is the frame, wich is 2^p bytes (p=12 on IA32)
>
>
> Possible solution:
> ------------------
>
> Here's my proposal to solve this problem.
>
> Memory is represented by a table of bits, in which a bit is set to 1
> if the corresponding frame is free, and 0 if not (or not existing).
>
> The table is the following:
>
> |2^(N-z)-1 | 2^(N-z)-2 | ... | 1 | 0 |
> --------|------------------------------------|
> 0 | 0 1 0 0 |
> 1 | 1 1 0 1 |
> 2 | 0 1 (A) 0 1 |
> ... | |
> (2^z)-1 | 1 0 1 1 |
>
>
> Frames are numbered from 0 to (2^N)-1. The number of a frame in the table is:
> col_number * 2^z + row_number.
>
>
> For instance, frame (A) is free (its bit is set to 1), and its frame number is
> (2^(N-z)-2) * 2^z + 2.
>
>
> So, any frame number can be decomposed like this: 01001
> 101001010010010001001001
> ^^^^^
> ^^^^^^^^^^^^^^^^^^^^^^^^
> (N-z) bits z bits
>
> Similarly, we decompose the constraint word like this:
>
> 01XX1 10X10XX11X00X1 XXXXXXXXXXXX
> ^^^^^ ^^^^^^^^^^^^^^ ^^^^^^^^^^^^
> (N-z)bits z bits p undefined bits
>
> (since the allocation unit is the frame, you can't impose constraints on the
> 12
> last bits)
>
> The general idea is:
> 1/To generate a mask of possible cols (made with the N-z bits of the
> constraint word)
> 2/To iterate over the possible rows (with respect of the z bits of the
> constraint words)
> 3/In each iteration, we apply a logical AND on the mask and the row. This
> give
> which frames were allocated. Then we clear the corresponding bytes.
>
> It's simpler for the algorithm if 2^(N-z) is the size of a word ( N-z=5 on
> IA32).
>
>
> Iteration over the possible rows:
> ---------------------------------
>
> Here's the algorithm to iterate over the possible rows: (Pseudo-C code)
>
> void
> iterate_rows (word_t constraint1, word_t constraint0, function_t function)
> {
> iterate_rows_rec (constraint1 >> p, constraint0 >> p,
> 2^z, 0, function);
> }
>
> void
> iterate_rows_rec (word_t constraint1, word_t constraint0,
> word_t mask, word_t beginning_of_word, function)
> {
> if (mask == 0)
> function(beginning_of_word);
> else
> {
> if !(constraint1 & mask)
> iterate_rows_rec (constraint1, constraint0, mask>>1,
> beginning_of_word, function);
>
> if !(constraint0 & mask)
> iterate_rows_rec (constraint1, constraint0, mask>>1,
> mask | beginning_of_word, function);
> }
> }
>
> Note: the problem with this algorithm is that allocation is not distributed,
> so future allocations may take more time. But there exist workarounds, like
> the following modification to the algorithm:
>
> void
> iterate_rows_rec (word_t constraint1, word_t constraint0,
> word_t mask, word_t beginning_of_word,
> word_t pseudo_random_word, function)
> {
> if (mask == 0)
> function(beginning_of_word);
> else
> {
> if (pseudo_random_word & mask)
> {
> if !(constraint1 & mask)
> iterate_rows_rec (constraint1, constraint0, mask>>1,
> beginning_of_word, function);
>
> if !(constraint0 & mask)
> iterate_rows_rec (constraint1, constraint0, mask>>1,
> mask | beginning_of_word, function);
> }
> else
> {
> if !(constraint0 & mask)
> iterate_rows_rec (constraint1, constraint0, mask>>1,
> mask | beginning_of_word, function);
>
> if !(constraint1 & mask)
> iterate_rows_rec (constraint1, constraint0, mask>>1,
> beginning_of_word, function);
> }
> }
> }
>
> Pseudo-random word could be set so that we can indicate that some memory
> is more desirable than other. (Like memory for DMA transfers should be
> allocated
> less easily).
>
> Generation of the column mask
> -----------------------------
>
> We have to generate a column mask which will make us able to check one row at
> one time.
> If there is no constraints on the upper N-z cols, we can check 2^(N-z) frame
> at one time!
>
>
> If N-z = 5:
>
> column_mask_maker_0 = {0b01010101010101010101010101010101,
> 0b00110011001100110011001100110011,
> 0b00001111000011110000111100001111,
> 0b00000000111111110000000011111111,
> 0b00000000000000001111111111111111};
>
> column_mask_maker_1 = {0b10101010101010101010101010101010,
> 0b11001100110011001100110011001100,
> 0b11110000111100001111000011110000,
> 0b11111111000000001111111100000000,
> 0b11111111111111110000000000000000};
>
> column_mask = ~0;
>
> /* Only the first N-z bits are interresting. */
> constraint1 = constraint1>>(p+z);
> constraint0 = constraint0>>(p+z);
>
> for(i=4; i>=0; i--)
> {
> test_mask = 1 << i;
> if(constraint1 & test_mask)
> column_mask &= column_mask_maker_1[i];
> else if(constraint0 & test_mask)
> column_mask &= column_mask_maker_0[i];
> }
>
>
> Test of the row and allocation
> ------------------------------
>
> Returns the number of allocated frames in the row.
>
> int
> allocate_frames(word_t row_number, word_t col_mask, allocation_t result)
> {
> word_t allocated_frames = memory[row_number] & col_mask;
>
> /* Deallocates the allocated frames*/
> memory[row_number] &= ~col_mask;
>
> result.row_number = row_number;
> result.allocated_frames = allocated_frames;
>
> return NUMBER_OF_BITS (allocated_frames);
> }
>
>
> An allocation is a list of (row_numbers X allocated_frames):
>
> struct allocation
> {
> word_t row_number;
> word_t allocated_frames;
> struct allocation_unit *next;
> };
>
>
> Deallocation of frames
> ----------------------
>
> Thus, deallocation is trivial: We just have to re-set the right bits
> at the right place.
>
> void
> deallocate (allocation_t *allocation)
> {
> allocation_t unit = allocation;
> do
> {
> word_t row_number = unit.row_number;
> memory[row_number] |= unit.allocated_frames;
> } while(unit = unit.next);
> }
>
>
> Conclusion
> ==========
>
> Allocating memory should not be too long using this algorithm, provided that
> memory
> is distributed over the different rows. Deallocation is very fast.
>
> One possible variant would be not to have an array of bits, but to
> have an array (of size 2^z) of lists. But that would consume more
> memory and would be slowest, I think.
>
>
> Thanks,
>
> Matthieu
>
>
> _______________________________________________
> L4-hurd mailing list
> address@hidden
> http://lists.gnu.org/mailman/listinfo/l4-hurd
>
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: Mechanism to request physical memory with certain properties,
Michael D. Adams <=