[LKDP] updated scheduler/initial.tex
From: KDLinux
Subject: [LKDP] updated scheduler/initial.tex
Date: Mon, 12 Aug 2002 05:12:47 -0700 (PDT)
Hi all,
Here is the updated version of initial.tex of the process
scheduling document.
Any comments/suggestions are welcome.
-- KD.
www.kirandivekar.cjb.net
\chapter{Initialization}
\section{start\_kernel Function}
When a PC is powered on, the firmware initializes all the hardware
present in the system and makes sure each device is up and running. After this
initial hardware boot sequence, the kernel comes into the picture when control
is transferred to the \textit{start\_kernel} \index{start\_kernel} function.
This function is defined in the file \url{init/main.c}. Only the first CPU
calls start\_kernel(), while all others call the function
initialize\_secondary()\footnote{defined in \url{arch/i386/kernel/smpboot.c}}.
The start\_kernel function initializes all the operating system's subsystems
and the data structures used by the kernel. It calls sub-functions to
initialize IRQ handling, the process scheduler, the softirq system, kernel
timers, the signal mechanism and the SMP (symmetric multi-processing)
mechanism, while the initialize\_secondary function just copies the stack
pointer and EIP values from the Task State Segment (TSS).
\begin{verbatim}
lock_kernel();
printk(linux_banner);
setup_arch(&command_line);
setup_per_cpu_areas();
printk("Kernel command line: %s\n", saved_command_line);
parse_options(command_line);
trap_init();
init_IRQ();
sched_init();
softirq_init();
/* timer and memory initialization code */
fork_init();
signals_init();
smp_init();
\end{verbatim}
\subsection{Macro lock\_kernel}
The macro lock\_kernel is defined in the file
\url{include/linux/smp_lock.h}. On non-SMP systems it expands to an empty
macro, because there are no inter-CPU locks on single-CPU systems. On i386 SMP
systems, lock\_kernel is an inline function defined in
\url{include/asm-i386/smplock.h}:
\begin{verbatim}
extern __inline__ void lock_kernel(void)
{
        if (!++current->lock_depth)
                spin_lock(&kernel_flag);
}
\end{verbatim}
So on a non-SMP system, the macro expands to 'do\{\} while(0)'
and gets optimised away.
\subsection{Functions trap\_init, init\_IRQ}
The functions trap\_init and init\_IRQ are architecture
dependent and perform the initialization of the IRQ hardware. They are
defined in the architecture-dependent section of the kernel code. Let us look
at them one by one.
Function trap\_init() \label{init:trap_init}
\textit{File: }\url{arch/i386/kernel/traps.c}\\
This function initializes the IDT with an exception
handler function for each recognized exception. This job is accomplished
through the set\_trap\_gate and set\_system\_gate macros. The x86
microprocessors issue 20 different exceptions (0 - 19), and the kernel must
provide a dedicated exception handler for each exception type. The table in
~\ref{appendix 1} shows each exception and its corresponding exception handler
along with the signals sent by the exception handlers.
\begin{verbatim}
set_trap_gate(0,&divide_error);
set_trap_gate(1,&debug);
set_intr_gate(2,&nmi);
set_system_gate(3,&int3);    /* int3-5 can be called from all */
set_system_gate(4,&overflow);
set_system_gate(5,&bounds);
set_trap_gate(6,&invalid_op);
.
.
.
set_trap_gate(19,&simd_coprocessor_error);
set_call_gate(&default_ldt[0],lcall7);
set_call_gate(&default_ldt[4],lcall27);
\end{verbatim}
All these macros map to the \_set\_gate macro with appropriate
parameters. Please refer to ~\ref{appendix 2} for details about the different
types of gates. The macro \_set\_gate is shown below:
\begin{verbatim}
#define _set_gate(gate_addr,type,dpl,addr) \
do { \
  int __d0, __d1; \
  __asm__ __volatile__ ("movw %%dx,%%ax\n\t" \
        "movw %4,%%dx\n\t" \
        "movl %%eax,%0\n\t" \
        "movl %%edx,%1" \
        :"=m" (*((long *) (gate_addr))), \
         "=m" (*(1+(long *) (gate_addr))), "=&a" (__d0), "=&d" (__d1) \
        :"i" ((short) (0x8000+(dpl<<13)+(type<<8))), \
         "3" ((char *) (addr)),"2" (__KERNEL_CS << 16)); \
} while (0)
\end{verbatim}
% TODO Macro explaination
% Setup TSS and LDT in GDT for each task
Function init\_IRQ() \label{init:init_IRQ}
\textit{File: }\url{arch/i386/kernel/i8259.c}\\
This function sets up all interrupt vectors and
interrupt gates. The Linux kernel uses vectors 0 to 31 for exceptions and
nonmaskable interrupts, while the remaining vectors are software interrupts.
Precisely, vectors 32 (0x20) to 47 (0x2f) are maskable interrupts caused by
IRQs, while the remaining vectors, ranging from 48 to 255, may be used to
identify software interrupts. Of these, Linux uses only vector 128 (0x80),
which implements system calls [SYSCALL\_VECTOR].
\begin{verbatim}
for (i = 0; i < NR_IRQS; i++) {
        /* NR_IRQS = 224,
           as per include/asm-i386/irq.h */
        /* FIRST_EXTERNAL_VECTOR = 0x20 i.e. 32,
           as per include/asm-i386/hw_irq.h */
        int vector = FIRST_EXTERNAL_VECTOR + i;
        if (vector != SYSCALL_VECTOR)
                set_intr_gate(vector, interrupt[i]);
}
set_intr_gate(FIRST_DEVICE_VECTOR, interrupt[0]);
set_intr_gate(RESCHEDULE_VECTOR, reschedule_interrupt);
set_intr_gate(INVALIDATE_TLB_VECTOR, invalidate_interrupt);
set_intr_gate(CALL_FUNCTION_VECTOR, call_function_interrupt);
\end{verbatim}
The interrupt gates corresponding to FIRST\_DEVICE\_VECTOR,
RESCHEDULE\_VECTOR, INVALIDATE\_TLB\_VECTOR and CALL\_FUNCTION\_VECTOR are also
set. These are special IRQ vectors used by the SMP architecture, generally
ranging from 0xf0 to 0xff. Refer to \url{include/asm-i386/hw_irq.h} for more
details.
\subsection{Function sched\_init}
A process is the basic entity in any Unix-based system. In a
multitasking environment, a number of processes execute on one or more
CPUs, and each process gets a fair chance to run on a CPU depending on its
characteristics. This allocation is done by a special kernel component known
as the "scheduler". The process scheduler is initialized by calling the
function sched\_init, defined in \url{kernel/sched.c}.
\subsubsection{scheduler data structure}
The runqueue data structure is explained here in order to
understand the initialization code. The scheduler uses an array of runqueues
\index{runqueues} as its basic data structure, defined in \url{kernel/sched.c}.
NR\_CPUS\index{NR\_CPUS}\footnote{defined in \url{include/linux/threads.h}}
represents the maximum number of CPUs supported by the system. Its value is 32
in SMP mode and 1 in non-SMP mode.
\begin{verbatim}
struct runqueue {
        spinlock_t lock;
        unsigned long nr_running, nr_switches, expired_timestamp;
        signed long nr_uninterruptible;
        task_t *curr, *idle;
        prio_array_t *active, *expired, arrays[2];
        int prev_nr_running[NR_CPUS];
        task_t *migration_thread;
        list_t migration_queue;
} ____cacheline_aligned;

static struct runqueue runqueues[NR_CPUS] __cacheline_aligned;
\end{verbatim}
The description of the elements of the above structure follows:
\begin{description}
\item[lock] Spinlock used to serialize access to the runqueue.
\item[nr\_running] Total number of runnable processes, i.e. those in
the TASK\_RUNNING state.
\item[task\_t curr, idle] The current and idle tasks associated with this
runqueue.\footnote{task\_t is a typedef of task\_struct; see
\url{include/linux/sched.h}}
\item[prio\_array\_t arrays] Priority structures, each containing a priority
bitmap of size BITMAP\_SIZE along with an array of linked lists (queues) of
size MAX\_PRIO.
\item[migration\_thread] \index{migration\_thread} Migration thread
associated with the runqueue. Refer to section ~\ref{psched:structs} for more
details.
\item[migration\_queue] \index{migration\_queue} Migration queue
associated with the runqueue. Refer to section ~\ref{psched:structs} for more
details.
\end{description}
\par Each process has a process descriptor associated with it.
This process information is stored in struct task\_struct, defined in
\url{include/linux/sched.h}. All process descriptors are linked together by
the process list, while the runqueue lists link together the process
descriptors of all runnable processes. In both cases, the init\_task process
descriptor acts as the list header.
Function sched\_init() \label{init:sched_init}
\par The function sched\_init initializes the runqueue data
structure for each CPU. Each runqueue contains two copies of the priority
array structure, which are assigned to the active and expired pointers. The
INIT\_LIST\_HEAD macro initializes the linked lists (queues) of each priority
structure, MAX\_PRIO in number (a value greater than any user task priority;
see \url{include/linux/sched.h} for details), and the priority bitmap is also
cleared.
\begin{verbatim}
for (i = 0; i < NR_CPUS; i++) {
        prio_array_t *array;

        rq = cpu_rq(i);
        rq->active = rq->arrays;
        rq->expired = rq->arrays + 1;
        spin_lock_init(&rq->lock);
        INIT_LIST_HEAD(&rq->migration_queue);

        for (j = 0; j < 2; j++) {
                array = rq->arrays + j;
                for (k = 0; k < MAX_PRIO; k++) {
                        INIT_LIST_HEAD(array->queue + k);
                        __clear_bit(k, array->bitmap);
                        /* refer to include/asm-i386/bitops.h */
                }
                /* delimiter for bitsearch */
                __set_bit(MAX_PRIO, array->bitmap);
        }
}
\end{verbatim}
The sched\_init function then sets up the runqueue of the
current CPU, pointing its "curr" and "idle" fields at the current task, and
wakes that task up. Then it initializes all the timer vectors by calling the
function init\_timervecs\footnote{\url{kernel/timer.c}} and initializes the
bottom halves associated with the task queue (TQUEUE\_BH) and the immediate
queue (IMMEDIATE\_BH). The function wake\_up\_process is explained in
~\ref{Scheduler Chapter}. Also, refer to section ~\ref{int:bh} for more
information about \texttt{bottom halves}.
\begin{verbatim}
rq = this_rq();
rq->curr = current;
rq->idle = current;
wake_up_process(current);
init_timervecs();
init_bh(TQUEUE_BH, tqueue_bh);
init_bh(IMMEDIATE_BH, immediate_bh);
\end{verbatim}
\subsection{Function softirq\_init}
The concepts of tasklets and softirqs were introduced in kernel
version 2.4, and the primary tasklet\_struct is defined in
\url{include/linux/interrupt.h}. Don't forget to read the properties of
tasklets in that include file.
Softirqs were introduced to take advantage of the multiple
processors in an SMP system, allowing each CPU to run a softirq. Softirqs are
thus a multithreaded analogue of bottom halves\index{bottom half} which can run
on multiple CPUs at once. The deprecated bottom halves are reimplemented using
softirqs. Both bottom halves and softirqs are statically registered. The 2.4
Linux kernel also introduced tasklets, which are dynamically registrable
softirqs that are guaranteed to run on only one CPU at a time. Refer to section
~\ref{int:tasklet} for more information about \texttt{Tasklets}.
Function softirq\_init() \label{init:softirq_init}
\textit{File: }\url{kernel/softirq.c}\\
The softirq\_init function initializes all the bottom-half tasklets
\index{tasklets} by calling the tasklet\_init function. A global array, struct
tasklet\_struct bh\_task\_vec[32], holds these 32 tasklets, which implement the
32 legacy bottom halves. The function tasklet\_init associates the function
bh\_action as the handler for each of these 32 tasklets.
\par The function open\_softirq initializes the softirqs
related to TASKLET\_SOFTIRQ and HI\_SOFTIRQ\footnote{defined in
\url{include/linux/interrupt.h}}. The function pointers tasklet\_action and
tasklet\_hi\_action are stored in an array of softirq actions (each consisting
of a function and its data).
\begin{verbatim}
static struct softirq_action softirq_vec[32]
__cacheline_aligned_in_smp;
\end{verbatim}
\begin{verbatim}
for (i = 0; i < 32; i++)
        tasklet_init(bh_task_vec + i, bh_action, i);

open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
\end{verbatim}
\subsection{Function fork\_init}
Function fork\_init() \label{init:fork_init}
\textit{File: }\url{kernel/fork.c}\\
The fork\_init function calls
kmem\_cache\_create\footnote{Refer to \url{mm/slab.c} for more details of slab
and memory allocation} to initialize the slab cache for task\_structs.
It also determines the maximum number of threads depending upon the physical
memory available. The name parameter passed is \texttt{task\_struct}, which can
be found in the file \url{/proc/slabinfo}. The max\_threads value is used to
set the resource limits for the init\_task.\footnote{The init\_task is defined
in a special file containing only data, \url{arch/i386/kernel/init_task.c},
which calls the macro INIT\_TASK defined in \url{include/linux/init_task.h}}
\begin{verbatim}
task_struct_cachep =
        kmem_cache_create("task_struct",
                sizeof(struct task_struct), 0,
                SLAB_HWCACHE_ALIGN, NULL, NULL);
if (!task_struct_cachep)
        panic("fork_init(): cannot create task_struct SLAB cache");

max_threads = mempages / (THREAD_SIZE/PAGE_SIZE) / 8;

init_task.rlim[RLIMIT_NPROC].rlim_cur = max_threads/2;
init_task.rlim[RLIMIT_NPROC].rlim_max = max_threads/2;
\end{verbatim}
\begin{verbatim}
/* Please note:
 * the value of mempages is set in the function setup_arch()
 * in arch/i386/kernel/setup.c
 */
#define THREAD_SIZE (2*PAGE_SIZE) /* include/asm-i386/thread_info.h */
/* Effectively, max_threads = mempages / 16 */
\end{verbatim}
\subsection{Function signals\_init}
The signals\_init function also calls kmem\_cache\_create to
initialize the slab cache related to signals. The code can be found in
\url{kernel/signal.c}. The name parameter passed is \texttt{sigqueue}, which
can be found in the file \url{/proc/slabinfo}.
\begin{verbatim}
sigqueue_cachep =
        kmem_cache_create("sigqueue",
                sizeof(struct sigqueue),
                __alignof__(struct sigqueue),
                SIG_SLAB_DEBUG, NULL, NULL);
if (!sigqueue_cachep)
        panic("signals_init(): cannot create sigqueue SLAB cache");
\end{verbatim}
\subsection{Function smp\_init}
The smp\_init function is an architecture-dependent function used
to perform SMP initialization of all CPUs. Its code invokes the functions
smp\_boot\_cpus and do\_boot\_cpu from \url{arch/i386/kernel/smpboot.c}. These
functions perform basic SMP-related initialization. Refer to section
\ref{smp:init} for more details on SMP initialization and scheduling.
\begin{verbatim}
smp_boot_cpus();
smp_threads_ready=1;
smp_commence();
\end{verbatim}
\par After completing all this initialization, what does
the kernel do? Correct, it sits idle. The function
\textit{cpu\_idle}\label{init:idle} is architecture dependent and is defined in
\url{arch/i386/kernel/process.c}. The kernel waits for some process to be
scheduled using the \textit{schedule()} function.
\begin{verbatim}
while (1) {
        while (!need_resched())
                idle();
        schedule();
}
\end{verbatim}