Title: Access to video memory
1Access to video memory
- We create a Linux device-driver that gives
applications access to our graphics frame-buffer
2The role of a device-driver
A device-driver is a software module that
controls a hardware device in response to OS
kernel requests relayed, often, from an
application
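[Diagram: in user space, the user application makes a call into the standard runtime libraries, which issue a syscall into kernel space; the Operating System kernel calls the device-driver module, which reaches the hardware device through in/out instructions and i/o memory, alongside RAM; control comes back via ret and sysret]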
3Raster Display Technology
The graphics screen is a two-dimensional array of
picture elements (pixels)
These pixels are redrawn sequentially,
left-to-right, by rows from top to bottom
Each pixel's color is an individually
programmable mix of red, green, and blue
4Special dual-ported memory
[Diagram: the CPU is connected to 2048-MB of RAM and to 16-MB of VRAM; the VRAM is also connected to the CRT]
5How much VRAM is needed?
- This depends on (1) the total number of pixels, and on (2) the number of bits-per-pixel
- The total number of pixels is determined by the screen's width and height (measured in pixels)
- Example: when our screen-resolution is set to 1280-by-960, we are seeing 1,228,800 pixels
- The number of bits-per-pixel (color depth) is a programmable parameter (varies from 1 to 32)
- Certain types of applications also need to use extra VRAM (for multiple displays, or for special effects like computer game animations)
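The arithmetic above can be sketched in C. This is only an illustration and is not taken from vram.c: the function name vram_bytes_needed is hypothetical, and the sketch assumes screen-rows are packed with no padding between them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes of VRAM needed to hold one screen image,
   assuming rows are stored back-to-back with no padding. */
size_t vram_bytes_needed( size_t width, size_t height, unsigned bits_per_pixel )
{
    assert( bits_per_pixel >= 1 && bits_per_pixel <= 32 );
    size_t total_bits = width * height * bits_per_pixel;
    return ( total_bits + 7 ) / 8;   /* round up to whole bytes */
}
```

For the 1280-by-960 example at 32 bits-per-pixel, vram_bytes_needed( 1280, 960, 32 ) gives 4,915,200 bytes, which fits comfortably within 16-MB of VRAM.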
6How truecolor works
[Diagram: one pixel stored as a 32-bit longword, with blue (B) in bits 0-7, green (G) in bits 8-15, red (R) in bits 16-23, and alpha in bits 24-31]
The intensity of each color-component within a
pixel is an 8-bit value
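As a small illustration, a truecolor longword can be assembled and taken apart with shifts and masks. The function names here are hypothetical, and the bit positions (blue lowest, then green, red, alpha) are assumed from the slide's diagram.

```c
#include <stdint.h>

/* Pack four 8-bit components into one 32-bit truecolor longword:
   blue in bits 0-7, green in bits 8-15, red in bits 16-23,
   alpha in bits 24-31 (layout assumed from the diagram). */
uint32_t pack_pixel( uint8_t alpha, uint8_t red, uint8_t green, uint8_t blue )
{
    return ( (uint32_t)alpha << 24 ) | ( (uint32_t)red << 16 )
         | ( (uint32_t)green << 8 )  |   (uint32_t)blue;
}

/* Extract the individual 8-bit intensities again */
uint8_t red_of( uint32_t pixel )   { return ( pixel >> 16 ) & 0xFF; }
uint8_t green_of( uint32_t pixel ) { return ( pixel >> 8 ) & 0xFF; }
uint8_t blue_of( uint32_t pixel )  { return pixel & 0xFF; }
```

For example, pack_pixel( 0, 0xFF, 0, 0xFF ) yields 0x00FF00FF, a mix of full red and full blue.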
7x86 uses little-endian order
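Truecolor graphics-modes use 4-bytes per picture-element
[Diagram: VRAM byte-offsets 0-3, 4-7, 8-10, ... hold successive pixels of the Video Screen; within each 4-byte pixel the first three bytes are B, G, R, in that order]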
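The byte ordering can be made explicit with a short sketch. Both function names are hypothetical, and pixel_offset assumes that screen-rows follow one another in VRAM with no padding, which the slides do not state.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-offset in VRAM of the pixel at column x, row y, assuming
   4 bytes per pixel and rows stored back-to-back (an assumption). */
size_t pixel_offset( size_t x, size_t y, size_t screen_width )
{
    return ( y * screen_width + x ) * 4;
}

/* Store a truecolor longword byte-by-byte in little-endian order,
   the same order an x86 CPU uses for a 32-bit store. */
void store_pixel_le( uint8_t *vram, size_t offset, uint32_t pixel )
{
    vram[ offset + 0 ] = pixel & 0xFF;             /* B */
    vram[ offset + 1 ] = ( pixel >> 8 ) & 0xFF;    /* G */
    vram[ offset + 2 ] = ( pixel >> 16 ) & 0xFF;   /* R */
    vram[ offset + 3 ] = ( pixel >> 24 ) & 0xFF;   /* alpha */
}
```

Storing 0x00112233 at offset 4 leaves the bytes 0x33, 0x22, 0x11, 0x00 at offsets 4 through 7, so blue comes first in memory.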
8Some operating system issues
- Linux is a protected-mode operating system
- I/O devices normally are not directly accessible
- Linux on x86 platforms uses virtual memory
- Privileged software must map the VRAM
- A device-driver module is needed: vram.c
- We can compile it using: mmake vram
- Device-node: mknod /dev/vram c 98 0
- Make it writable: chmod a+w /dev/vram
9Our vram.c module
- It's a character-mode Linux device-driver
- It implements four device-file methods:
- read() lets a program read from video memory
- write() lets a program write to video memory
- llseek() lets a program move the file's pointer
- mmap() lets a program map vram to user-space
- It also implements a pseudo-file that lets users view the RADEON X300 graphics controller's PCI Configuration Space parameter-values
- cat /proc/vram
10What is PCI?
- It's an acronym for Peripheral Component Interconnect and refers to a collection of industry standards for devices used in PCs
- An Intel-sponsored initiative (from 1992-9) having several ambitious goals:
- Reduce diversity inherent in legacy PC devices
- Improve speed and efficiency of data-transfers
- Eliminate (or reduce) platform dependencies
- Simplify adding/removing peripheral adapters
- Lower PC's total consumption of electrical power
11PCI Configuration Space
A non-volatile parameter-storage area for each
PCI device-function
- PCI Configuration Space Header: 16 doublewords (fixed format)
- PCI Configuration Space Body: 48 doublewords (variable format)
- Total: 64 doublewords
12Example Header Type 0
16 doublewords (fields in each row are listed from bit 31 down to bit 0)

Dword   Fields
0       Device ID | Vendor ID
1       Status Register | Command Register
2       Class Code (Class/SubClass/ProgIF) | Revision ID
3       BIST | Header Type | Latency Timer | Cache Line Size
4       Base Address 0
5       Base Address 1
6       Base Address 2
7       Base Address 3
8       Base Address 4
9       Base Address 5
10      CardBus CIS Pointer
11      Subsystem Device ID | Subsystem Vendor ID
12      Expansion ROM Base Address
13      reserved | capabilities pointer
14      reserved
15      Maximum Latency | Minimum Grant | Interrupt Pin | Interrupt Line
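One way to picture the Type 0 header is as a C structure of exactly 16 doublewords. This is a sketch for illustration only: the structure and its field names are hypothetical (they are not the Linux kernel's definitions), and it assumes a little-endian host such as x86, where the low-order field of each doubleword comes first in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical picture of a Type 0 header on a little-endian host */
struct pci_header_type0 {
    uint16_t vendor_id, device_id;                  /* dword 0 */
    uint16_t command, status;                       /* dword 1 */
    uint8_t  revision_id, prog_if, subclass, class_code;               /* dword 2 */
    uint8_t  cache_line_size, latency_timer, header_type, bist;        /* dword 3 */
    uint32_t base_address[ 6 ];                     /* dwords 4-9 */
    uint32_t cardbus_cis_pointer;                   /* dword 10 */
    uint16_t subsystem_vendor_id, subsystem_device_id;                 /* dword 11 */
    uint32_t expansion_rom_base;                    /* dword 12 */
    uint8_t  capabilities_pointer, reserved1[ 3 ];  /* dword 13 */
    uint32_t reserved2;                             /* dword 14 */
    uint8_t  interrupt_line, interrupt_pin, minimum_grant, maximum_latency; /* dword 15 */
};

/* Confirm the structure really occupies 16 doublewords (64 bytes) */
void check_header_layout( void )
{
    assert( sizeof( struct pci_header_type0 ) == 64 );
    assert( offsetof( struct pci_header_type0, base_address ) == 16 );
    assert( offsetof( struct pci_header_type0, interrupt_line ) == 60 );
}
```

Every field sits on its natural alignment, so a typical compiler inserts no padding and the offsets match the doubleword numbers in the table.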
13Examples of VENDOR-IDs
- 0x8086 Intel Corporation
- 0x1022 Advanced Micro Devices, Inc
- 0x1002 ATI Technologies, Inc
- 0x10EC RealTek, Incorporated
- 0x10DE Nvidia Corporation
- 0x10B7 3Com Corporation
- 0x101C Western Digital, Inc
- 0x1014 IBM Corporation
- 0x0E11 Compaq Corporation
- 0x1057 Motorola Corporation
- 0x106B Apple Computers, Inc
- 0x5333 S3, Incorporated
14Examples of DEVICE-IDs
- 0x5347 ATI RAGE128 SG
- 0x4C58 ATI RADEON LX
- 0x5950 ATI RS480
- 0x436E ATI IXP300 SATA
- 0x438C ATI IXP600 IDE
- 0x5B60 ATI Radeon X300
- See this Linux header-file for lots more examples:
- </usr/src/linux/include/linux/pci_ids.h>
15Defined PCI Class Codes
- 0x00 Legacy Device (i.e., built before class-codes were defined)
- 0x01 Mass Storage controller
- 0x02 Network controller
- 0x03 Display controller
- 0x04 Multimedia device
- 0x05 Memory Controller
- 0x06 Bridge device
- 0x07 Simple Communications controller
- 0x08 Base System peripherals
- 0x09 Input device
- 0x0A Docking stations
- 0x0B Processors
- 0x0C Serial Bus controllers
- 0x0D Wireless controllers
- 0x0E Intelligent I/O controllers
- 0x0F Encryption/Decryption controllers
- 0x10 Satellite Communications controllers
- 0x11 Data Acquisition and Signal Processing
controllers
16Example of Sub-Class Codes
- Class Code 0x01 Mass Storage controller
- 0x00 SCSI controller
- 0x01 IDE controller
- 0x02 Floppy Disk controller
- 0x03 IPI controller
- 0x04 RAID controller
- 0x80 Other Mass Storage controller
17Example of Sub-Class Codes
- Class Code 0x02 Network controller
- 0x00 Ethernet controller
- 0x01 Token Ring controller
- 0x02 FDDI controller
- 0x03 ATM controller
- 0x04 ISDN controller
- 0x80 Other Network controller
18Example of Sub-Class codes
- Class Code 0x03 Display Controller
- 0x00 VGA-compatible controller
- 0x01 XGA controller
- 0x02 3D controller
- 0x80 Other display controller
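The class and sub-class codes occupy the upper bytes of configuration doubleword 2, so they can be separated with shifts and masks. The names below are hypothetical and the sketch is only meant to show the decoding step.

```c
#include <stdint.h>

/* The four byte-sized fields held in configuration doubleword 2 */
struct pci_class_info {
    uint8_t class_code, subclass, prog_if, revision_id;
};

/* Split doubleword 2 into Class, SubClass, ProgIF and Revision ID */
struct pci_class_info decode_class_dword( uint32_t dword2 )
{
    struct pci_class_info info;
    info.class_code  = ( dword2 >> 24 ) & 0xFF;
    info.subclass    = ( dword2 >> 16 ) & 0xFF;
    info.prog_if     = ( dword2 >> 8 ) & 0xFF;
    info.revision_id = dword2 & 0xFF;
    return info;
}

/* Class 0x03 (Display controller) with sub-class 0x00 (VGA-compatible) */
int is_vga_compatible( uint32_t dword2 )
{
    struct pci_class_info info = decode_class_dword( dword2 );
    return info.class_code == 0x03 && info.subclass == 0x00;
}
```

A doubleword of 0x03000000, for instance, decodes to a VGA-compatible display controller, while 0x02000010 decodes to an Ethernet controller.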
19Hardware details may differ
- Graphics controllers use vendor-specific mechanisms to perform similar operations
- There's a common core of compatibility with IBM's VGA (Video Graphics Array) developed in the mid-1980s, but since IBM's loss of market dominance, each manufacturer has added enhancements which employ incompatible programming interfaces: you need a vendor's manual!
20The frame-buffer
- Today's PCI graphics systems all provide a dedicated amount of display memory to control the screen-image's pixel-coloring
- But how much memory will vary with price
- And its location within the CPU's physical address-space can't be predicted because it depends upon what other PCI devices are installed (and mapped) during startup
21The base address fields
- The PCI Configuration Header has several so-called Base Address fields, and vendors use one of these to hold the frame-buffer's starting address and to indicate how much vram the video controller can actually use
- The Linux kernel provides driver-writers with some convenient functions for getting the location and size of the frame-buffer
22Radeon uses Base Address 0
- Our vram.c module's initialization routine employs these kernel helper-functions:

    #include <linux/pci.h>

    struct pci_dev *devp;   // for a variable that will point to a kernel-structure

    // get a pointer to the PCI device's Linux data-structure
    devp = pci_get_device( VENDOR_ID, DEVICE_ID, NULL );
    if ( !devp ) return -ENODEV;   // device is not present

    // get starting address and length for memory-resource 0
    vram_base = pci_resource_start( devp, 0 );
    vram_size = pci_resource_len( devp, 0 );
23Reading from vram
- You can use our fileview utility to see the current contents of the video frame-buffer
- fileview /dev/vram
- Our vram.c driver's read() method gets invoked when an application-program attempts to read from the /dev/vram device-file
- The read-method is implemented by our driver using ioremap() (and iounmap()) to temporarily map a 4KB-page of physical vram to the kernel's virtual address-space
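Mapping only one 4KB-page at a time suggests that a single read() call can transfer, at most, the bytes remaining in the current page. The slides do not show the read() code, so the following is only a user-space sketch of that bookkeeping under this assumption; the helper name bytes_this_read and the constant PAGE_BYTES are hypothetical.

```c
#include <stddef.h>

#define PAGE_BYTES 4096UL   /* the 4KB page-size mentioned above */

/* Hypothetical helper: how many bytes one read() call may copy if only
   the single page containing file-position 'pos' is mapped. */
size_t bytes_this_read( unsigned long pos, size_t count, unsigned long vram_size )
{
    if ( pos >= vram_size ) return 0;            /* at or past end of vram */

    unsigned long in_page = pos % PAGE_BYTES;    /* offset within the page */
    unsigned long room = PAGE_BYTES - in_page;   /* bytes left in this page */

    if ( count > room ) count = room;                         /* stay inside the page */
    if ( count > vram_size - pos ) count = vram_size - pos;   /* stay inside vram */
    return count;
}
```

Under this scheme an application that asks for more than the page holds simply receives a short count and calls read() again, which is ordinary behavior for a UNIX device-file.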
24I/O memcpy() functions
- Linux provides a platform-independent way to do copying from an i/o-device's memory into an application's buffer (or vice-versa)
- A read copies from vram to a user's buffer:
    memcpy_fromio( buf, vaddr, len );
- A write copies to vram from a user's buffer:
    memcpy_toio( vaddr, buf, len );
25mmap()
- This is a standard UNIX system-call that lets an application map a file into its virtual address-space, where it can then treat the file as if it were an ordinary array
- See the man-page: man mmap
- This same system-call can also work on a device-file if that device's driver provided mmap() among its file-operations
26The user-role
- In the application-program, six arguments get passed to the mmap() library-function:

    void *mmap( void *baseaddress,
                size_t memorysize,
                int accessattributes,
                int flags,
                int filehandle,
                off_t offset );
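A minimal sketch of the application's side of the call follows. The wrapper name map_pixels is hypothetical; it simply asks for a shared, readable and writable mapping that begins at offset 0 of an already-opened file.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Map 'length' bytes of the open file 'fd', starting at offset 0, so the
   program can treat them as an array of 32-bit pixels.
   Returns NULL if the mapping could not be created. */
uint32_t *map_pixels( int fd, size_t length )
{
    void *p = mmap( NULL,                    /* let the kernel pick the address */
                    length,                  /* size of the region */
                    PROT_READ | PROT_WRITE,  /* access attributes */
                    MAP_SHARED,              /* writes reach the file itself */
                    fd,                      /* file handle */
                    0 );                     /* offset within the file */
    return ( p == MAP_FAILED ) ? NULL : (uint32_t *)p;
}
```

With the device-node described earlier, an application would presumably pass a descriptor obtained from open( "/dev/vram", O_RDWR ) and later release the region with munmap().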
27The driver-role
- In the kernel, those six arguments will get validated and processed, then the driver's mmap() callback-function will be invoked to supply missing information and perform further sanity-checks and do appropriate page-mapping actions:

    int mmap( struct file *file,
              struct vm_area_struct *vma );
28Our driver's code
    int mmap( struct file *file, struct vm_area_struct *vma )
    {
        // extract the parameters we will need from the vm_area_struct
        unsigned long region_length = vma->vm_end - vma->vm_start;
        unsigned long region_origin = vma->vm_pgoff * PAGE_SIZE;
        unsigned long physical_addr = fb_base + region_origin;
        unsigned long user_virtaddr = vma->vm_start;

        // sanity check: mapped region cannot extend past end of vram
        if ( region_origin + region_length > fb_size ) return -EINVAL;

        // tell the kernel not to try swapping out this region to the disk
        vma->vm_flags |= VM_RESERVED;

        // tell the kernel to exclude this region from any core dumps
        vma->vm_flags |= VM_IO;

29Driver's code continued
        // invoke a helper-function that will set up the page-table entries
        if ( remap_pfn_range( vma, user_virtaddr, physical_addr >> 12,
                              region_length, vma->vm_page_prot ) )
            return -EAGAIN;

        return 0;   // SUCCESS
    }
30Demo rotation.cpp
- This application-program will demonstrate use of our vram.c device-driver's read(), write() and llseek() methods (i.e., device-file operations)
- It will perform a rotation of the color-components (R,G,B) in every displayed truecolor pixel: R → G, G → B, B → R
- After running it 3 times, the screen will look normal again
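The per-pixel step of such a rotation might look like the sketch below. It is not the code of rotation.cpp; the function names are hypothetical, and it assumes the rotation moves the red value into green, green into blue, and blue into red, leaving the alpha byte alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate the color-components of one truecolor pixel: the old red value
   becomes green, old green becomes blue, old blue becomes red. */
uint32_t rotate_components( uint32_t pixel )
{
    uint32_t alpha = pixel & 0xFF000000;
    uint32_t r = ( pixel >> 16 ) & 0xFF;
    uint32_t g = ( pixel >> 8 ) & 0xFF;
    uint32_t b = pixel & 0xFF;
    return alpha | ( b << 16 ) | ( r << 8 ) | g;
}

/* Apply the rotation to a buffer of pixels previously read from vram */
void rotate_buffer( uint32_t *pixels, size_t count )
{
    for ( size_t i = 0; i < count; i++ )
        pixels[ i ] = rotate_components( pixels[ i ] );
}
```

Because there are three components, applying rotate_components() three times returns every pixel to its original value, which is why the screen looks normal again after the third run.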
31Demo inherit.cpp
- This application-program will demonstrate use of the mmap() method in our driver, and the fact that memory-mappings which a parent-process creates will be inherited by a child-process
- You will see a rectangular purple border drawn on your display, provided the program-parameters match your screen
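The inheritance of a mapping across fork() can be sketched as follows. This is not the code of inherit.cpp: the function name is hypothetical, only one 4KB page is mapped, and any file that supports a shared mapping can stand in for /dev/vram.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map one page of 'fd', let a child-process store 'color' through the
   mapping it inherited, and return the value the parent then sees
   (or -1 if the mapping or the fork failed). */
long child_writes_parent_reads( int fd, uint32_t color )
{
    size_t length = 4096;
    uint32_t *pixels = mmap( NULL, length, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0 );
    if ( pixels == MAP_FAILED ) return -1;

    pid_t pid = fork();
    if ( pid < 0 ) { munmap( pixels, length ); return -1; }
    if ( pid == 0 ) {
        pixels[ 0 ] = color;   /* child uses the parent's mapping as-is */
        _exit( 0 );
    }

    waitpid( pid, NULL, 0 );   /* wait until the child has finished */
    long seen = (long)pixels[ 0 ];
    munmap( pixels, length );
    return seen;
}
```

The child never calls mmap() itself; it writes through the pointer created before the fork, and because the mapping is shared the parent observes the change.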
32In-class exercise
- Can you adapt the ideas in inherit.cpp to
create a program (named backward.cpp) that will
reverse the ordering of the pixels in each
screen-row?