Realtek rtl8139 Network Interface Card

Introduction

This post will explain all I know about developing a driver for the Realtek rtl8139 network adapter. It is a network interface card that is capable of 10 / 100 Mbit/s network speeds. It is emulated by qemu which makes it a prime target for learning about driver development.

To enable the rtl8139 in qemu, use

qemu-system-i386 -net nic,model=rtl8139 -fda <your_image>

The osdev wiki says:

If you find your driver suddenly freezes and stops receiving interrupts and you’re using kvm/qemu. Try the option -no-kvm-irqchip

Links

Initialization Process

In order to initialize the network card, there are several settings to configure. There are two types of locations where settings have to be applied:

  1. PCI Configuration Space
  2. ioaddr
Finding the Device in the System

The rtl8139 is connected to the PCI bus. With PCI, every device is identified by a pair of IDs. The pair consists of a vendor ID and a device ID. The rtl8139 has a vendor ID of 0x10ec and a device ID of 0x8139. You can check any pair of vendor ID and device ID on https://www.pcilookup.com or http://pciengine.com/.

First, you have to check if the system has the rtl8139 build into it (or if qemu does emulate the card) by listing all the PCI devices of the system and searching for the vendor and device ID. PCI device listing is described here.

PCI Configuration Space

PCI (Peripheral Component Interconnect) is a way to configure hardware on a pure software basis. Extension cards you put into your PC via a PCI slot, are part of the PCI system.

A PCI system consists of up to 256 busses, each bus can contain up to 32 devices, every device can be a package of up to 8 functions. That means a PCI extension card can act as up to 8 devices when plugged into the PC. Each of these devices will get it’s own function via PCI.

A PC usually only contains a single PCI bus, so instead of using 256 busses it only contains 1.

One of these function in one of the devices on one of the busses will be the RTL 8139 but it is not predefined which one it is. That means the tuple of (bus, device, function) is unknown and the driver has to find the device.

In order to find the touple, there are several ways to do it. The simplest way is to iterate over all busses, devices and functions. On each function, using the current touple (bus, device, function) it is possible to read data from the device at those coordinates. If the touple does not point to an existing device, contine with the next touple. If there is a device at the touple, it is possible to read the so called PCI configuration space of that device. The configuration space contains several registers on the PCI Hardware. Two important registers are the vendor and device registers.

Here is a general depiction of the PCI configuration space of PCI card. You can see that the first four byte contain the vendor Id and the device Id.

Reading and writing the PCI configuration space is done via ports. A port is a memory address that points to hardware instead of a memory cell. Ports can be used to write and read data and to communicate with hardware instead of writing and reading to memory. First you have to specify the address you want to manipulate by writing data to the configuration address 0xCF8. Once that location is configured, you can read or write data by reading and writing to the configuration data address 0xCFC.

The RTL 8319 (and every PCI card) has specific values for vendor and device. Knowing these values, the card can be identified and the touple (bus, device, function) can be found.

Having the touple (bus, device, function) the driver can start with the configuration.

The code to iterate and find the RTL 8139 is listed here:

const u32int PCI_ENABLE_BIT = 0x80000000;
const u32int PCI_CONFIG_ADDRESS = 0xCF8;
const u32int PCI_CONFIG_DATA = 0xCFC;

// func - 0-7
// slot - 0-31
// bus - 0-255
//
// described here: https://en.wikipedia.org/wiki/PCI_configuration_space under
// the section "software implementation"
// parameter pcireg: 0 will read the first 32bit dword of the pci control space
// which is DeviceID and Vendor ID
// pcireg = 1 will read the second 32bit dword which is status and command
// and so on...
u32int r_pci_32(u8int bus, u8int device, u8int func, u8int pcireg) {

  // compute the index
  //
  // pcireg is left shifted twice to multiply it by 4 because each register
  // is 4 byte long (32 bit registers)
  u32int index = PCI_ENABLE_BIT | (bus << 16) | (device << 11) | (func << 8) |
                 (pcireg << 2);

  // write the index value onto the index port
  outl(index, PCI_CONFIG_ADDRESS);

  // read a value from the data port
  return inl(PCI_CONFIG_DATA);
}

int realtek8319Found = 0;

unsigned char pci_bus = 0;
unsigned char pci_device = 0;
unsigned char pci_device_fn = 0;

// there are 256 busses allowed
for (bus = 0; bus != 0xff; bus++) {

// per bus there can be at most 32 devices
for (device = 0; device < 32; device++) {

  // every device can be multi function device of up to 8 functions
  for (func = 0; func < 8; func++) {

    // read the first dword (index 0) from the PCI configuration space
    // dword 0 contains the vendor and device values from the PCI configuration space
    data = r_pci_32(bus, device, func, 0);
    if (data != 0xffffffff) {
       
       // parse the values
       u16int device_value = (data >> 16);
       u16int vendor = data & 0xFFFF;

       // check vendor and device against the values of the RTL 8139 PCI device
       realtek8319Found = 0;
       if (vendor == 0x10ec && device_value == 0x8139) {

        realtek8319Found = 1;

        pci_bus = bus;
        pci_device = device;
        pci_device_fn = func;

        k_printf("RTL8139 found! bus: %d", pci_bus);
        k_printf(" device: %d", pci_device);
        k_printf(" func: %d \n", pci_device_fn);
      }

    }
  }
}

 

ioaddr

If the Realtek 8139 is build into a PC, it gets a ioaddr assigned during system boot. The device is mapped into memory at that ioaddr. By writing or reading data from memory at that ioaddr, the operating system can configure the card.

The ioaddr can be read from the PCI configuration space at the byte 4. Byte 4 is where the command register starts. The ioaddr is stored in the lowest three bits of the command register.

// read the ioaddr/base_address
u32int pci_ioaddr = r_pci_32(pci_bus, pci_device, pci_device_fn, 4);
k_printf("pci_ioaddr: 0x%x \n", pci_ioaddr);

unsigned long ioaddr = pci_ioaddr & ~3;
k_printf("ioaddr: 0x%x \n", ioaddr);

Using the ioaddr, the driver can power up the card.

Powering up the card

Write the value 0 into the config1 address via the ioaddr.

// write a byte out to the specified port.
void outb(u8int value, u16int port) {

  __asm__ __volatile__("outb %1, %0" : : "dN"(port), "a"(value));
}

outb(0x00, ioaddr + Config1);
k_printf("starting chip done.\n");
Bus Mastering

Next step is to enable bus mastering. If you do not enable bus mastering, qemu will not transfer any data between the memory of the operating system and the memory on the RTL 8139 network card but it will transfer zeroes instead.

A transfer of data is necessery to send a packet or to receive packets. To send a packet, first the data is copied from the memory of the operating system into a buffer on the PCI card. From the buffer the card transfers the data onto the wire.

The transfer of data is performed via DMA (Direct Memory Access). If a PCI card is not assigned rights to be the bus master, it cannot perform DMA. Only the bus master is allowed to perform DMA. (Sidenote: It was reported that on some real hardware, enabling bus mastering is not needed. qemu was updated to make bus mastering mandatory. If you test on qemu, you need this step)

If bus mastering is turned off, qemu will not copy any data to the card but it will only copy zeroes.

The same goes for receiving. The PCI card receives data from the wire and writes that data into a buffer. The operating system will copy data from the buffer into the memory of the operating system via DMA. If bus mastering is turned off, qemu will only transfer zeroes instead of the real data.

To enable bus mastering, you have to set bit 3 (zero indexed, bit3 is actually the fourth bit if you start counting from 1 instead from 0) inside the command register.

The bit is set by reading the command register, flipping bit 3 and writing the value back into the command register.

// https://wiki.osdev.org/RTL8139
// enable bus mastering in the command register
// Some BIOS may enable Bus Mastering at startup, but some versions
// of qemu don't. You should thus be careful about this step.
k_printf("BUS mastering ...\n");

u16int command_register =
    pci_read_word(pci_bus, pci_device, pci_device_fn, 0x04);

k_printf("BUS mastering command_register = %x\n", command_register);

command_register |= 0x04;

pci_write_word(pci_bus, pci_device, pci_device_fn, 0x04, command_register);

command_register = pci_read_word(pci_bus, pci_device, pci_device_fn, 0x04);

k_printf("BUS mastering command_register = %x\n", command_register);
Software Reset

Next is a software reset

// software reset
// https://wiki.osdev.org/RTL8139
// Sending 0x10 to the Command register (0x37) will send the RTL8139 into a
// software reset. Once that byte is sent, the RST bit must be checked to
// make sure that the chip has finished the reset. If the RST bit is high
// (1), then the reset is still in operation.

// ChipCmd is the Command Register 0x37 = 55
// 0x10 == 0001 0000 == bit 5
// k_printf("Reset the chip %d ...\n", i);
outb(0x10, ioaddr + ChipCmd);
while ((inb(ioaddr + ChipCmd) & 0x10) != 0) {
  k_printf("waiting for reset!\n");
}
k_printf("Reset done.\n");
Enable Receiver and Transmitter
// enable receiver and transmitter
// Sets the RE and TE bits high
// k_printf("Enable receiver and transmitter %d...\n", i);
// 0x0C = 1100 = bit 2 und bit 3
outb(0x0C, ioaddr + ChipCmd);
k_printf("Enable receiver and transmitter done.\n");
Set Transmit and Receive Configuration Registers
// https://www.lowlevel.eu/wiki/RTL8139
// CR (Transmit Configuration Register, 0x40, 4 Bytes) und RCR
// (Receive Configuration Register, 0x44, 4 Bytes) setzen.
outl(0x03000700, ioaddr + TxConfig);
outl(0x0000070a, ioaddr + RxConfig);
Configuration Done

At this point the RTL 8139 is ready to send and receive data. Next the Sending of data is explained.

Sending Data

The data to send is written into a buffer (byte array) in operating system memory. Then the buffer is transferred over to the card via DMA (which is why the driver enables bus mastering). You have to specify the physical address for DMA! The PCI card does not understand paging! It only reads from memory at physical locations and does no go through the  memory management unit.

My tip for you is to turn off paging during your initial tests with the RTL 8139 just to rule out that source of error.

TSAD and TSD

The way that the RTL 8139 accepts data for sending is explained in this section. On a more abstract level, the card has four hardware buffers for sending.  Those buffers are also called descriptors. At any one point in time, there is only a single hardware buffer active. After the reset of the card during initialization, the buffer with index 0 is the active buffer.

The card will send the data stored in the currently active hardware buffer and then make the next hardware buffer in line the active buffer. Once data has been send from buffer 3, the index is reset to 0 and 0 is active again.

Each one of the four hardware buffers is implemented via registers which are available via two memory locations. There is a memory location called TSAD and one called TSD per hardware buffer.

TSAD is the transmission start register. It has to contain the physical address of the buffer that contains the data that the operating system wants to send. The data is transferred between the operating system and the card via DMA in the first step. Once the data is stored in the card’s internal memory, it is transferred onto the wire from there.

TSD is the transmission status or transmission control register and has to be set to contain the length of the data to send in bits 0 to 12 which is the length of the buffer in TSAD in bytes. Also the bit 13 (OWN bit) has to be set to 0. If the OWN bit is zero (low), the hardware on the RTL 8139 card will start to transmit the data to the card and from the card onto the wire. If the DMA transfer between the operationg system and the card was successfull, the OWN bit is set to 1 (high) by the hardware. Once the OWN bit is high, the card will start to transfer the data from the cards internal memory over the wire. I think that the name OWN was choosen to tell the user that the card now owns the data to transfer.

For each of the four buffers there is a pair of TSAD and TSD. The addresses are:

// TSAD = Transmit Start Registers = 32bit = Physical Address of data to
// be sent
u8int TSAD_array[4] = {0x20, 0x24, 0x28, 0x2C};

// TSD - Transmit Status / Command Registers = 32bit
u8int TSD_array[4] = {0x10, 0x14, 0x18, 0x1C};

The operating system has to remember which is the currently active buffer because it is not possible to ask the RTL 8139 card about which buffer is active at the moment. The variable tx_cur is used to store the index of the active buffer.

int tx_cur = 0;

The operating system prepares a buffer (array of byte / char) of data to send. For this example, let’s send 256 bytes containing the ASCII character ‘A’ which has the hex code 0x41 or decimal code 65.

int len = 256;
unsigned char tx_buffer[len];
for (int i = 0; i < len; i++) {
    tx_buffer[i] = 'A';
}

The variable len stores the size of the buffer.

Fill TSAD and TSD of the currently active buffer with the data to send.

// Second, fill in physical address of data to TSAD
outl(tx_buffer, ioaddr + TSAD_array[tx_cur]);

// Fill the length to TSD and start the transmission by setting the OWN
// bit to 0 Start https://wiki.osdev.org/RTL8139#Transmitting_Packets
u32int status = 0;
status |= len & 0x1FFF; // 0-12: Length
status |= 0 << 13;      // 13: OWN bit

outl(status, ioaddr + TSD_array[tx_cur]);

Wait until the OK bit (bit 15) is high. This signals that the transmission is completed. The OWN bit will tell you, when the data was transferred between the operating system and the card. Once the data is stored on the card, it will start to transmit that data over the wire. Once the wire transfer is complete, the card will set the OK bit in the TSD to high which means that the transfer is done and the next transfer buffer is active.

u32int transmit_ok = inl(ioaddr + TSD_array[tx_cur]);
while (transmit_ok & (1 << 15) == 0) {
    k_printf("Waiting for transmit_ok ...\n");
    transmit_ok = inl(ioaddr + TSD_array[tx_cur]);
}
k_printf("Waiting for transmit_ok done. transmit_ok = %d\n", transmit_ok);

Tell the operating system which buffer is active after the last buffer was used. In order to do that, increment tx_cur and wrap around back to zero if the last buffer was used in the prior send operation.

tx_cur++;
if (tx_cur > 3) {
    tx_cur = 0;
}

Now that you are able to send an arbitray byte array into the network, you have to learn how to construct valid ethernet frames for a protocol such as ARP, ICMP, DHPC, TCP, IP, HTTP or anything else. This is not the RTL 8139 driver’s job so the details are not explained in this article.

Constructing the frames for a specific protocol in the OSI model is the job of the so called IP-stack.

Retrieving the MAC Address

The RTL 8139 sends and receives data and is therefore a part of a network. As such it needs an address so packets can be sent point to point between the sender and the receiver.

On the lower levels of the OSI stack where Ethernet frames are sent, the MAC address is used for this purpose. A MAC address is a unique address assigned to a RLT 8139 during manufacturing.

When implementing ARP for example, you need to know the MAC address of your card. This section explains how to retrieve the NIC’s MAC address.

On qemu, you can specify the MAC address on the command line. Knowing the MAC address when testing code is a big advantage because as soon as you retrieve the expected MAC address, it is proven that the code works correctly.

The qemu command line parameter mac specifies the mac address.

/home/<user>/dev/qemu/build/i386-softmmu/qemu-system-i386 \
-monitor stdio \
-cdrom image.iso \
-netdev user,id=network0 \
-device rtl8139,netdev=network0,mac=52:54:00:12:34:56 \
-object filter-dump,id=network_filter_object,netdev=network0,file=dump.dat

Here 52:54:00:12:34:56 is used as a mac address.

The MAC address is stored in a EEPROM chip on the card. To read the EEPROM you need a function.

// Delay between EEPROM clock transitions.
// No extra delay is needed with 33Mhz PCI, but 66Mhz may change this.
#define eeprom_delay() inl(ee_addr)

// The EEPROM commands include the alway-set leading bit.
#define EE_WRITE_CMD (5 << 6)
#define EE_READ_CMD (6 << 6)
#define EE_ERASE_CMD (7 << 6)

static int read_eeprom(long ioaddr, int location) {

  unsigned retval = 0;
  long ee_addr = ioaddr + Cfg9346;
  int read_cmd = location | EE_READ_CMD;

  outb(EE_ENB & ~EE_CS, ee_addr);
  outb(EE_ENB, ee_addr);

  // Shift the read command bits out.
  for (int i = 10; i >= 0; i--) {

    int dataval = (read_cmd & (1 << i)) ? EE_DATA_WRITE : 0;

    outb(EE_ENB | dataval, ee_addr);
    eeprom_delay();

    outb(EE_ENB | dataval | EE_SHIFT_CLK, ee_addr);
    eeprom_delay();
  }

  outb(EE_ENB, ee_addr);
  eeprom_delay();

  for (int i = 16; i > 0; i--) {

    outb(EE_ENB | EE_SHIFT_CLK, ee_addr);
    eeprom_delay();

    retval = (retval << 1) | ((inb(ee_addr) & EE_DATA_READ) ? 1 : 0);

    outb(EE_ENB, ee_addr);
    eeprom_delay();
  }

  // Terminate the EEPROM access.
  outb(~EE_CS, ee_addr);

  return retval;
}

Using this function, the MAC can be read and stored into an array. The array is then output to show that the correct MAC address is read.

// prepare mac address read
int mac_address_index = 0;
u32int mac_address[6];
for (int i = 0; i < 6; i++) {
  mac_address[i] = 0;
}

// Read EEPROM
//
// Read the MAC Addresses from the NIC's EEPROM memory chip
// k_printf("read_eeprom() ...\n");

int readEEPROMResult = read_eeprom(ioaddr, 0) != 0xffff;
if (readEEPROMResult) {

  // loop three times to read three int (= 32 bit)
  for (int i = 0; i < 3; i++) {

    u16int data = read_eeprom(ioaddr, i + 7);

    mac_address[mac_address_index] = data & 0xFF;
    mac_address[mac_address_index + 1] = data >> 8;

    mac_address_index += 2;
  }

} else {

  // loop six times
  for (int i = 0; i < 6; i++) {

    u16int data = inb(ioaddr + i);

    mac_address_index += 1;
  }
}

// DEBUG: print MAC Address
k_printf("MAC: ");
for (int i = 0; i < 6; i++) {
  k_printf("%x:", mac_address[i]);
}
k_printf("\n");

 

Debugging

This section will introduce you to two ways of debugging the process of sending data using the RTL 8139.

The first method is telling qemu to dump all incoming and outgoing packages to a file. The file is in the pcap format which makes it possible to open the file in wireshark. wireshark is a networking tool that can display all field in ethernet packages and knows a large array of protocols for detailed display of all fields in packets.

If your RTL driver sends data, you can look at what data is send by loading the dump file and looking at the send packets using wireshark.

The second method is to compile qemu and enable the debug output in the emulation layer of the RTL 8139 card. Sadly there is no command line parameter to enable the RTL 8139 debug output. You can only enable the debug output by changing qemu’s code and and compiling qemu. This sounds hard but it actually is pretty easy. If I managed to do it, you will easily be able to do it as well. This method was only tested on a Ubuntu linux. The steps to compile on windows or mac are unknown to me. You can follow method 2 on Ubuntu linux easily.

Dumping Network Traffic with qemu

qemu internally contains so called objects for diverse purposes. One of those objects is the filter-dump object. You can apply the filter-dump object to one of the network interface cards to dump all packets into a file.

/home/<user>/dev/qemu/build/i386-softmmu/qemu-system-i386 \
-monitor stdio \
-cdrom image.iso \
-netdev user,id=network0 \
-device rtl8139,netdev=network0,mac=52:54:00:12:34:56 \
-object filter-dump,id=network_filter_object,netdev=network0,file=dump.dat

The filter-dump object is pointed to the netdev. It will capture traffic on that netdev. The netdev is the RTL 8139 NIC. The output file is called dump.dat it is written into the folder where you start qemu.

Open dump.dat in qemu. You should see the packet you have sent! If the RTL 8139 only sends zeroes, check that you are specifying virtual addresses and check the code that enables bis mastering.

Compile qemu and Enable RTL Debug Output
https://forum.osdev.org/viewtopic.php?f=1&t=28285

In qemu/hw/net/rtl8139.c

#define DEBUG_RTL8139 1
replace by
define DEBUG_RTL8139 1

Build Qemu
https://en.wikibooks.org/wiki/QEMU/Installing_QEMU

0. sudo apt-get install libglib2.0-dev libpango1.0-dev libatk1.0-dev libsdl2-dev
1. git clone git://git.qemu-project.org/qemu.git
2. cd qemu
3. git submodule init
4. git submodule update --recursive
5. git submodule status --recursive
6. git checkout stable-4.1
7. mkdir build
8. cd build
9. ../configure --disable-kvm --prefix=PFX --target-list="i386-softmmu x86_64-softmmu" --enable-sdl
10. make

In step 6, replace the version number with the most current qemu release.
In step 9, the command specifies targets and only lists i386. That way only x86 32 bit qemu is built.
If you call ../configure without additional parameters, qemu will be build for all possible targets which will take forever.

The qemu executable will be placed inside build folder. For example in /home/<user>/dev/qemu/build/i386-softmmu/qemu-system-i386 

Now qemu will output debug statements to the command line. You should see lines like these:

RTL8139: +++ transmitting from descriptor 0
RTL8139: +++ transmit reading 42 bytes from host memory at 0x0010504a
RTL8139: +++ transmitted 42 bytes from descriptor 0

 

 

Leave a Reply