6.1810 Lab1 util: Unix utilities

本文最后更新于 346 天前，内容如有失效请评论区留言。

MIT 6.1810: Operating System Engineering 是麻省理工学院大名鼎鼎的 PDOS 实验室开设的面向MIT本科生的操作系统课程。开设这门课的教授之一 —— Robert Morris 教授曾是一位顶尖黑客，世界上第一个蠕虫病毒 Morris 就是出自他之手。

这门课的前身是 MIT 著名的课程 6.828，MIT 的几位教授为了这门课曾专门开发了一个基于 x86 的教学用操作系统 JOS，被众多名校作为自己的操统课程实验。但随着 RISC-V 的横空出世，这几位教授又基于 RISC-V 开发了一个新的教学用操作系统 xv6，并开设了 MIT6.S081 这门课。由于 RISC-V 轻便易学的特点，学生不需要像此前 JOS 一样纠结于众多 x86 “特有的”为了兼容而遗留下来的复杂机制，而可以专注于操作系统层面的开发。

MIT 6.1810 是我一直想要学习的课程。本科的 OS 课是纯理论课，尽管学习和移植了 RTOS，但对 OS 的理解仍旧尚少，这也让我学习 Linux 内核很吃力，推免结束后趁时间空余就想着刷刷学习下吧，抱着学习和欣赏的态度认识一下传说中的 xv6 —— 精简优美的 UNIX 实现。

由于 Lab1 因为各种原因拖了很久才完成，完成后文档也拖了好几天才动笔，因此所以笔者文档尽量从简，有些细节步骤和探索懒得写了或者忘记了。

🌟笔者综合考虑最终选择了 2023 版本的 xv6-labs，Xv6 代码实现以及相关学习用到的 PDF 资料请参见：GitHub 传送门⭐。

实验环境搭建

安装 Ubuntu 20.04

参考 Tools Used in 6.1810，建议使用 Ubuntu-20.04，为了避免后续环境不一致带来的麻烦，我就乖乖装了一个新 WSL 配了一个全新的环境。（后续才知道 Ubuntu-22.04 体验更好，这是一个吃力不讨好的决定）

wsl --install -d Ubuntu-20.04

安装 QEMU

Building QEMU for Linux，除了 required package，还需要手动安装 flex 和 bison。

注意：QEMU 不要按照实验文档直接用 apt 就装了，这样装的 QEMU 版本太老，甚至达不到实验文档上的 5.1.0 版本。因此笔者这里选择了手动编译安装 QEMU，考虑到 20.04 版本避免兼容性问题，笔者选取了一个较新但不是最新的版本（没有选最新的 9.x 的 QEMU）。

qemu-system-riscv64 --version
QEMU emulator version 8.0.5
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

Xv6 Debug

这部分需要花一定时间来探索了，可以 Google：xv6 vscode 或者 xv6 debug。笔者仅简单列出参考过的视频和资料。

Xv6 代码导读 (调试工具配置；调试系统调用执行) [南京大学2022操作系统-P18]

在 VSCode 中调试 xv6

MIT 6.S081 xv6调试不完全指北

关于 riscv64-unknown-elf-gdb 和 gdb-dashboard 的配置可以参考下面的文档。笔者用的 riscv64-unknown-elf-gdb 版本为 12.1，更新版本的在 20.04 似乎装不上。

rCore-Tutorial-使用 GDB 对 rCore 进行 debug*

关于 GDB：

GDB – Init File

GDB调试多线程及多进程

关于如何调试 xv6 上的 user program：

可以通过 add-symbol-file 或者 file 来添加 symbol，例如 add-symbol-file user/_sleep。
打断点可以精确到源文件，例如 b sleep.c:7。
调试时可以仅启用 1 个 CPU，即 make CPUS=1 qemu-gdb。

笔者实际上已经都配置好了 gdb 和 vscode debug 都配好了，懒得再写了，后续有新的进展可能会补坑吧。

MIT-6.1810 启动

MIT-6.1810，启动！

下图只是为了展示 gdb-dashboard 在 tmux 中可以分窗格和设置 history 数量（但就是有点麻烦），

gdb-dashboard debug xv6，

一遍顺利启动 xv6

Lab 1

Lab: Xv6 and Unix utilities

开始之前可以先阅读一下 Lab guidance，

首先 clone 仓库，

git clone git://g.csail.mit.edu/xv6-labs-2023
cd xv6-labs-2023

更改 push 源为自己的仓库：

git remote set-url --push origin git@github.com:LRL52/xv6-labs-2023.git

Sleep

easy，nothing to explain.

#include "kernel/types.h"
#include "user/user.h"

int main(int argc, char *argv[]) {
    int n;

    if (argc != 2) {
        fprintf(2, "usage: sleep <ticks>\n");
        exit(1);
    }

    n = atoi(argv[1]);
    sleep(n);
    exit(0);

    return 0;
}

pingpong

easy，nothing to explain.

#include "kernel/types.h"
#include "user/user.h"

#define PIPE_READ 0
#define PIPE_WRITE 1

int main() {
    int pipe1[2], pipe2[2];
    char c;

    pipe(pipe1); // create pipe1 for parent to child
    pipe(pipe2); // create pipe2 for child to parent

    if (fork() == 0) {
        // child
        close(pipe1[PIPE_WRITE]);
        close(pipe2[PIPE_READ]);

        read(pipe1[PIPE_READ], &c, 1);
        printf("%d: received ping\n", getpid());
        write(pipe2[PIPE_WRITE], "c", 1);

        close(pipe1[PIPE_READ]);
        close(pipe2[PIPE_WRITE]);
    } else {
        // parent
        close(pipe1[PIPE_READ]);
        close(pipe2[PIPE_WRITE]);

        write(pipe1[PIPE_WRITE], "c", 1);
        read(pipe2[PIPE_READ], &c, 1);
        printf("%d: received pong\n", getpid());

        close(pipe1[PIPE_WRITE]);
        close(pipe2[PIPE_READ]);
    }

    return 0;
}

primes

Bell Labs and CSP Threads

传说中的多进程筛素数，很早就看过这个图了，原来是出自这里。

这个就很有意思了，通过 pipe 来筛素数，同时可以让你很好的理解作为一个 programmer 如何使用 fork 和 pipe 来实现多进程的管道通信。

笔者完成之前没有参考现有的代码，并且笔者的代码没有使用递归来实现。需要注意只有最左边的进程是直接从 range [2, 35] 取数，后面的进程都是从左边进程的 pipe 里读数，这点在实现的时候笔者通过 id 变量来区分当前进程是否为第 1 个进程。同时注意 xv6 下每个进程能打开的 file descriptor 有限，每个进程最多打开 16 个 fd（在 kernel/param.h 中有定义 \#define NOFILE 16）。

#include "kernel/types.h"
#include "user/user.h"

#define PIPE_READ 0
#define PIPE_WRITE 1

int main() {
    int pipe_left[2], pipe_right[2];
    int id = 0, initialized = 0;
    int p, x = 2, rc;

BEGIN:
    if (id == 0) {
        p = 2;
    } else {
        read(pipe_left[PIPE_READ], &p, 4);
    }

    printf("prime %d\n", p);

    while (1) {
        if (id == 0) {
            x += 1;
            if (x > 35) {
                break;
            }
        } else {
            rc = read(pipe_left[PIPE_READ], &x, 4);
            if (rc == 0) { // the write-side of the pipe has been closed
                break;
            }
        }
        if (x % p != 0) {
            if (initialized) {
                write(pipe_right[PIPE_WRITE], &x, 4);
            } else {
                pipe(pipe_right);
                if (fork() == 0) {
                    if (id > 0) {
                        close(pipe_left[PIPE_READ]);
                    }
                    memcpy(pipe_left, pipe_right, sizeof(pipe_left));
                    close(pipe_left[PIPE_WRITE]);
                    ++id;
                    goto BEGIN;
                } else {
                    close(pipe_right[PIPE_READ]);
                }
                initialized = 1;
                write(pipe_right[PIPE_WRITE], &x, 4);
            }
        }
    }

    if (id > 0) {
        close(pipe_left[PIPE_READ]);
    }
    if (initialized) {
        close(pipe_right[PIPE_WRITE]);
        wait(&rc);
    }

    return 0;
}

find

这一个需要参考 user/ls.c 中实现的文件遍历，主体逻辑基本上直接抄过来，再改改就行。

#include "kernel/types.h"
#include "kernel/stat.h"
#include "kernel/fs.h"
#include "kernel/fcntl.h"
#include "user/user.h"

#define BUF_SIZE 512

char *get_filename(char *path) {
    char *p;

    for (p = path + strlen(path); p >= path && *p != '/'; --p);
    ++p;

    return p;
}

void find(char *path, char *filename) {
    int fd;
    struct dirent de;
    struct stat st;
    char *p;

    if ((fd = open(path, O_RDONLY)) < 0) {
        fprintf(2, "find: cannot open %s\n", path);
        return;
    }

    if (fstat(fd, &st) < 0) {
        fprintf(2, "find: cannot stat %s\n", path);
        close(fd);
        return;
    }

    switch (st.type) {
        case T_FILE:
            if (strcmp(get_filename(path), filename) == 0) {
                printf("%s\n", path);
            }
            break;

        case T_DIR:
            if (strlen(path) + 1 + DIRSIZ + 1 > BUF_SIZE) { // "1" for "/" and "1" for null terminator
                printf("find: path too long\n");
                break;
            }
            p = path + strlen(path);
            *p++ = '/'; // p points to position after '/' and will not change in the following loop
            while (read(fd, &de, sizeof(de)) == sizeof(de)) {
                if (de.inum == 0) {
                    continue;
                }
                if (strcmp(de.name, ".") == 0 || strcmp(de.name, "..") == 0) {
                    continue;
                }
                memmove(p, de.name, DIRSIZ);
                p[DIRSIZ] = 0;
                if (stat(path, &st) < 0) {
                    fprintf(2, "find: cannot stat %s\n", path);
                    continue;
                }
                find(path, filename);
            }
            break;
    }

    close(fd);
}

int main(int argc, char *argv[]) {
    static char buf[BUF_SIZE];

    if (argc != 3) {
        fprintf(2, "usage: find <path> <filename>\n");
        exit(1);
    }

    strcpy(buf, argv[1]);
    find(buf, argv[2]);

    return 0;
}

xargs

xargs 也是一个很常用的命令。

注意在命令行参数中，argv[0] 是运行程序本身，并且在数量上也计入 argc，argv[argc] 必须为 NULL。

#include "kernel/types.h"
#include "kernel/param.h"
#include "user/user.h"

#define STDIN 0
#define BUF_SIZE 512

int main(int argc, char *argv[]) {
    char *new_argv[MAXARG];
    char buf[BUF_SIZE];
    int i, n, rc, new_argc;

    if (argc < 2) {
        fprintf(2, "usage: xargs command\n");
        exit(1);
    }

    new_argc = argc - 1;
    for (i = 0; i < new_argc; ++i) {
        new_argv[i] = argv[i + 1];
    }

    while (1) {
        n = 0;
        while ((rc = read(STDIN, buf + n, 1)) > 0) { // read one character at a time until newline('\n')
            if (buf[n] == '\n') {
                break;
            }
            ++n;
        }
        if (rc == 0) {
            break;
        }
        buf[n] = 0;
        new_argv[new_argc] = buf;
        new_argv[new_argc + 1] = 0; // argv[] must be terminated by NULL
        if (fork() == 0) {
            exec(argv[1], new_argv);
            exit(1);
        }
        wait(0);
    }


    return 0;
}

测试

Xv6 Book Chapter 1

Xv6 金句摘录，以下仅记录 Xv6 book 中笔者觉得 make sense 或者有用或者自己不太熟悉的部分：

Unix provides a narrow interface whose mechanisms combine well, offering a surprising degree of generality.

The kernel uses the hardware protection mechanisms provided by a CPU to ensure that each process executing in user space can access only its own memory.

The fact that the shell is a user program, and not part of the kernel.

A subsequent read will return the bytes following the ones returned by the first read. Each write picks up where the previous one left off.（read，write 会在上一次 offset 的基础上继续读写）

A newly allocated file descriptor is always the lowest-numbered unused descriptor of the current process.（新分配的文件描述符始终是当前进程中编号最小的未使用描述符）

Why it is helpful that fork and exec are separate calls？

A：Between the two, the shell has a chance to redirect the child’s I/O without disturbing the I/O setup of the main shell.

Each underlying file offset is shared between parent and child.（每个底层文件偏移量在父进程和子进程之间共享）

The dup system call duplicates an existing file descriptor, returning a new one that refers to the same underlying I/O object. Both file descriptors share an offset, just as the file descriptors duplicated by fork do.

Open with the O_CREATE flag creates a new data file, and mknod creates a new device file.

mknod creates a special file that refers to a device. Associated with a device file are the major and minor device numbers.

A file’s name is distinct from the file itself; the same underlying file, called an inode, can have multiple names, called links. Each link consists of an entry in a directory; the entry contains a file name and a reference to an inode. An inode holds metadata about a file, including its type (file or directory or device), its length, the location of the file’s content on disk, and the number of links to a file.

Each inode is identified by a unique inode number.

fd = open("/tmp/xyz", O_CREATE|O_RDWR);
unlink("/tmp/xyz");

上述是创建没有名称的临时索引节点的惯用方法，该索引节点将在进程关闭时被清理 fd 或退出。

Xv6 is not POSIX compliant: it is missing many system calls (including basic ones such as lseek), and many of the system calls it does provide differ from the standard.

Xv6 does not provide a notion of users or of protecting one user from another; in Unix terms, all xv6 processes run as root.

Reference

PS：这门课的资源简直不要太丰富，课程有精校中文字幕的可以听，不想听还可以直接看文字版笔记。教材还有中文翻译的，实验文档也有中文的（见 xv6.dgs.zone，但中文版的内容不齐全建议阅读英文的）。以及开始之前还有 jyy 大佬带你领学 xv6 代码。

MIT 6.1810: Operating System Engineering 2023

MIT 6.S081 课堂文字版笔记

xv6: a simple, Unix-like teaching operating system

xv6-riscv-book-zh-cn 课程教材翻译

6.S081-All-In-One-Gitbook(xv6.dgs.zone)

The xv6 Kernel（哔哩哔哩）

The xv6 Kernel（YouTube）

yt

Windows Edge 131.0.0.0

11 月前
2025-1-04 14:03:11

请问用vscode是怎么使用qemu monitor的，比如想info mem该怎么办
- admin52
  博主
  yt
  
  Android Chrome 131.0.0.0
  
  11 月前
  2025-1-04 14:33:30
  
  这个肯定只能在 terminal 里操作啦。其实 gdb 命令行调试的自由度更高，虽然开始我配了 vscode 调试环境，但实际上我是 vscode 写代码开发，gdb 终端调试
- - yt
    
    admin52
    
    Windows Edge 131.0.0.0
    
    11 月前
    2025-1-04 15:03:16
    
    求教，用terminal咋操作，教授说ctrl+A+C可以调出qemu monitor，但我按了没反应呀
  - - admin52
      博主
      yt
      
      Android Chrome 131.0.0.0
      
      11 月前
      2025-1-04 15:13:22
      
      先按 Ctrl+A，按了后松开，再按 C。如果还没反应，检查下是否有键位冲突了，比如和 tmux 键位冲突
    - - yt
        
        admin52
        
        Windows Edge 131.0.0.0
        
        11 月前
        2025-1-04 15:20:08
        
        是用windows的cmd进入wsl，然后操作吗，我是这么操作的但是没用
      - admin52
        博主
        yt
        
        Android Chrome 131.0.0.0
        
        11 月前
        2025-1-04 17:18:27
        
        是的，我也用的 WSL 完成的 labs

标签云

最新文章

实验环境搭建

安装 Ubuntu 20.04

安装 QEMU

Xv6 Debug

MIT-6.1810 启动

Lab 1

Sleep

pingpong

primes

find

xargs

测试

Xv6 Book Chapter 1

Reference

评论

发送评论编辑评论

2025 年 11 月
一	二	三	四	五	六	日
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30

实验环境搭建

安装 Ubuntu 20.04

安装 QEMU

Xv6 Debug

MIT-6.1810 启动

Lab 1

Sleep

pingpong

primes

find

xargs

测试

Xv6 Book Chapter 1

Reference

评论

发送评论 编辑评论

发送评论编辑评论