waitid System Call and Examples


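1. Function Overview

On Linux, when a process creates a child with fork, the parent usually needs to know when the child ends (exits or is terminated) and how it ended (its exit code, or which signal killed it). This is an important part of process management and resource reclamation.

The waitid system call lets a parent wait for a state change in one child, or in any of a group of children, and obtain detailed information about that change.

You can think of it as a “process state listener”. After calling waitid, the parent blocks until one of the children it is interested in experiences an event of the requested kind (exiting, being killed by a signal, stopping, continuing). When that happens, waitid returns and fills a structure with the details (which child, and what happened to it).

Compared with the older wait and waitpid, waitid is more flexible: idtype and id select the children, options selects the events, and the result arrives in a siginfo_t structure instead of a packed status integer.

Put simply, waitid lets your program wait for and collect a child's “death/stop/resume” notice, and that notice is very detailed.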

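2. Function Prototype

#include <sys/wait.h> // Declares waitid and the related constants

int waitid(idtype_t idtype, id_t id, siginfo_t *infop, int options);

3. What It Does

Suspends the caller until a child selected by idtype and id changes state in one of the ways requested in options. When a matching child changes state, detailed status information is written to the siginfo_t structure that infop points to.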

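4. Parameters in Detail

  • idtype:
    • Type idtype_t.
    • Selects which children to wait for and determines how id is interpreted. Common values:
      • P_PID: wait for the child whose process ID (PID) equals id.
      • P_PGID: wait for any child whose process group ID (PGID) equals id.
      • P_ALL: wait for any child of the caller (id is ignored).
  • id:
    • Type id_t.
    • Its meaning depends on idtype:
      • If idtype is P_PID, id is the PID of the child to wait for.
      • If idtype is P_PGID, id is the PGID of the children to wait for.
      • If idtype is P_ALL, id is ignored (conventionally set to 0).
  • infop:
    • Type siginfo_t *.
    • Pointer to a siginfo_t structure. When waitid succeeds, the kernel fills it with details about the child whose state changed.
    • siginfo_t has many fields; the key ones are:
      • si_pid: PID of the child whose state changed.
      • si_status: the child's exit status, or the number of the signal that caused the state change.
      • si_code: the reason for the state change, for example:
        • CLD_EXITED: the child exited normally by calling exit() or returning from main.
        • CLD_KILLED: the child was killed by a signal.
        • CLD_DUMPED: the child was killed by a signal and produced a core dump.
        • CLD_STOPPED: the child was stopped by a signal (such as SIGSTOP).
        • CLD_CONTINUED: the stopped child was resumed by SIGCONT.
      • … and other fields, such as si_signo (always SIGCHLD).
  • options:
    • Type int.
    • A bit mask that selects which state changes to wait for and how the call behaves. It is the bitwise OR (|) of the following values:
      • State types (at least one is required):
        • WEXITED: wait for children that have terminated, whether by a normal exit or by a signal.
        • WSTOPPED: wait for children that have been stopped by a signal (typically SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU).
        • WCONTINUED: wait for stopped children that have been resumed by SIGCONT.
      • Behavior flags (optional):
        • WNOHANG: non-blocking. If no matching child has changed state yet, waitid returns 0 immediately instead of suspending the caller. To tell this case apart from a real event, zero si_pid before the call and check whether it is nonzero afterwards.
        • WNOWAIT: do not reap. The status is returned, but the child is left in a waitable state, so a later wait call can retrieve the same information again.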

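5. Return Value

  • Success: returns 0. This includes the WNOHANG case in which matching children exist but none has changed state yet.
  • Failure: returns -1 and sets errno to indicate the error.

6. Error Codes (errno)

  • ECHILD: the caller has no child matching idtype and id, for example because the given PID is not a child of the caller or has already been reaped. (With WNOHANG, a child that exists but has not changed state yields a return value of 0, not ECHILD.)
  • EINTR: the call was interrupted by a signal.
  • EINVAL: idtype or options is invalid.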
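
EINTR is usually not a real failure: it only means a signal handler ran while the call was blocked. A common way to deal with it is a retry loop, sketched below. The wrapper name waitid_retry is an assumption made for illustration; all other errors, including ECHILD, are passed back to the caller unchanged.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical wrapper: call waitid again whenever a signal handler
 * interrupts it, and pass every other result through unchanged. */
int waitid_retry(idtype_t idtype, id_t id, siginfo_t *info, int options)
{
    int rc;
    do {
        rc = waitid(idtype, id, info, options);
    } while (rc == -1 && errno == EINTR);
    return rc;
}
```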

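7. Similar and Related Functions

  • wait: the most basic function for waiting on children. It waits for any child to terminate and returns its PID and a status word (decoded with macros such as WIFEXITED and WEXITSTATUS). pid_t wait(int *wstatus);
  • waitpid: an enhanced wait. It can wait for a specific PID and accepts options such as WNOHANG. pid_t waitpid(pid_t pid, int *wstatus, int options);
  • wait3 / wait4: older functions similar to waitpid that can additionally return resource usage information (struct rusage).
  • siginfo_t: the key data structure used by waitid, holding the detailed child status information.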
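
To make the difference between the two result formats concrete, the sketch below decodes the same termination once from a waitpid status word and once from a siginfo_t. Both helper names are hypothetical, and the "128 + signal number" convention is borrowed from common shell behavior purely as an illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: shell-style exit code from a wait/waitpid status. */
int code_from_wstatus(int wstatus)
{
    if (WIFEXITED(wstatus))
        return WEXITSTATUS(wstatus);
    if (WIFSIGNALED(wstatus))
        return 128 + WTERMSIG(wstatus);
    return -1;  /* stopped or continued: not a termination */
}

/* Hypothetical helper: the same decoding from a waitid result. */
int code_from_siginfo(const siginfo_t *info)
{
    switch (info->si_code) {
    case CLD_EXITED:
        return info->si_status;        /* si_status is the exit status */
    case CLD_KILLED:
    case CLD_DUMPED:
        return 128 + info->si_status;  /* si_status is the signal number */
    default:
        return -1;                     /* stopped or continued */
    }
}
```

With waitid there are no macros to remember: si_code names the kind of event and si_status carries the matching number.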

8. Example Code

The example below shows how to use waitid to wait for different kinds of child events.

#define _GNU_SOURCE // Enable GNU extensions
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>   // waitid, siginfo_t, etc.
#include <signal.h>     // kill, SIG* constants
#include <string.h>
#include <errno.h>

// Helper: print the information held in a siginfo_t
void print_siginfo(const siginfo_t *info) {
    printf("  Child PID: %d\n", (int)info->si_pid);
    printf("  Signal/Exit Code: %d\n", info->si_status);
    printf("  Reason Code: ");
    switch (info->si_code) {
        case CLD_EXITED:
            printf("CLD_EXITED (Child called exit())\n");
            printf("    Exit Status: %d\n", info->si_status);
            break;
        case CLD_KILLED:
            printf("CLD_KILLED (Child was killed by signal)\n");
            printf("    Signal Number: %d\n", info->si_status);
            break;
        case CLD_DUMPED:
            printf("CLD_DUMPED (Child killed by signal and dumped core)\n");
            printf("    Signal Number: %d\n", info->si_status);
            break;
        case CLD_STOPPED:
            printf("CLD_STOPPED (Child was stopped by signal)\n");
            printf("    Stop Signal Number: %d\n", info->si_status);
            break;
        case CLD_CONTINUED:
            printf("CLD_CONTINUED (Child continued)\n");
            // si_status for CLD_CONTINUED is not defined to be meaningful
            break;
        default:
            printf("Unknown reason code: %d\n", info->si_code);
            break;
    }
}

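int main() {
    pid_t pid1, pid2, pid3;
    siginfo_t info;

    // Unbuffered stdout, so text still sitting in the stdio buffer at fork()
    // time is not printed again by each child when output is redirected.
    setvbuf(stdout, NULL, _IONBF, 0);

    printf("--- Demonstrating waitid ---\n");
    printf("Parent PID: %d\n", getpid());

    // 1. Create a child that exits normally
    pid1 = fork();
    if (pid1 == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid1 == 0) {
        // --- Child 1 ---
        printf("[Child 1, PID %d] Running for 3 seconds then exiting with status 42.\n", getpid());
        sleep(3);
        exit(42);
    }

    // 2. Create a child that will be killed by a signal
    pid2 = fork();
    if (pid2 == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid2 == 0) {
        // --- Child 2 ---
        printf("[Child 2, PID %d] Running for 5 seconds then will be killed by SIGTERM.\n", getpid());
        sleep(5);
        // Not reached: the parent sends SIGTERM before the sleep ends
        exit(0);
    }

    // 3. Create a child that will be stopped and resumed
    pid3 = fork();
    if (pid3 == -1) {
        // Never continue with pid3 == -1: kill(-1, sig) would signal
        // every process the caller is allowed to signal.
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid3 == 0) {
        // --- Child 3 ---
        printf("[Child 3, PID %d] Running, then will stop and continue.\n", getpid());
        printf("[Child 3] Entering loop, press Ctrl+Z in another terminal to stop me (if I'm foreground).\n");
        printf("[Child 3] Or, the parent will send SIGSTOP and SIGCONT.\n");
        int counter = 0;
        while (counter < 10) {
            printf("[Child 3] Working... %d\n", counter++);
            sleep(1);
        }
        printf("[Child 3] Finished normally.\n");
        exit(100);
    }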

    // --- Parent Process ---
    printf("[Parent] Created children: PID1=%d, PID2=%d, PID3=%d\n", pid1, pid2, pid3);

    // Give the children a moment to start
    sleep(1);

    // 4. Send SIGSTOP to Child 3 to stop it
    printf("\n[Parent] Sending SIGSTOP to Child 3 (PID %d)...\n", pid3);
    if (kill(pid3, SIGSTOP) == -1) {
        perror("[Parent] kill SIGSTOP");
    }

    // Wait for Child 3 to stop
    printf("[Parent] Waiting for Child 3 to stop using waitid(WSTOPPED)...\n");
    memset(&info, 0, sizeof(info)); // Zero the structure
    if (waitid(P_PID, pid3, &info, WSTOPPED) == -1) {
        perror("[Parent] waitid for stop");
    } else {
        printf("[Parent] Detected Child 3 stopped:\n");
        print_siginfo(&info);
    }

    // 5. Wait for Child 1 to exit normally
    printf("\n[Parent] Waiting for Child 1 to exit using waitid(WEXITED)...\n");
    memset(&info, 0, sizeof(info));
    if (waitid(P_PID, pid1, &info, WEXITED) == -1) {
        perror("[Parent] waitid for exit pid1");
    } else {
        printf("[Parent] Detected Child 1 exited:\n");
        print_siginfo(&info);
    }

    // 6. Send SIGTERM to Child 2 to terminate it
    printf("\n[Parent] Sending SIGTERM to Child 2 (PID %d)...\n", pid2);
    if (kill(pid2, SIGTERM) == -1) {
        perror("[Parent] kill SIGTERM");
    }

    // Wait for Child 2 to be killed (WEXITED also covers death by signal)
    printf("[Parent] Waiting for Child 2 to be killed using waitid(WEXITED)...\n");
    memset(&info, 0, sizeof(info));
    if (waitid(P_PID, pid2, &info, WEXITED) == -1) {
        perror("[Parent] waitid for exit pid2");
    } else {
        printf("[Parent] Detected Child 2 killed/exited:\n");
        print_siginfo(&info);
    }

    // 7. Send SIGCONT to Child 3 to resume it
    printf("\n[Parent] Sending SIGCONT to Child 3 (PID %d)...\n", pid3);
    if (kill(pid3, SIGCONT) == -1) {
        perror("[Parent] kill SIGCONT");
    }

    // The CLD_CONTINUED event could be collected here by passing WCONTINUED.
    // This example skips that step and waits only for the final exit.
    printf("[Parent] Waiting for Child 3 to continue and then exit using waitid(WCONTINUED | WEXITED)...\n");
    printf("[Parent] (WCONTINUED detection might be unreliable, waiting for exit instead)\n");
    memset(&info, 0, sizeof(info));
    // Wait only for the final exit
    if (waitid(P_PID, pid3, &info, WEXITED) == -1) {
        perror("[Parent] waitid for exit pid3");
    } else {
        printf("[Parent] Detected Child 3 exited:\n");
        print_siginfo(&info);
    }

    // 8. Demonstrate WNOHANG (non-blocking)
    printf("\n[Parent] Demonstrating WNOHANG...\n");
    printf("[Parent] Calling waitid(P_ALL, 0, info, WEXITED | WNOHANG)...\n");
    memset(&info, 0, sizeof(info));
    int result = waitid(P_ALL, 0, &info, WEXITED | WNOHANG);
    if (result == -1) {
        // All three children have already been reaped, so this is the branch
        // taken here: waitid fails with ECHILD ("No child processes").
        perror("[Parent] waitid WNOHANG");
    } else if (info.si_pid == 0) {
        // Returned 0 with si_pid still 0: children exist, but none of them
        // has terminated yet.
        printf("[Parent] WNOHANG returned 0: no child has changed state yet.\n");
    }

    printf("\n[Parent] All children have been waited for. Parent exiting.\n");

    printf("\n--- Summary ---\n");
    printf("1. waitid(idtype, id, infop, options) waits for child process state changes.\n");
    printf("2. idtype/id let you specify which child/children to wait for (PID, PGID, ALL).\n");
    printf("3. options specify what events to wait for (WEXITED, WSTOPPED, WCONTINUED).\n");
    printf("4. WNOHANG makes it non-blocking. WNOWAIT gets status without reaping.\n");
    printf("5. infop (siginfo_t*) provides detailed information about the event.\n");
    printf("6. It's more flexible and informative than wait/waitpid.\n");

    return 0;
}

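9. Compile and Run

# Assuming the code is saved as waitid_example.c
gcc -o waitid_example waitid_example.c

# Run the program
./waitid_example

10. Expected Output

PIDs differ from run to run, and because the parent and the children run concurrently, the exact interleaving of their lines may differ as well.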

--- Demonstrating waitid ---
Parent PID: 12345
[Child 1, PID 12346] Running for 3 seconds then exiting with status 42.
[Child 2, PID 12347] Running for 5 seconds then will be killed by SIGTERM.
[Child 3, PID 12348] Running, then will stop and continue.
[Child 3] Entering loop, press Ctrl+Z in another terminal to stop me (if I'm foreground).
[Child 3] Or, the parent will send SIGSTOP and SIGCONT.
[Parent] Created children: PID1=12346, PID2=12347, PID3=12348

[Parent] Sending SIGSTOP to Child 3 (PID 12348)...
[Parent] Waiting for Child 3 to stop using waitid(WSTOPPED)...
[Child 3] Working... 0
[Child 3] Working... 1
[Parent] Detected Child 3 stopped:
  Child PID: 12348
  Signal/Exit Code: 19
  Reason Code: CLD_STOPPED (Child was stopped by signal)
    Stop Signal Number: 19

[Parent] Waiting for Child 1 to exit using waitid(WEXITED)...
[Parent] Detected Child 1 exited:
  Child PID: 12346
  Signal/Exit Code: 42
  Reason Code: CLD_EXITED (Child called exit())
    Exit Status: 42

[Parent] Sending SIGTERM to Child 2 (PID 12347)...
[Parent] Waiting for Child 2 to be killed using waitid(WEXITED)...
[Parent] Detected Child 2 killed/exited:
  Child PID: 12347
  Signal/Exit Code: 15
  Reason Code: CLD_KILLED (Child was killed by signal)
    Signal Number: 15

[Parent] Sending SIGCONT to Child 3 (PID 12348)...
[Parent] Waiting for Child 3 to continue and then exit using waitid(WCONTINUED | WEXITED)...
[Parent] (WCONTINUED detection might be unreliable, waiting for exit instead)
[Child 3] Working... 2
[Child 3] Working... 3
[Child 3] Working... 4
[Child 3] Working... 5
[Child 3] Working... 6
[Child 3] Working... 7
[Child 3] Working... 8
[Child 3] Working... 9
[Child 3] Finished normally.
[Parent] Detected Child 3 exited:
  Child PID: 12348
  Signal/Exit Code: 100
  Reason Code: CLD_EXITED (Child called exit())
    Exit Status: 100

[Parent] Demonstrating WNOHANG...
[Parent] Calling waitid(P_ALL, 0, info, WEXITED | WNOHANG)...
[Parent] waitid WNOHANG: No child processes

[Parent] All children have been waited for. Parent exiting.

--- Summary ---
1. waitid(idtype, id, infop, options) waits for child process state changes.
2. idtype/id let you specify which child/children to wait for (PID, PGID, ALL).
3. options specify what events to wait for (WEXITED, WSTOPPED, WCONTINUED).
4. WNOHANG makes it non-blocking. WNOWAIT gets status without reaping.
5. infop (siginfo_t*) provides detailed information about the event.
6. It's more flexible and informative than wait/waitpid.
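
The run above only waits for Child 3's final exit after sending SIGCONT. The continue event itself can also be collected by passing WCONTINUED. The sketch below stops a child, waits for CLD_STOPPED, resumes it, and waits for CLD_CONTINUED. The helper name stop_then_continue and its output parameters are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: stop `pid`, then resume it, and record the si_code
 * reported for each of the two events. Returns 0 on success, -1 on error. */
int stop_then_continue(pid_t pid, int *stop_code, int *cont_code)
{
    siginfo_t info;

    if (kill(pid, SIGSTOP) == -1)
        return -1;
    memset(&info, 0, sizeof(info));
    if (waitid(P_PID, (id_t)pid, &info, WSTOPPED) == -1)
        return -1;
    *stop_code = info.si_code;      /* CLD_STOPPED is expected here */

    if (kill(pid, SIGCONT) == -1)
        return -1;
    memset(&info, 0, sizeof(info));
    if (waitid(P_PID, (id_t)pid, &info, WCONTINUED) == -1)
        return -1;
    *cont_code = info.si_code;      /* CLD_CONTINUED is expected here */

    return 0;
}
```

Neither wait reaps the child: stop and continue events are only notifications, so the child must still be collected with WEXITED once it terminates.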

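11. Summary

waitid is a powerful, information-rich system call for waiting on child state changes.

  • Key strengths:
    • Flexible: you can specify exactly which process or process group to wait for, and which kinds of events (exit, stop, continue).
    • Detailed: results come back in a siginfo_t structure, which is easier to read and use than the packed wstatus integer of wait/waitpid.
    • Complete: stop and continue events can be waited for as well (WSTOPPED, WCONTINUED).
  • Parameters: idtype/id define the scope, options defines the event types and behavior, infop receives the result.
  • Use cases:
    • You need precise control over which child to wait for.
    • You need to distinguish a normal exit, death by signal, and stop/continue.
    • You are writing a complex process manager or daemon.
  • Relationship to wait/waitpid:
    • wait(&status) is equivalent to waitpid(-1, &status, 0).
    • waitpid(pid, &status, options) covers much of the same ground: it can also report stopped and continued children, through WUNTRACED and WCONTINUED.
    • What waitid adds is WNOWAIT (inspect a child without reaping it), the option to leave out WEXITED and wait only for stop/continue events, and the siginfo_t result in place of a status word.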

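writev System Call and Examples

writev in Detail

1. Function Overview

writev is a Linux system call that writes data from several separate buffers to a file descriptor in a single call (the "gather" half of scatter-gather I/O). It extends write: because several non-contiguous memory regions can be written with one system call, system call overhead drops and I/O performance can improve.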

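2. Function Prototype

#include <sys/uio.h>
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

3. What It Does

writev writes the buffers described by iov to the file descriptor in array order, as if they had first been concatenated. The data from one call is written as a single block that is not interleaved with writes from other processes. Like write, it can still write fewer bytes than requested (for example on a pipe, a socket, or a full disk), so the return value must be checked. Replacing many small writes with one writev can significantly reduce the number of system calls.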

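4. Parameters

  • int fd: the target file descriptor.
  • const struct iovec *iov: an array of iovec structures, each describing one buffer through iov_base (start address) and iov_len (length in bytes).
  • int iovcnt: the number of elements in the iov array. It must not exceed IOV_MAX (1024 on Linux); otherwise the call fails with EINVAL.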
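
Two small helpers can make these parameters easier to work with. This is a sketch under assumptions: iov_total_len and iov_limit are hypothetical names, not part of any library. The first adds up the bytes an iovec array describes, which is the value a complete writev would return; the second queries the iovcnt limit at run time.

```c
#include <stddef.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: total number of bytes described by an iovec array. */
size_t iov_total_len(const struct iovec *iov, int iovcnt)
{
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++)
        total += iov[i].iov_len;
    return total;
}

/* Hypothetical helper: the system's limit on iovcnt, queried at run time.
 * Falls back to 16, the minimum POSIX allows (_XOPEN_IOV_MAX), if the
 * limit cannot be determined. */
long iov_limit(void)
{
    long n = sysconf(_SC_IOV_MAX);
    return n > 0 ? n : 16;
}
```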

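5. Return Value

  • Success: returns the number of bytes actually written, which may be less than the total length of all buffers.
  • Failure: returns -1 and sets errno.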
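
Because the returned count can be smaller than the total, code that must deliver every byte typically loops. The sketch below is one way to do that; the name writev_all is hypothetical. Note that it modifies the caller's iov array as it advances past data that has already been written.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: keep calling writev until every buffer is written.
 * Returns the total number of bytes written, or -1 on error.
 * The iov array is modified in place. */
ssize_t writev_all(int fd, struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    while (iovcnt > 0) {
        ssize_t n = writev(fd, iov, iovcnt);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            return -1;
        }
        total += n;

        /* Skip the buffers that were written completely. */
        size_t done = (size_t)n;
        while (iovcnt > 0 && done >= iov->iov_len) {
            done -= iov->iov_len;
            iov++;
            iovcnt--;
        }

        if (iovcnt > 0) {
            if (n == 0)
                break;              /* no progress: avoid spinning forever */
            /* Advance into the partially written buffer. */
            iov->iov_base = (char *)iov->iov_base + done;
            iov->iov_len -= done;
        }
    }
    return total;
}
```

On a blocking regular file the loop usually runs once. It matters mainly for pipes, sockets, and non-blocking descriptors, where short writes are routine.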

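6. Similar and Related Functions

  • readv: the corresponding read function (scatter input).
  • write: the basic write function.
  • sendmsg/recvmsg: scatter-gather I/O for network sockets.
  • preadv/pwritev: scatter-gather I/O at an explicit file offset.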
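
As a counterpart to writev, here is a minimal readv sketch that splits one read across two buffers, for instance a fixed-size header followed by a body. The function name read_header_and_body is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: read into two buffers with one system call.
 * readv fills hdr completely before it stores anything in body.
 * Returns the total number of bytes read, 0 at end of file, -1 on error. */
ssize_t read_header_and_body(int fd, void *hdr, size_t hdr_len,
                             void *body, size_t body_len)
{
    struct iovec iov[2] = {
        { hdr,  hdr_len  },
        { body, body_len },
    };
    return readv(fd, iov, 2);
}
```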

7. Example Code

Example 1: Basic writev usage

#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>

/**
 * 演示基础writev使用方法
 */
int demo_writev_basic() {
    const char *header = "HTTP/1.1 200 OK\r\n";
    const char *content_type = "Content-Type: text/html\r\n";
    const char *content_length = "Content-Length: 31\r\n";  // must equal strlen(body)
    const char *connection = "Connection: close\r\n";
    const char *blank_line = "\r\n";
    const char *body = "<html><body>Hello</body></html>";
    
    struct iovec iov[6];
    int fd;
    ssize_t bytes_written;
    
    printf("=== 基础writev使用示例 ===\n");
    
    // 准备iovec数组
    iov[0].iov_base = (void*)header;
    iov[0].iov_len = strlen(header);
    
    iov[1].iov_base = (void*)content_type;
    iov[1].iov_len = strlen(content_type);
    
    iov[2].iov_base = (void*)content_length;
    iov[2].iov_len = strlen(content_length);
    
    iov[3].iov_base = (void*)connection;
    iov[3].iov_len = strlen(connection);
    
    iov[4].iov_base = (void*)blank_line;
    iov[4].iov_len = strlen(blank_line);
    
    iov[5].iov_base = (void*)body;
    iov[5].iov_len = strlen(body);
    
    // 显示要写入的数据
    printf("准备写入的数据:\n");
    for (int i = 0; i < 6; i++) {
        printf("  缓冲区 %d: %.*s", i + 1, (int)iov[i].iov_len, (char*)iov[i].iov_base);
        // 如果不是以换行符结尾,添加换行符
        if (iov[i].iov_len > 0 && ((char*)iov[i].iov_base)[iov[i].iov_len - 1] != '\n') {
            printf("\n");
        }
    }
    
    printf("\n总数据长度: %zu 字节\n", 
           iov[0].iov_len + iov[1].iov_len + iov[2].iov_len + 
           iov[3].iov_len + iov[4].iov_len + iov[5].iov_len);
    
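    // Write to standard output (for demonstration)
    printf("\n1. 使用writev写入到标准输出:\n");
    // printf goes through the stdio buffer while writev bypasses it. Flush
    // first so the lines appear in order even when stdout is redirected.
    fflush(stdout);
    bytes_written = writev(STDOUT_FILENO, iov, 6);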
    if (bytes_written == -1) {
        perror("writev 失败");
        return -1;
    }
    printf("  成功写入 %zd 字节\n", bytes_written);
    
    // 写入到文件
    printf("\n2. 使用writev写入到文件:\n");
    fd = open("writev_output.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd == -1) {
        perror("创建文件失败");
        return -1;
    }
    
    bytes_written = writev(fd, iov, 6);
    if (bytes_written == -1) {
        perror("writev 写入文件失败");
        close(fd);
        return -1;
    }
    printf("  成功写入文件 %zd 字节\n", bytes_written);
    
    close(fd);
    
    // 验证写入结果
    printf("\n3. 验证写入结果:\n");
    fd = open("writev_output.txt", O_RDONLY);
    if (fd != -1) {
        char buffer[1024];
        ssize_t bytes_read = read(fd, buffer, sizeof(buffer) - 1);
        if (bytes_read > 0) {
            buffer[bytes_read] = '\0';
            printf("  文件内容:\n%s", buffer);
        }
        close(fd);
        unlink("writev_output.txt");  // 清理测试文件
    }
    
    return 0;
}

int main() {
    return demo_writev_basic();
}

Example 2: Performance comparison

#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <sys/time.h>

/**
 * 性能测试结构
 */
typedef struct {
    const char *name;
    double writev_time;
    double write_time;
    ssize_t total_bytes;
    int operation_count;
} performance_test_t;

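/**
 * Write all messages with a single writev call.
 * Returns the number of bytes written, or -1 on failure.
 * count must not exceed IOV_MAX (1024 on Linux).
 */
ssize_t writev_bulk_write(int fd, const char **messages, int count) {
    struct iovec *iov = malloc(count * sizeof(struct iovec));
    if (!iov) {
        return -1;
    }
    
    // Prepare the iovec array
    for (int i = 0; i < count; i++) {
        iov[i].iov_base = (void*)messages[i];
        iov[i].iov_len = strlen(messages[i]);
    }
    
    // Return writev's result directly, so that -1 reaches the caller's
    // error check instead of being turned into 0
    ssize_t result = writev(fd, iov, count);
    
    int saved_errno = errno;  // free() may modify errno
    free(iov);
    errno = saved_errno;
    return result;
}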

/**
 * 使用多次write进行批量写入
 */
ssize_t write_bulk_write(int fd, const char **messages, int count) {
    ssize_t total_written = 0;
    
    for (int i = 0; i < count; i++) {
        ssize_t result = write(fd, messages[i], strlen(messages[i]));
        if (result == -1) {
            return -1;
        }
        total_written += result;
    }
    
    return total_written;
}

/**
 * 获取当前时间(微秒)
 */
long long get_current_time_us() {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000000LL + tv.tv_usec;
}

/**
 * 演示writev性能对比
 */
int demo_writev_performance_comparison() {
    const int message_count = 1000;
    const char *test_messages[1000];
    int fd_writev, fd_write;
    performance_test_t test_results;
    
    printf("=== writev vs write 性能对比测试 ===\n");
    
    // 准备测试消息
    printf("1. 准备测试数据:\n");
    char **message_buffers = malloc(message_count * sizeof(char*));
    if (!message_buffers) {
        perror("分配消息缓冲区失败");
        return -1;
    }
    
    for (int i = 0; i < message_count; i++) {
        message_buffers[i] = malloc(64);
        if (message_buffers[i]) {
            snprintf(message_buffers[i], 64, "Test message %d: Hello World!\n", i + 1);
            test_messages[i] = message_buffers[i];
        } else {
            printf("分配消息 %d 失败\n", i);
            // 清理已分配的缓冲区
            for (int j = 0; j < i; j++) {
                free(message_buffers[j]);
            }
            free(message_buffers);
            return -1;
        }
    }
    
    printf("  准备了 %d 条测试消息\n", message_count);
    
    // 计算总数据量
    size_t total_data_size = 0;
    for (int i = 0; i < message_count; i++) {
        total_data_size += strlen(test_messages[i]);
    }
    printf("  总数据量: %zu 字节 (%.2f KB)\n", total_data_size, total_data_size / 1024.0);
    
    // 创建测试文件
    printf("\n2. 创建测试文件:\n");
    fd_writev = open("writev_test.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    fd_write = open("write_test.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    
    if (fd_writev == -1 || fd_write == -1) {
        perror("创建测试文件失败");
        if (fd_writev != -1) close(fd_writev);
        if (fd_write != -1) close(fd_write);
        // 清理消息缓冲区
        for (int i = 0; i < message_count; i++) {
            free(message_buffers[i]);
        }
        free(message_buffers);
        return -1;
    }
    
    printf("  writev测试文件: writev_test.txt\n");
    printf("  write测试文件: write_test.txt\n");
    
    // writev性能测试
    printf("\n3. writev性能测试:\n");
    long long start_time = get_current_time_us();
    
    ssize_t writev_bytes = writev_bulk_write(fd_writev, test_messages, message_count);
    if (writev_bytes == -1) {
        perror("writev测试失败");
        close(fd_writev);
        close(fd_write);
        unlink("writev_test.txt");
        unlink("write_test.txt");
        // 清理消息缓冲区
        for (int i = 0; i < message_count; i++) {
            free(message_buffers[i]);
        }
        free(message_buffers);
        return -1;
    }
    
    long long end_time = get_current_time_us();
    double writev_time = (end_time - start_time) / 1000.0;  // 转换为毫秒
    
    printf("  writev写入字节数: %zd\n", writev_bytes);
    printf("  writev耗时: %.3f 毫秒\n", writev_time);
    
    close(fd_writev);
    
    // write性能测试
    printf("\n4. write性能测试:\n");
    start_time = get_current_time_us();
    
    ssize_t write_bytes = write_bulk_write(fd_write, test_messages, message_count);
    if (write_bytes == -1) {
        perror("write测试失败");
        close(fd_write);
        unlink("writev_test.txt");
        unlink("write_test.txt");
        // 清理消息缓冲区
        for (int i = 0; i < message_count; i++) {
            free(message_buffers[i]);
        }
        free(message_buffers);
        return -1;
    }
    
    end_time = get_current_time_us();
    double write_time = (end_time - start_time) / 1000.0;  // 转换为毫秒
    
    printf("  write写入字节数: %zd\n", write_bytes);
    printf("  write耗时: %.3f 毫秒\n", write_time);
    
    close(fd_write);
    
    // 显示性能对比结果
    printf("\n5. 性能对比结果:\n");
    printf("  数据总量: %zu 字节\n", total_data_size);
    printf("  writev操作: %d 次系统调用\n", 1);
    printf("  write操作: %d 次系统调用\n", message_count);
    printf("  writev耗时: %.3f 毫秒\n", writev_time);
    printf("  write耗时: %.3f 毫秒\n", write_time);
    
    if (write_time > 0 && writev_time > 0) {
        double speedup = write_time / writev_time;
        double reduction = (write_time - writev_time) / write_time * 100;
        
        printf("  性能提升: %.2f 倍\n", speedup);
        printf("  时间减少: %.1f%%\n", reduction);
    }
    
    // 验证数据一致性
    printf("\n6. 数据一致性验证:\n");
    if (writev_bytes == write_bytes) {
        printf("  ✓ 写入字节数一致\n");
    } else {
        printf("  ✗ 写入字节数不一致 (writev: %zd, write: %zd)\n", writev_bytes, write_bytes);
    }
    
    // 清理测试文件
    unlink("writev_test.txt");
    unlink("write_test.txt");
    
    // 清理消息缓冲区
    for (int i = 0; i < message_count; i++) {
        free(message_buffers[i]);
    }
    free(message_buffers);
    
    // 显示性能分析
    printf("\n=== 性能分析 ===\n");
    printf("1. 系统调用开销:\n");
    printf("   writev减少了 %d 次系统调用\n", message_count - 1);
    printf("   每次系统调用节省约 %.3f 微秒\n", 
           ((write_time - writev_time) * 1000) / (message_count - 1));
    
    printf("\n2. 适用场景:\n");
    printf("   ✓ 大量小数据块写入\n");
    printf("   ✓ 网络协议数据组装\n");
    printf("   ✓ 日志文件批量写入\n");
    printf("   ✓ 数据库事务日志\n");
    
    printf("\n3. 性能优化建议:\n");
    printf("   ✓ 合理设置iovec数组大小\n");
    printf("   ✓ 避免过于频繁的writev调用\n");
    printf("   ✓ 考虑使用异步I/O\n");
    printf("   ✓ 监控系统调用性能\n");
    
    return 0;
}

int main() {
    return demo_writev_performance_comparison();
}
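
The benchmark passes 1000 buffers in one call, which stays under the Linux limit of 1024. For larger arrays the call would fail with EINVAL, so the array has to be written in batches. The following is a minimal sketch; the name writev_chunked is hypothetical, and it reports a short write by returning early rather than retrying.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write an iovec array of any length by issuing one
 * writev per batch of at most IOV_MAX entries. Returns the number of bytes
 * written, or -1 on error. Stops early if a batch is written only partially,
 * so the caller can compare the result with the expected total. */
ssize_t writev_chunked(int fd, const struct iovec *iov, int iovcnt)
{
    long limit = sysconf(_SC_IOV_MAX);
    if (limit <= 0)
        limit = 16;                 /* minimum value POSIX allows */

    ssize_t total = 0;
    while (iovcnt > 0) {
        int batch = iovcnt < limit ? iovcnt : (int)limit;

        size_t expected = 0;
        for (int i = 0; i < batch; i++)
            expected += iov[i].iov_len;

        ssize_t n = writev(fd, iov, batch);
        if (n == -1)
            return -1;
        total += n;
        if ((size_t)n < expected)
            break;                  /* short write: report bytes so far */

        iov += batch;
        iovcnt -= batch;
    }
    return total;
}
```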

Example 3: Assembling network protocol data

#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/**
 * HTTP响应结构
 */
typedef struct {
    int status_code;
    const char *status_text;
    const char *content_type;
    size_t content_length;
    const char *headers[16];
    int header_count;
} http_response_t;

/**
 * 创建HTTP响应头部
 */
int create_http_headers(http_response_t *response, struct iovec *iov, int max_iov) {
    int iov_index = 0;
    
    // 状态行
    char *status_line = malloc(64);
    if (!status_line) return -1;
    snprintf(status_line, 64, "HTTP/1.1 %d %s\r\n", 
             response->status_code, response->status_text);
    iov[iov_index].iov_base = status_line;
    iov[iov_index].iov_len = strlen(status_line);
    iov_index++;
    
    // Content-Type
    char *content_type_header = malloc(64);
    if (!content_type_header) {
        free(status_line);
        return -1;
    }
    snprintf(content_type_header, 64, "Content-Type: %s\r\n", response->content_type);
    iov[iov_index].iov_base = content_type_header;
    iov[iov_index].iov_len = strlen(content_type_header);
    iov_index++;
    
    // Content-Length
    char *content_length_header = malloc(64);
    if (!content_length_header) {
        free(status_line);
        free(content_type_header);
        return -1;
    }
    snprintf(content_length_header, 64, "Content-Length: %zu\r\n", response->content_length);
    iov[iov_index].iov_base = content_length_header;
    iov[iov_index].iov_len = strlen(content_length_header);
    iov_index++;
    
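    // Custom headers: each header line needs its own CRLF terminator,
    // otherwise consecutive headers run together on one line
    for (int i = 0; i < response->header_count && iov_index < max_iov - 2; i++) {
        size_t len = strlen(response->headers[i]);
        char *custom_header = malloc(len + 3);  // header + "\r\n" + '\0'
        if (custom_header) {
            memcpy(custom_header, response->headers[i], len);
            memcpy(custom_header + len, "\r\n", 3);
            iov[iov_index].iov_base = custom_header;
            iov[iov_index].iov_len = len + 2;
            iov_index++;
        }
    }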
    
    // 空行分隔符
    char *blank_line = malloc(3);
    if (!blank_line) {
        // 清理已分配的内存
        for (int i = 0; i < iov_index; i++) {
            free(iov[i].iov_base);
        }
        return -1;
    }
    strcpy(blank_line, "\r\n");
    iov[iov_index].iov_base = blank_line;
    iov[iov_index].iov_len = 2;
    iov_index++;
    
    return iov_index;
}

/**
 * 演示网络协议数据组装
 */
int demo_network_protocol_assembly() {
    http_response_t response = {0};
    struct iovec iov[32];
    int iov_count;
    int fd;
    
    printf("=== 网络协议数据组装演示 ===\n");
    
    // 初始化HTTP响应
    printf("1. 初始化HTTP响应:\n");
    response.status_code = 200;
    response.status_text = "OK";
    response.content_type = "application/json";
    response.content_length = 44;  // byte length of the JSON body defined below
    response.header_count = 2;
    response.headers[0] = "Server: MyWebServer/1.0";
    response.headers[1] = "Cache-Control: no-cache";
    
    printf("  状态码: %d %s\n", response.status_code, response.status_text);
    printf("  内容类型: %s\n", response.content_type);
    printf("  内容长度: %zu 字节\n", response.content_length);
    printf("  自定义头部: %d 个\n", response.header_count);
    for (int i = 0; i < response.header_count; i++) {
        printf("    %s\n", response.headers[i]);
    }
    
    // 创建响应内容
    const char *content = "{\"message\":\"Hello World\",\"status\":\"success\"}";
    
    // 创建HTTP头部
    printf("\n2. 创建HTTP头部:\n");
    iov_count = create_http_headers(&response, iov, 32);
    if (iov_count == -1) {
        printf("创建HTTP头部失败\n");
        return -1;
    }
    
    printf("  创建了 %d 个头部片段\n", iov_count);
    
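    // Append the response body. It is copied with strdup so that every iov
    // entry is heap-allocated: the cleanup loops below free all entries and
    // must never call free() on a string literal.
    printf("\n3. 添加响应内容:\n");
    if (iov_count < 32) {
        char *body_copy = strdup(content);
        if (!body_copy) {
            perror("strdup");
            for (int i = 0; i < iov_count; i++) {
                free(iov[i].iov_base);
            }
            return -1;
        }
        iov[iov_count].iov_base = body_copy;
        iov[iov_count].iov_len = strlen(body_copy);
        iov_count++;
        printf("  添加响应内容: %s\n", content);
    }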
    
    // 显示完整的HTTP响应
    printf("\n4. 完整HTTP响应:\n");
    for (int i = 0; i < iov_count; i++) {
        printf("%.*s", (int)iov[i].iov_len, (char*)iov[i].iov_base);
    }
    
    // 计算总长度
    size_t total_length = 0;
    for (int i = 0; i < iov_count; i++) {
        total_length += iov[i].iov_len;
    }
    printf("\n总响应长度: %zu 字节\n", total_length);
    
    // 写入到文件(模拟网络发送)
    printf("\n5. 写入到文件(模拟网络发送):\n");
    fd = open("http_response.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd == -1) {
        perror("创建响应文件失败");
        // 清理内存
        for (int i = 0; i < iov_count; i++) {
            free(iov[i].iov_base);
        }
        return -1;
    }
    
    ssize_t bytes_written = writev(fd, iov, iov_count);
    if (bytes_written == -1) {
        perror("writev 写入失败");
        close(fd);
        unlink("http_response.txt");
        // 清理内存
        for (int i = 0; i < iov_count; i++) {
            free(iov[i].iov_base);
        }
        return -1;
    }
    
    printf("  成功写入 %zd 字节到文件\n", bytes_written);
    close(fd);
    
    // 验证写入结果
    printf("\n6. 验证写入结果:\n");
    fd = open("http_response.txt", O_RDONLY);
    if (fd != -1) {
        char buffer[512];
        ssize_t bytes_read = read(fd, buffer, sizeof(buffer) - 1);
        if (bytes_read > 0) {
            buffer[bytes_read] = '\0';
            printf("  文件内容 (%zd 字节):\n", bytes_read);
            printf("%s", buffer);
        }
        close(fd);
        unlink("http_response.txt");
    }
    
    // 清理内存
    for (int i = 0; i < iov_count; i++) {
        free(iov[i].iov_base);
    }
    
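    // Summarize what gathering with writev provides
    printf("\n=== Benefits of gather writes ===\n");
    printf("1. No staging buffer:\n");
    printf("   ✓ Headers and body are not merged into one contiguous buffer first\n");
    printf("   ✓ Less temporary memory\n");
    printf("   ✓ Simpler assembly code\n");
    
    printf("\n2. Single system call:\n");
    printf("   ✓ The whole response is submitted in one call\n");
    printf("   ✓ Output is not interleaved with writes from other processes\n");
    printf("   ✓ Short writes are still possible, so check the return value\n");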
    
    printf("\n3. 灵活性:\n");
    printf("   ✓ 动态头部组装\n");
    printf("   ✓ 可变内容长度\n");
    printf("   ✓ 复杂协议支持\n");
    
    return 0;
}

int main() {
    return demo_network_protocol_assembly();
}

Example 4: Optimizing a logging system

#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <pthread.h>

/**
 * 日志条目结构
 */
typedef struct {
    time_t timestamp;
    const char *level;
    const char *module;
    const char *message;
    pid_t pid;
    pthread_t tid;
} log_entry_t;

/**
 * 日志缓冲区管理器
 */
typedef struct {
    struct iovec *iov;
    int capacity;
    int count;
    size_t total_size;
    int fd;
} log_buffer_t;

/**
 * 初始化日志缓冲区
 */
int init_log_buffer(log_buffer_t *buffer, int capacity, const char *filename) {
    buffer->iov = malloc(capacity * sizeof(struct iovec));
    if (!buffer->iov) {
        return -1;
    }
    
    buffer->capacity = capacity;
    buffer->count = 0;
    buffer->total_size = 0;
    
    // 打开日志文件
    buffer->fd = open(filename, O_CREAT | O_WRONLY | O_APPEND, 0644);
    if (buffer->fd == -1) {
        free(buffer->iov);
        return -1;
    }
    
    printf("日志缓冲区初始化完成:\n");
    printf("  缓冲区容量: %d 条目\n", capacity);
    printf("  日志文件: %s\n", filename);
    
    return 0;
}

/**
 * 格式化日志条目
 */
char* format_log_entry(const log_entry_t *entry) {
    char *buffer = malloc(512);
    if (!buffer) {
        return NULL;
    }
    
    // localtime_r is the thread-safe variant of localtime
    struct tm tm_info;
    char time_str[32];
    localtime_r(&entry->timestamp, &tm_info);
    strftime(time_str, sizeof(time_str), "%Y-%m-%d %H:%M:%S", &tm_info);
    
    snprintf(buffer, 512, "[%s] [%s] [%s] [PID:%d TID:%lu] %s\n",
             time_str, entry->level, entry->module, 
             entry->pid, (unsigned long)entry->tid, entry->message);
    
    return buffer;
}

/**
 * 添加日志条目到缓冲区
 */
int add_log_entry(log_buffer_t *buffer, const log_entry_t *entry) {
    if (buffer->count >= buffer->capacity) {
        printf("日志缓冲区已满\n");
        return -1;
    }
    
    char *formatted_entry = format_log_entry(entry);
    if (!formatted_entry) {
        return -1;
    }
    
    buffer->iov[buffer->count].iov_base = formatted_entry;
    buffer->iov[buffer->count].iov_len = strlen(formatted_entry);
    
    buffer->total_size += buffer->iov[buffer->count].iov_len;
    buffer->count++;
    
    return 0;
}

/**
 * 冲刷日志缓冲区
 */
int flush_log_buffer(log_buffer_t *buffer) {
    if (buffer->count == 0) {
        return 0;
    }
    
    printf("冲刷日志缓冲区: %d 条目, %zu 字节\n", buffer->count, buffer->total_size);
    
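    int status = 0;
    ssize_t bytes_written = writev(buffer->fd, buffer->iov, buffer->count);
    if (bytes_written == -1) {
        perror("写入日志失败");
        status = -1;
    } else {
        printf("  成功写入 %zd 字节\n", bytes_written);
        if ((size_t)bytes_written < buffer->total_size) {
            // writev may write less than requested; a production logger
            // would retry the remainder instead of only reporting it
            fprintf(stderr, "  short write: %zd of %zu bytes\n",
                    bytes_written, buffer->total_size);
        }
    }
    
    // Free the formatted entries whether or not the write succeeded, so a
    // failed flush neither leaks them nor leaves the buffer permanently full
    for (int i = 0; i < buffer->count; i++) {
        free(buffer->iov[i].iov_base);
    }
    
    buffer->count = 0;
    buffer->total_size = 0;
    
    return status;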
}

/**
 * 演示日志系统优化
 */
int demo_log_system_optimization() {
    log_buffer_t log_buffer;
    const int buffer_capacity = 100;
    const char *log_filename = "optimized_log.txt";
    
    printf("=== 日志系统优化演示 ===\n");
    
    // 初始化日志缓冲区
    printf("1. 初始化日志缓冲区:\n");
    if (init_log_buffer(&log_buffer, buffer_capacity, log_filename) != 0) {
        printf("初始化日志缓冲区失败\n");
        return -1;
    }
    
    // 生成测试日志条目
    printf("\n2. 生成测试日志条目:\n");
    const char *modules[] = {"Database", "Network", "Cache", "Security", "Monitor"};
    const char *levels[] = {"DEBUG", "INFO", "WARN", "ERROR"};
    const char *messages[] = {
        "Operation completed successfully",
        "Data synchronized with remote server",
        "Cache hit ratio improved to 95%",
        "Security scan completed without issues",
        "Performance metrics updated"
    };
    
    srand(time(NULL));
    
    // 模拟日志生成
    printf("  生成日志条目:\n");
    for (int i = 0; i < 15; i++) {
        log_entry_t entry;
        entry.timestamp = time(NULL);
        entry.level = levels[rand() % 4];
        entry.module = modules[rand() % 5];
        entry.message = messages[rand() % 5];
        entry.pid = getpid();
        entry.tid = pthread_self();
        
        if (add_log_entry(&log_buffer, &entry) == 0) {
            printf("    [%s] [%s] %s\n", entry.level, entry.module, entry.message);
        } else {
            printf("    添加日志条目失败\n");
            break;
        }
        
        // 模拟批量冲刷
        if ((i + 1) % 5 == 0) {
            printf("  批量冲刷日志 (%d 条目)\n", log_buffer.count);
            flush_log_buffer(&log_buffer);
        }
    }
    
    // 最终冲刷
    printf("\n3. 最终冲刷剩余日志:\n");
    flush_log_buffer(&log_buffer);
    
    // 验证日志文件
    printf("\n4. 验证日志文件:\n");
    int fd = open(log_filename, O_RDONLY);
    if (fd != -1) {
        char buffer[2048];
        ssize_t bytes_read = read(fd, buffer, sizeof(buffer) - 1);
        if (bytes_read > 0) {
            buffer[bytes_read] = '\0';
            printf("  日志文件内容 (%zd 字节):\n", bytes_read);
            
            // 只显示前几行
            char *line = strtok(buffer, "\n");
            int line_count = 0;
            while (line && line_count < 10) {
                printf("    %s\n", line);
                line = strtok(NULL, "\n");
                line_count++;
            }
            if (line) {
                printf("    ... (还有更多日志)\n");
            }
        }
        close(fd);
        unlink(log_filename);
    }
    
    // 清理资源
    free(log_buffer.iov);
    close(log_buffer.fd);
    
    // 显示优化效果
    printf("\n=== 日志系统优化效果 ===\n");
    printf("1. 性能提升:\n");
    printf("   ✓ 减少系统调用次数\n");
    printf("   ✓ 提高批量写入效率\n");
    printf("   ✓ 降低I/O开销\n");
    
    printf("\n2. 资源优化:\n");
    printf("   ✓ 减少内存分配次数\n");
    printf("   ✓ 优化缓冲区使用\n");
    printf("   ✓ 提高磁盘I/O效率\n");
    
    printf("\n3. 可靠性提升:\n");
    printf("   ✓ 原子性日志写入\n");
    printf("   ✓ 错误处理机制\n");
    printf("   ✓ 数据完整性保证\n");
    
    return 0;
}

int main() {
    return demo_log_system_optimization();
}

Example 5: Database Transaction Log

#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <sys/stat.h>

/**
 * 事务操作类型
 */
typedef enum {
    TXN_BEGIN = 1,
    TXN_COMMIT = 2,
    TXN_ROLLBACK = 3,
    TXN_INSERT = 4,
    TXN_UPDATE = 5,
    TXN_DELETE = 6
} txn_operation_t;

/**
 * 事务日志条目
 */
typedef struct {
    txn_operation_t operation;
    time_t timestamp;
    unsigned long transaction_id;
    const char *table_name;
    const char *data;
    size_t data_size;
} txn_log_entry_t;

/**
 * 事务日志管理器
 */
typedef struct {
    int log_fd;
    char log_filename[256];
    struct iovec *iov_buffer;
    int buffer_capacity;
    int buffer_count;
    size_t buffer_size;
    unsigned long current_txn_id;
} txn_log_manager_t;

/**
 * 初始化事务日志管理器
 */
int init_txn_log_manager(txn_log_manager_t *manager, const char *log_dir) {
    // 创建日志目录
    struct stat st = {0};
    if (stat(log_dir, &st) == -1) {
        if (mkdir(log_dir, 0755) == -1) {
            perror("创建日志目录失败");
            return -1;
        }
    }
    
    // 初始化管理器
    snprintf(manager->log_filename, sizeof(manager->log_filename), 
             "%s/transaction.log", log_dir);
    
    manager->log_fd = open(manager->log_filename, 
                          O_CREAT | O_WRONLY | O_APPEND, 0644);
    if (manager->log_fd == -1) {
        perror("创建事务日志文件失败");
        return -1;
    }
    
    manager->buffer_capacity = 64;
    manager->iov_buffer = malloc(manager->buffer_capacity * sizeof(struct iovec));
    if (!manager->iov_buffer) {
        close(manager->log_fd);
        return -1;
    }
    
    manager->buffer_count = 0;
    manager->buffer_size = 0;
    manager->current_txn_id = time(NULL);  // 简单的事务ID生成
    
    printf("事务日志管理器初始化完成:\n");
    printf("  日志文件: %s\n", manager->log_filename);
    printf("  缓冲区容量: %d\n", manager->buffer_capacity);
    printf("  初始事务ID: %lu\n", manager->current_txn_id);
    
    return 0;
}

/**
 * 格式化事务日志条目
 */
char* format_txn_log_entry(const txn_log_entry_t *entry) {
    char *buffer = malloc(512);
    if (!buffer) {
        return NULL;
    }
    
    struct tm *tm_info = localtime(&entry->timestamp);
    char time_str[32];
    strftime(time_str, sizeof(time_str), "%Y-%m-%d %H:%M:%S", tm_info);
    
    const char *op_names[] = {
        "", "BEGIN", "COMMIT", "ROLLBACK", "INSERT", "UPDATE", "DELETE"
    };
    
    snprintf(buffer, 512, "[%s] TXN:%lu OP:%s TABLE:%s SIZE:%zu DATA:%s\n",
             time_str, entry->transaction_id, 
             op_names[entry->operation],
             entry->table_name ? entry->table_name : "N/A",
             entry->data_size,
             entry->data ? entry->data : "");
    
    return buffer;
}

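/* Forward declaration: add_txn_log_entry() calls this function before it is defined. */
int flush_txn_log_buffer(txn_log_manager_t *manager);

/**
 * Add a transaction log entry
 */
int add_txn_log_entry(txn_log_manager_t *manager, const txn_log_entry_t *entry) {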
    if (manager->buffer_count >= manager->buffer_capacity) {
        printf("事务日志缓冲区已满,需要冲刷\n");
        if (flush_txn_log_buffer(manager) != 0) {
            return -1;
        }
    }
    
    char *formatted_entry = format_txn_log_entry(entry);
    if (!formatted_entry) {
        return -1;
    }
    
    manager->iov_buffer[manager->buffer_count].iov_base = formatted_entry;
    manager->iov_buffer[manager->buffer_count].iov_len = strlen(formatted_entry);
    
    manager->buffer_size += manager->iov_buffer[manager->buffer_count].iov_len;
    manager->buffer_count++;
    
    printf("添加事务日志条目: TXN:%lu OP:%d\n", 
           entry->transaction_id, entry->operation);
    
    return 0;
}

/**
 * 冲刷事务日志缓冲区
 */
int flush_txn_log_buffer(txn_log_manager_t *manager) {
    if (manager->buffer_count == 0) {
        return 0;
    }
    
    printf("冲刷事务日志缓冲区: %d 条目, %zu 字节\n", 
           manager->buffer_count, manager->buffer_size);
    
    ssize_t bytes_written = writev(manager->log_fd, manager->iov_buffer, manager->buffer_count);
    if (bytes_written == -1) {
        perror("写入事务日志失败");
        return -1;
    }
    
    printf("  成功写入 %zd 字节事务日志\n", bytes_written);
    
    // 清理内存
    for (int i = 0; i < manager->buffer_count; i++) {
        free(manager->iov_buffer[i].iov_base);
    }
    
    manager->buffer_count = 0;
    manager->buffer_size = 0;
    
    // 同步到磁盘
    fsync(manager->log_fd);
    
    return 0;
}

/**
 * 演示数据库事务日志
 */
int demo_database_transaction_log() {
    txn_log_manager_t log_manager;
    const char *log_directory = "./txn_logs";
    
    printf("=== 数据库事务日志演示 ===\n");
    
    // 初始化事务日志管理器
    printf("1. 初始化事务日志管理器:\n");
    if (init_txn_log_manager(&log_manager, log_directory) != 0) {
        printf("初始化事务日志管理器失败\n");
        return -1;
    }
    
    // 模拟数据库事务操作
    printf("\n2. 模拟数据库事务操作:\n");
    
    // 事务1: 插入操作
    printf("  事务1: 数据插入操作\n");
    txn_log_entry_t insert_entry = {
        .operation = TXN_BEGIN,
        .timestamp = time(NULL),
        .transaction_id = log_manager.current_txn_id++,
        .table_name = "users",
        .data = "{'name':'John','email':'john@example.com'}",
        .data_size = 45
    };
    add_txn_log_entry(&log_manager, &insert_entry);
    
    txn_log_entry_t insert_data = {
        .operation = TXN_INSERT,
        .timestamp = time(NULL),
        .transaction_id = insert_entry.transaction_id,
        .table_name = "users",
        .data = "INSERT INTO users VALUES ('John', 'john@example.com')",
        .data_size = 52
    };
    add_txn_log_entry(&log_manager, &insert_data);
    
    txn_log_entry_t commit_entry = {
        .operation = TXN_COMMIT,
        .timestamp = time(NULL),
        .transaction_id = insert_entry.transaction_id,
        .table_name = "users",
        .data = "Transaction committed successfully",
        .data_size = 32
    };
    add_txn_log_entry(&log_manager, &commit_entry);
    
    // 冲刷第一个事务
    flush_txn_log_buffer(&log_manager);
    
    // 事务2: 更新操作
    printf("\n  事务2: 数据更新操作\n");
    txn_log_entry_t update_entry = {
        .operation = TXN_BEGIN,
        .timestamp = time(NULL),
        .transaction_id = log_manager.current_txn_id++,
        .table_name = "orders",
        .data = "{'order_id':12345,'status':'processing'}",
        .data_size = 42
    };
    add_txn_log_entry(&log_manager, &update_entry);
    
    txn_log_entry_t update_data = {
        .operation = TXN_UPDATE,
        .timestamp = time(NULL),
        .transaction_id = update_entry.transaction_id,
        .table_name = "orders",
        .data = "UPDATE orders SET status='shipped' WHERE order_id=12345",
        .data_size = 55
    };
    add_txn_log_entry(&log_manager, &update_data);
    
    // Delete operation (still part of transaction 2)
    txn_log_entry_t delete_data = {
        .operation = TXN_DELETE,
        .timestamp = time(NULL),
        .transaction_id = update_entry.transaction_id,
        .table_name = "cart_items",
        .data = "DELETE FROM cart_items WHERE user_id=1001",
        .data_size = 43
    };
    add_txn_log_entry(&log_manager, &delete_data);
    
    txn_log_entry_t commit_update = {
        .operation = TXN_COMMIT,
        .timestamp = time(NULL),
        .transaction_id = update_entry.transaction_id,
        .table_name = "orders",
        .data = "Multi-table transaction committed",
        .data_size = 35
    };
    add_txn_log_entry(&log_manager, &commit_update);
    
    // Transaction 3: rollback
    printf("\n  事务3: 事务回滚操作\n");
    txn_log_entry_t rollback_entry = {
        .operation = TXN_BEGIN,
        .timestamp = time(NULL),
        .transaction_id = log_manager.current_txn_id++,
        .table_name = "inventory",
        .data = "Stock adjustment transaction",
        .data_size = 28
    };
    add_txn_log_entry(&log_manager, &rollback_entry);
    
    txn_log_entry_t error_entry = {
        .operation = TXN_UPDATE,
        .timestamp = time(NULL),
        .transaction_id = rollback_entry.transaction_id,
        .table_name = "inventory",
        .data = "UPDATE inventory SET quantity=-5 WHERE product_id=100",  // 错误:负库存
        .data_size = 57
    };
    add_txn_log_entry(&log_manager, &error_entry);
    
    txn_log_entry_t rollback_txn = {
        .operation = TXN_ROLLBACK,
        .timestamp = time(NULL),
        .transaction_id = rollback_entry.transaction_id,
        .table_name = "inventory",
        .data = "Rollback due to negative stock adjustment",
        .data_size = 45
    };
    add_txn_log_entry(&log_manager, &rollback_txn);
    
    // 最终冲刷
    printf("\n3. 最终冲刷剩余日志:\n");
    flush_txn_log_buffer(&log_manager);
    
    // 验证日志文件
    printf("\n4. 验证事务日志文件:\n");
    int fd = open(log_manager.log_filename, O_RDONLY);
    if (fd != -1) {
        char buffer[2048];
        ssize_t bytes_read = read(fd, buffer, sizeof(buffer) - 1);
        if (bytes_read > 0) {
            buffer[bytes_read] = '\0';
            printf("  事务日志内容 (%zd 字节):\n", bytes_read);
            
            // 显示日志内容
            char *line = strtok(buffer, "\n");
            while (line) {
                printf("    %s\n", line);
                line = strtok(NULL, "\n");
            }
        }
        close(fd);
    }
    
    // 清理资源
    if (log_manager.iov_buffer) {
        // 清理可能残留的缓冲区条目
        for (int i = 0; i < log_manager.buffer_count; i++) {
            free(log_manager.iov_buffer[i].iov_base);
        }
        free(log_manager.iov_buffer);
    }
    if (log_manager.log_fd != -1) {
        close(log_manager.log_fd);
    }
    
    // 清理日志文件
    unlink(log_manager.log_filename);
    rmdir(log_directory);
    
    // 显示事务日志优势
    printf("\n=== 事务日志优势 ===\n");
    printf("1. ACID特性保证:\n");
    printf("   ✓ 原子性: 事务操作要么全部成功要么全部失败\n");
    printf("   ✓ 一致性: 系统从一个一致状态转换到另一个一致状态\n");
    printf("   ✓ 隔离性: 并发事务之间相互隔离\n");
    printf("   ✓ 持久性: 事务提交后对数据的修改是永久性的\n");
    
    printf("\n2. 性能优化:\n");
    printf("   ✓ 批量日志写入减少I/O操作\n");
    printf("   ✓ 零拷贝数据组装提高效率\n");
    printf("   ✓ 缓冲机制减少系统调用开销\n");
    printf("   ✓ 异步写入提高响应速度\n");
    
    printf("\n3. 可靠性保障:\n");
    printf("   ✓ 完整的事务轨迹记录\n");
    printf("   ✓ 快速故障恢复能力\n");
    printf("   ✓ 数据一致性验证\n");
    printf("   ✓ 审计和合规支持\n");
    
    return 0;
}

int main() {
    return demo_database_transaction_log();
}

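writev Usage Notes

System limits:

  1. IOV_MAX: the system caps the number of elements in the iovec array (typically 1024).
  2. Per-call limit: the total number of bytes a single writev call can transfer is bounded.
  3. File descriptor type: not every kind of file descriptor supports writev.

Error handling:

  1. Partial writes: writev may write only part of the data.
  2. Error recovery: the caller has to handle calls that only partly succeed.
  3. Resource cleanup: release allocated resources when a call fails.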

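Performance considerations:

  1. Buffer count: choose a sensible size for the iovec array.
  2. Memory alignment: consider aligning buffers.
  3. Batching: group writes to reduce the number of system calls.

Security considerations:

  1. Buffer validation: make sure every iovec entry points to valid memory.
  2. Permission checks: make sure the process is allowed to write to the file.
  3. Data integrity: verify that all data was actually written.

Best practices:

  1. Use it where it fits: reach for writev when the data already lives in several buffers.
  2. Error handling: handle every error case, including partial writes.
  3. Resource management: release allocated resources promptly.
  4. Performance monitoring: measure how writev behaves in your workload.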

writev Compared with Similar Functions

writev vs write:

// write: a single buffer
char buffer[1024];
write(fd, buffer, sizeof(buffer));

// writev: several buffers in one call
struct iovec iov[3];
iov[0].iov_base = buffer1; iov[0].iov_len = len1;
iov[1].iov_base = buffer2; iov[1].iov_len = len2;
iov[2].iov_base = buffer3; iov[2].iov_len = len3;
writev(fd, iov, 3);

writev vs sendmsg:

// writev: plain gather write
writev(sockfd, iov, iovcnt);

// sendmsg: gather write plus control (ancillary) data
struct msghdr msg = {0};
msg.msg_iov = iov;
msg.msg_iovlen = iovcnt;
msg.msg_control = control_data;
msg.msg_controllen = control_len;  // must be set together with msg_control
sendmsg(sockfd, &msg, flags);

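Common Use Cases

1. Assembling network protocol messages:

// Assemble an HTTP response
struct iovec iov[5];
iov[0].iov_base = status_line; iov[0].iov_len = strlen(status_line);
iov[1].iov_base = headers; iov[1].iov_len = strlen(headers);
iov[2].iov_base = blank_line; iov[2].iov_len = 2;
iov[3].iov_base = content; iov[3].iov_len = content_len;
iov[4].iov_base = trailers; iov[4].iov_len = strlen(trailers);
writev(client_fd, iov, 5);

2. Log system optimization:

// Batched log write
struct iovec log_entries[100];
// ... fill in the log entries
writev(log_fd, log_entries, entry_count);

3. Database transaction log:

// Transaction operation log
struct iovec txn_records[10];
// ... fill in the transaction records
writev(txn_log_fd, txn_records, record_count);

Summary

writev is an important gather-write function on Linux. It offers:

  1. Efficiency: fewer system calls and better I/O performance
  2. Flexibility: several non-contiguous buffers are written in a single call
  3. Standards compliance: it is specified by POSIX and widely supported
  4. Broad applicability: network programming, log systems, databases, and similar workloads

Used sensibly, writev helps build high-performance data processing and transfer systems. In practice, pay attention to system limits, error handling (partial writes in particular), and performance tuning.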


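The write System Call, with Examples

This article covers write, the function that works hand in hand with read and open. If read "takes" data from a file descriptor, write "puts" data into one.


1. Introduction

write is a Linux system call that copies data from a program's buffer to the file, pipe, socket, or other output stream identified by the file descriptor fd. It is one of the basic ways a program sends data to the outside world (a file, the terminal, the network).


2. Prototype

#include <unistd.h>

ssize_t write(int fd, const void *buf, size_t count);

3. Purpose

Attempts to write count bytes from the buffer pointed to by buf to the file or resource associated with the file descriptor fd.


4. Parameters

  • int fd: the target file descriptor. It identifies where the data goes, for example a file opened with open, standard output (STDOUT_FILENO), standard error (STDERR_FILENO), or a network socket.
  • const void *buf: a pointer to the buffer holding the data to write. The data is only read from this buffer, so the pointer is declared const to show that the function does not modify the memory.
  • size_t count: the number of bytes to write. The function tries to write count bytes starting at buf.

5. Return Value

write returns the number of bytes actually written.

  • On success:
    • Returns the number of bytes written (0 <= return value <= count).
    • In blocking mode the return value usually equals count. In some cases (for example, writing to a pipe or network socket whose buffer is full) it can be smaller than count, and the program normally has to call write in a loop to send the rest.
  • On error:
    • Returns -1 and sets errno to indicate the error (for example EAGAIN, EBADF, EFAULT, ENOSPC).

Important: a return value other than -1 does not by itself mean that the whole request succeeded. Always check that the return value equals the requested count, and handle the case where it is smaller (for example by looping).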


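6. Similar and Related Functions

  • pwrite: like write, but takes the file offset as an extra argument alongside the descriptor, buffer, and byte count. It does not change the file's current offset.
  • writev: writes data from several non-contiguous buffers (an array of iovec) to a file descriptor, which is useful when several data blocks have to be sent together.
  • read: the counterpart of write; reads data from a file descriptor.
  • open: usually called before write to obtain the descriptor of the file to write to.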

7. Example Code

Example 1: Writing data to a file

This example shows how to create (or truncate) a file and write some text to it.

#include <unistd.h>  // write, close
#include <fcntl.h>   // open, O_WRONLY, O_CREAT, O_TRUNC
#include <stdio.h>   // perror, printf
#include <stdlib.h>  // exit
#include <string.h>  // strlen
#include <errno.h>   // errno

int main() {
    int fd;                    // 文件描述符
    const char *message = "Hello, Linux System Programming!\n";
    size_t message_len = strlen(message); // 获取要写入的字节数
    ssize_t bytes_written;    // 实际写入的字节数

    // 1. 打开(或创建)一个文件用于写入
    // O_WRONLY: 只写模式
    // O_CREAT: 如果文件不存在则创建
    // O_TRUNC: 如果文件存在,则截断(清空)它
    // 0644 是新创建文件的权限 (所有者读写,组和其他用户只读)
    fd = open("output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("Error opening/creating file");
        exit(EXIT_FAILURE);
    }

    printf("File 'output.txt' opened/created successfully with fd: %d\n", fd);

    // 2. 调用 write 将数据写入文件
    bytes_written = write(fd, message, message_len);

    // 3. 检查 write 的返回值
    if (bytes_written == -1) {
        perror("Error writing to file");
        close(fd); // 出错也要记得关闭文件
        exit(EXIT_FAILURE);
    } else if ((size_t)bytes_written != message_len) {
        // 检查是否所有数据都被写入 (在简单场景下通常如此,但好习惯是检查)
        fprintf(stderr, "Warning: Only %zd of %zu bytes written to file.\n", bytes_written, message_len);
        // 在实际应用中,可能需要循环 write 来处理这种情况
    } else {
         printf("Successfully wrote %zd bytes to 'output.txt'.\n", bytes_written);
    }

    // 4. 关闭文件描述符
    if (close(fd) == -1) {
        perror("Error closing file");
        exit(EXIT_FAILURE);
    }

    printf("File closed. Check 'output.txt' for the content.\n");
    return 0;
}

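Code walkthrough:

  1. Define the string to write, message, and its length, message_len.
  2. Use open() to open or create output.txt in write-only mode (O_WRONLY). The file is created if it does not exist (O_CREAT) and emptied if it does (O_TRUNC). The permissions are set to 0644.
  3. Call write(fd, message, message_len) to try to write the whole message to the file.
  4. Check the return value of write: -1 means an error; a value different from message_len means the write was incomplete (unlikely in this simple case, but the check shows why it is needed); otherwise the write succeeded.
  5. Finally, close the file descriptor with close().

Example 2: Writing an error message to standard error

This example shows how to use write to send a message to standard error (STDERR_FILENO), which is normally used for errors and diagnostics so that they stay separate from standard output.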

#include <unistd.h>  // write
#include <string.h>  // strlen

int main() {
    const char *error_msg = "An error occurred in the program.\n";

    // 直接向标准错误 (文件描述符 2) 写入错误信息
    // 注意:这里也没有处理 write 可能部分写入的情况,
    // 对于短消息写入 stderr 通常可以假设一次性成功,
    // 但在严格要求下仍需检查返回值。
    write(STDERR_FILENO, error_msg, strlen(error_msg));

    return 0; // 程序正常退出,但 stderr 上有错误信息
}

Code walkthrough:

  1. Define the error message string.
  2. Call write(STDERR_FILENO, error_msg, strlen(error_msg)) directly to send the message to standard error. STDERR_FILENO is a predefined constant with the value 2.

Uchardet Code Analysis: nsUTF8Prober Confidence Calculation

Code Analysis: nsUTF8Prober Confidence Calculation

1. Core Logic

The detector's core principle is: verify UTF-8 encoding rules to determine whether the text is UTF-8. It uses a state machine (mCodingSM) to track whether the byte sequence complies with the UTF-8 specification.

  • Reset(): Initializes the detector state; resets the state machine, the multi-byte character counter (mNumOfMBChar), and the detection state (mState).
  • HandleData(): Primary function for processing input byte streams:
    • Processes bytes sequentially through the state machine (mCodingSM->NextState(aBuf[i]))
    • An eItsMe return means the state machine has positively identified the encoding → the detector state becomes eFoundIt (effectively "confirmed UTF-8"). Rule violations are signalled by the state machine's separate eError state, not by eItsMe.
    • An eStart return indicates successful recognition of a complete UTF-8 character:
      • For multi-byte characters (mCodingSM->GetCurrentCharLen() >= 2), increments mNumOfMBChar
      • Includes logic to build Unicode code points (currentCodePoint) stored in codePointBuffer
    • Key optimization: at the end of HandleData: if (mState == eDetecting) if (mNumOfMBChar > ENOUGH_CHAR_THRESHOLD && GetConfidence(0) > SHORTCUT_THRESHOLD) mState = eFoundIt; This allows early termination once enough valid multi-byte characters have been found (mNumOfMBChar > 256) with high confidence.

2. Confidence Calculation (GetConfidence)

Core calculation logic:

#define ONE_CHAR_PROB   (float)0.50

float nsUTF8Prober::GetConfidence(int candidate)
{
  if (mNumOfMBChar < 6)  // Fewer than 6 multi-byte characters
  {
    float unlike = 0.5f; // Initial 50% probability of not being UTF-8
    
    // Each valid multi-byte character has 50% probability of being coincidental
    // After N characters: unlike = 0.5 * (0.5)^N, because it starts at 0.5
    for (PRUint32 i = 0; i < mNumOfMBChar; i++)
      unlike *= ONE_CHAR_PROB; // Multiply by 0.5 per character
    
    // Confidence = 1 - probability of coincidence
    return (float)1.0 - unlike;
  }
  else  // 6+ multi-byte characters
  {
    return (float)0.99; // High-confidence threshold
  }
}

3. Confidence Calculation Methodology

The algorithm uses a statistical significance heuristic:

  1. Low-Confidence Mode (<6 MB characters):
    • Models the probability that N valid UTF-8 sequences appear coincidentally in non-UTF-8 text as (0.5)^N
    • ONE_CHAR_PROB=0.5 is an empirical estimate of random byte sequences accidentally matching UTF-8 rules
    • In the code above, unlike starts at 0.5 before any character is counted, so Confidence = 1 - 0.5 × (0.5)^N = 1 - (0.5)^(N+1)
    • Examples:
      • 0 MB chars: 50% confidence
      • 1 MB char: 75% confidence
      • 3 MB chars: 93.75% confidence
      • 5 MB chars: 98.4375% confidence
  2. High-Confidence Mode (≥6 MB characters):
    • Returns fixed 99% confidence
    • Optimization based on empirical observation that 6 valid sequences provide near-certain detection
    • Minimizes false positives while maintaining efficiency

4. Key Characteristics

Aspect | Description
Detection Basis | Multi-byte character count (mNumOfMBChar)
Calculation Approach | Statistical model of coincidental matches
Probability Constant | Empirical value (0.5)
Threshold | 6 multi-byte characters
Strengths | Simple computation, fast rejection of invalid sequences
Detection Philosophy | Focuses on disproving non-UTF8 through rule validation

5. Practical Implications

  • ​Short text sensitivity​​: Confidence builds slowly with character count
  • ​Language dependence​​: More effective for languages requiring frequent multi-byte characters
  • ​Error resilience​​: Single invalid sequence resets confidence building
  • ​Performance tradeoff​​: Threshold value balances accuracy vs processing time

This confidence model exemplifies Uchardet’s practical approach – using statistically-informed heuristics to achieve efficient encoding detection without complex probabilistic modeling. The 0.5 probability constant and 6-character threshold represent carefully balanced empirical values refined through real-world testing.


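LLC Protocol Format

The IEEE 802.2 standard defines the Logical Link Control (LLC) protocol. LLC is part of the data link layer and is mainly used to manage and control data transfer on local area networks (LANs). The LLC sublayer sits above the MAC (Medium Access Control) sublayer, presents a uniform interface to the network layer, and hides the differences between the underlying physical networks. Its design goal is to offer a common data transfer service to different network protocols (such as IP and IPX).

The core functions of LLC are frame format definition, flow control, error control, and protocol multiplexing. The LLC header normally consists of three main fields:

1. **DSAP (Destination Service Access Point)**: one byte identifying the upper-layer protocol on the receiving side.
2. **SSAP (Source Service Access Point)**: one byte identifying the upper-layer protocol on the sending side.
3. **Control field**: one or two bytes defining the frame type (information, supervisory, or unnumbered) and the related control information such as sequence and acknowledgement numbers.

LLC supports three main frame types:
- **Information frames (I-frames)**: carry user data together with sequence and acknowledgement numbers, supporting reliable connection-oriented transfer.
- **Supervisory frames (S-frames)**: used for flow control and error control; they carry no user data.
- **Unnumbered frames (U-frames)**: used to set up and release connections or to perform other control functions such as test and diagnostics.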

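Through these mechanisms LLC abstracts the data link layer, so upper-layer protocols can communicate independently of the specific physical network technology. In IEEE 802.3 Ethernet, for example, the LLC sublayer works together with the MAC sublayer: an IP datagram is encapsulated behind an 802.2 LLC header, which is how the upper-layer protocol is identified.

LLC also supports both connectionless and connection-oriented operation. In connectionless mode frames are sent directly without setting up a connection. In connection-oriented mode LLC establishes a connection before the transfer and releases it afterwards to provide reliable delivery.

### LLC vs Ethernet II
LLC is normally used together with IEEE 802.3, whereas Ethernet II uses a simpler frame format. An Ethernet II frame has no LLC header and instead uses the type field (EtherType) to identify the upper-layer protocol (IPv4, ARP, and so on). An IEEE 802.3 frame relies on the LLC header to identify the upper-layer protocol, which is why Ethernet II is more common in practice.

The following shows a typical LLC encapsulation (based on the IEEE 802.3 frame structure):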

```plaintext
+----------+----------+---------+--------+------+------+---------+------------------+-----------+
| Preamble | Dest MAC | Src MAC | Length | DSAP | SSAP | Control | Upper-layer data | FCS (CRC) |
+----------+----------+---------+--------+------+------+---------+------------------+-----------+
```

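LLC is used less often in real networks today, especially on modern Ethernet, where Ethernet II is widely adopted for its simplicity and efficiency. LLC still plays an important role for some legacy protocols such as NetBIOS and IPX/SPX.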

Advanced C Programming Techniques and Best Practices – Complete Edition


Table of Contents

  1. Macro Definitions and Preprocessing Techniques
  2. Advanced Memory Management Techniques
  3. Function Pointers and Callback Mechanisms
  4. Data Structure Design
  5. Concurrency and Multithreading
  6. Error Handling and Exception Mechanisms
  7. Performance Optimization Techniques
  8. Debugging and Testing Techniques
  9. Cross-Platform Programming
  10. Secure Programming Practices
  11. Comprehensive Demonstration Examples

Macro Definitions and Preprocessing Techniques

1. Conditional Compilation and Platform Detection

/**
 * Platform and compiler detection
 * Used for conditional compilation and cross-platform compatibility
 */
#if defined(_WIN32) || defined(_WIN64)
    #define PLATFORM_WINDOWS
#elif defined(__linux__)
    #define PLATFORM_LINUX
#elif defined(__APPLE__)
    #define PLATFORM_MACOS
#endif

// Compiler detection
#if defined(__GNUC__)
    #define COMPILER_GCC
#elif defined(_MSC_VER)
    #define COMPILER_MSVC
#endif

// Version detection
#if __STDC_VERSION__ >= 201112L
    #define C11_SUPPORTED
#endif

2. Advanced Macro Techniques

/**
 * Advanced macro collection
 * Provides type-safe and convenient macro utilities
 */

// Stringification and concatenation
#define STRINGIFY(x) #x
#define TOSTRING(x) STRINGIFY(x)
#define CONCAT(a, b) a##b

// Get array size
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

// container_of macro (get container pointer from member pointer)
// Needs <stddef.h> for offsetof; the ({ ... }) statement expression is a GCC/Clang extension
#define container_of(ptr, type, member) ({          \
    char *__mptr = (char *)(ptr);                    \
    ((type *)(__mptr - offsetof(type, member))); })

// Min/Max values
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

// Variable swap (uses a typed temporary; typeof is a GNU extension, standardized in C23)
#define SWAP(a, b) do { typeof(a) temp = (a); (a) = (b); (b) = temp; } while(0)

// Compile-time assertion
#define STATIC_ASSERT(condition, message) \
    typedef char static_assertion_##message[(condition) ? 1 : -1]

// Variadic macros
#define DEBUG_PRINT(fmt, ...) \
    fprintf(stderr, "[DEBUG] %s:%d: " fmt "\n", __FILE__, __LINE__, ##__VA_ARGS__)

3. Modern C Language Features

/**
 * C11 modern features
 * Leverage new standards to improve code quality
 */

// Generic selection
#define generic_max(a, b) _Generic((a), \
    int: max_int, \
    float: max_float, \
    double: max_double \
)(a, b)

// Static assertion (C11)
_Static_assert(sizeof(int) >= 4, "int must be at least 4 bytes");

// Thread-local storage (C11)
_Thread_local int thread_var;

Advanced Memory Management Techniques

1. Memory Pool Design

/**
 * Memory pool implementation
 * Provides efficient memory allocation and recycling
 */
typedef struct {
    void *memory;
    size_t size;
    size_t used;
    size_t block_size;
} memory_pool_t;

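memory_pool_t* create_pool(size_t size, size_t block_size) {
    memory_pool_t *pool = malloc(sizeof(memory_pool_t));
    if (!pool) return NULL;
    pool->memory = malloc(size);
    if (!pool->memory) {
        // Do not leak the pool header when the arena allocation fails
        free(pool);
        return NULL;
    }
    pool->size = size;
    pool->used = 0;
    pool->block_size = block_size;
    return pool;
}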

void* pool_alloc(memory_pool_t *pool, size_t size) {
    if (pool->used + size > pool->size) return NULL;
    void *ptr = (char*)pool->memory + pool->used;
    pool->used += size;
    return ptr;
}

2. Smart Pointer Simulation

/**
 * Smart pointer simulation system
 * Implements reference-counted automatic memory management
 */

#include <stdatomic.h>

typedef struct {
    void *ptr;
    void (*deleter)(void*);
    atomic_int *ref_count;
} smart_ptr_t;

smart_ptr_t make_smart_ptr(void *ptr, void (*deleter)(void*)) {
    smart_ptr_t sp = {ptr, deleter, malloc(sizeof(atomic_int))};
    atomic_init(sp.ref_count, 1);
    return sp;
}

smart_ptr_t smart_ptr_copy(smart_ptr_t sp) {
    atomic_fetch_add(sp.ref_count, 1);
    return sp;
}

void smart_ptr_free(smart_ptr_t *sp) {
    if (atomic_fetch_sub(sp->ref_count, 1) == 1) {
        sp->deleter(sp->ptr);
        free(sp->ref_count);
    }
}

3. Memory Alignment

/**
 * Memory alignment utilities
 * Ensures proper alignment of data structures in memory
 */

// C11 alignment
_Alignas(16) char aligned_buffer[256];

// Manual alignment
#define ALIGN_UP(x, align) (((x) + (align) - 1) & ~((align) - 1))
#define IS_ALIGNED(x, align) (((x) & ((align) - 1)) == 0)

void* aligned_malloc(size_t size, size_t alignment) {
    void *ptr = malloc(size + alignment - 1 + sizeof(void*));
    if (!ptr) return NULL;
    
    void **aligned_ptr = (void**)(((uintptr_t)ptr + sizeof(void*) + alignment - 1) & ~(alignment - 1));
    aligned_ptr[-1] = ptr;
    return aligned_ptr;
}

Function Pointers and Callback Mechanisms

1. Object-Oriented Style Programming

/**
 * Virtual function table simulation
 * Implements object-oriented programming in C
 */

// Virtual function table simulation
typedef struct {
    void (*destroy)(void *self);
    void (*print)(void *self);
    int (*compare)(void *self, void *other);
} vtable_t;

typedef struct {
    vtable_t *vtable;
    // Concrete data
} object_t;

// Polymorphic invocation
#define CALL_METHOD(obj, method, ...) \
    ((obj)->vtable->method((obj), ##__VA_ARGS__))

2. State Machine Implementation

/**
 * State machine implementation
 * Provides flexible state management mechanism
 */

typedef enum {
    STATE_IDLE,
    STATE_RUNNING,
    STATE_PAUSED,
    STATE_STOPPED
} state_t;

typedef struct {
    state_t current_state;
    int (*handlers[4])(void *context, int event);
} state_machine_t;

int handle_idle(void *context, int event) {
    switch (event) {
        case EVENT_START:
            return STATE_RUNNING;
        default:
            return STATE_IDLE;
    }
}

3. Plugin System Design

/**
 * Plugin system design
 * Supports dynamic loading and functionality extension
 */

typedef struct {
    const char *name;
    int version;
    int (*init)(void);
    void (*cleanup)(void);
    void* (*create_instance)(void);
} plugin_interface_t;

// Dynamic plugin loading
#ifdef _WIN32
    #include <windows.h>
    #define LOAD_PLUGIN(name) LoadLibrary(name)
    #define GET_SYMBOL(handle, name) GetProcAddress(handle, name)
#else
    #include <dlfcn.h>
    #define LOAD_PLUGIN(name) dlopen(name, RTLD_LAZY)
    #define GET_SYMBOL(handle, name) dlsym(handle, name)
#endif

Data Structure Design

1. Linked List Implementation

/**
 * Doubly linked list implementation
 * Provides efficient insertion and deletion operations
 */

typedef struct list_node {
    void *data;
    struct list_node *next;
    struct list_node *prev;
} list_node_t;

typedef struct {
    list_node_t head;
    size_t size;
    void (*destructor)(void*);
} list_t;

// Doubly linked list operations
void list_insert_after(list_t *list, list_node_t *node, void *data) {
    list_node_t *new_node = malloc(sizeof(list_node_t));
    if (!new_node) return;  // Allocation failed: leave the list unchanged
    new_node->data = data;
    new_node->next = node->next;
    new_node->prev = node;
    
    if (node->next) node->next->prev = new_node;
    node->next = new_node;
    list->size++;
}

2. Hash Table Implementation

/**
 * Hash table implementation
 * Provides fast key-value storage and retrieval
 */

typedef struct hash_entry {
    char *key;
    void *value;
    struct hash_entry *next;
} hash_entry_t;

typedef struct {
    hash_entry_t **buckets;
    size_t bucket_count;
    size_t size;
    unsigned int (*hash_func)(const char*);
} hash_table_t;

unsigned int djb2_hash(const char *str) {
    unsigned int hash = 5381;
    int c;
    while ((c = *str++)) hash = ((hash << 5) + hash) + c;
    return hash;
}

3. Ring Buffer

/**
 * Ring buffer implementation
 * Suitable for producer-consumer patterns
 */

typedef struct {
    char *buffer;
    size_t size;
    size_t read_pos;
    size_t write_pos;
    int full;
} ring_buffer_t;

int ring_buffer_write(ring_buffer_t *rb, const char *data, size_t len) {
    size_t available = rb->size - ring_buffer_size(rb);
    if (len > available) return -1;
    
    for (size_t i = 0; i < len; i++) {
        rb->buffer[rb->write_pos] = data[i];
        rb->write_pos = (rb->write_pos + 1) % rb->size;
        if (rb->write_pos == rb->read_pos) rb->full = 1;
    }
    return len;
}

Concurrency and Multithreading

1. Thread-Safe Data Structures

/**
 * Thread-safe counter
 * Provides atomic operations and conditional waiting
 */

#include <pthread.h>

typedef struct {
    int value;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
} thread_safe_counter_t;

void counter_increment(thread_safe_counter_t *counter) {
    pthread_mutex_lock(&counter->mutex);
    counter->value++;
    // Broadcast: waiters may be waiting for different targets, so wake them all
    pthread_cond_broadcast(&counter->cond);
    pthread_mutex_unlock(&counter->mutex);
}

int counter_wait_for(thread_safe_counter_t *counter, int target) {
    pthread_mutex_lock(&counter->mutex);
    while (counter->value < target) {
        pthread_cond_wait(&counter->cond, &counter->mutex);
    }
    int value = counter->value;  // Read while still holding the lock
    pthread_mutex_unlock(&counter->mutex);
    return value;
}

2. Read-Write Lock Implementation

/**
 * Read-write lock implementation
 * Supports multiple readers/single writer concurrency control
 */

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t read_cond;
    pthread_cond_t write_cond;
    int readers;
    int writers;
    int waiting_writers;
} rwlock_t;

void rwlock_rdlock(rwlock_t *rwlock) {
    pthread_mutex_lock(&rwlock->mutex);
    while (rwlock->writers > 0 || rwlock->waiting_writers > 0) {
        pthread_cond_wait(&rwlock->read_cond, &rwlock->mutex);
    }
    rwlock->readers++;
    pthread_mutex_unlock(&rwlock->mutex);
}
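
Only the read-lock path is shown above. As a rough sketch of how the remaining operations could fit the same struct, the functions below (rwlock_rdunlock, rwlock_wrlock, rwlock_wrunlock) are hypothetical names and one possible design, not the article's code. Because rwlock_rdlock blocks new readers whenever a writer is waiting, this scheme prefers writers and can starve readers under a steady stream of writes.

```c
#include <pthread.h>

/* Same layout as the rwlock_t shown above. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t read_cond;
    pthread_cond_t write_cond;
    int readers;
    int writers;
    int waiting_writers;
} rwlock_t;

/* Hypothetical: release a read lock. */
void rwlock_rdunlock(rwlock_t *rwlock) {
    pthread_mutex_lock(&rwlock->mutex);
    if (--rwlock->readers == 0) {
        /* Last reader out: let one waiting writer proceed. */
        pthread_cond_signal(&rwlock->write_cond);
    }
    pthread_mutex_unlock(&rwlock->mutex);
}

/* Hypothetical: acquire the exclusive write lock. */
void rwlock_wrlock(rwlock_t *rwlock) {
    pthread_mutex_lock(&rwlock->mutex);
    rwlock->waiting_writers++;
    while (rwlock->readers > 0 || rwlock->writers > 0) {
        pthread_cond_wait(&rwlock->write_cond, &rwlock->mutex);
    }
    rwlock->waiting_writers--;
    rwlock->writers = 1;
    pthread_mutex_unlock(&rwlock->mutex);
}

/* Hypothetical: release the write lock. */
void rwlock_wrunlock(rwlock_t *rwlock) {
    pthread_mutex_lock(&rwlock->mutex);
    rwlock->writers = 0;
    if (rwlock->waiting_writers > 0) {
        /* Hand over to the next writer first (writer preference). */
        pthread_cond_signal(&rwlock->write_cond);
    } else {
        /* No writers queued: release every blocked reader. */
        pthread_cond_broadcast(&rwlock->read_cond);
    }
    pthread_mutex_unlock(&rwlock->mutex);
}
```

In production code, pthread_rwlock_t from POSIX is usually the safer choice than a hand-rolled lock.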

3. Lock-Free Programming

/**
 * Lock-free programming tools
 * Implements high-performance concurrency using atomic operations
 */

#include <stdatomic.h>

typedef struct {
    atomic_int value;
} atomic_counter_t;

void atomic_counter_increment(atomic_counter_t *counter) {
    atomic_fetch_add(&counter->value, 1);
}

int atomic_counter_get(atomic_counter_t *counter) {
    return atomic_load(&counter->value);
}
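
atomic_fetch_add covers unconditional updates. When an update depends on the current value, a compare-and-swap loop is the usual pattern. The sketch below assumes a hypothetical helper, atomic_counter_increment_below, which is not part of the original article. It increments the counter only while it is below a limit.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Same layout as the atomic_counter_t shown above. */
typedef struct {
    atomic_int value;
} atomic_counter_t;

/* Hypothetical helper: increment only if the counter is below limit.
 * Returns true if this call performed the increment. */
bool atomic_counter_increment_below(atomic_counter_t *counter, int limit) {
    int current = atomic_load(&counter->value);
    while (current < limit) {
        /* On failure, current is refreshed with the latest value
         * and the loop re-checks the limit before retrying. */
        if (atomic_compare_exchange_weak(&counter->value, &current, current + 1)) {
            return true;
        }
    }
    return false;
}
```

The weak variant may fail spuriously, which is harmless here because the loop simply retries.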

Error Handling and Exception Mechanisms

1. Error Code System

/**
 * Error code system
 * Provides structured error handling mechanism
 */

typedef enum {
    ERROR_SUCCESS = 0,
    ERROR_INVALID_PARAM = -1,
    ERROR_OUT_OF_MEMORY = -2,
    ERROR_FILE_NOT_FOUND = -3,
    ERROR_PERMISSION_DENIED = -4
} error_code_t;

#define RETURN_ON_ERROR(expr) do { \
    error_code_t err = (expr); \
    if (err != ERROR_SUCCESS) return err; \
} while(0)

// Context-aware error handling
typedef struct {
    error_code_t code;
    const char *message;
    const char *file;
    int line;
} error_context_t;
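
The error_context_t struct carries a file and line, but the excerpt does not show how they get filled in. One way, sketched here under the assumption of a hypothetical MAKE_ERROR macro and a hypothetical parse_port function, is to capture __FILE__ and __LINE__ at the point where the error is created.

```c
#include <stdio.h>

/* Same definitions as shown above. */
typedef enum {
    ERROR_SUCCESS = 0,
    ERROR_INVALID_PARAM = -1,
    ERROR_OUT_OF_MEMORY = -2,
    ERROR_FILE_NOT_FOUND = -3,
    ERROR_PERMISSION_DENIED = -4
} error_code_t;

typedef struct {
    error_code_t code;
    const char *message;
    const char *file;
    int line;
} error_context_t;

/* Hypothetical macro: build an error_context_t tagged with the call site. */
#define MAKE_ERROR(code_, msg_) \
    ((error_context_t){ (code_), (msg_), __FILE__, __LINE__ })

/* Hypothetical example: parse a TCP port number from text. */
error_context_t parse_port(const char *text, int *port_out) {
    if (!text || !port_out) {
        return MAKE_ERROR(ERROR_INVALID_PARAM, "null argument");
    }
    int port = 0;
    if (sscanf(text, "%d", &port) != 1 || port < 1 || port > 65535) {
        return MAKE_ERROR(ERROR_INVALID_PARAM, "port out of range");
    }
    *port_out = port;
    return MAKE_ERROR(ERROR_SUCCESS, "ok");
}
```

A caller can then log ctx.file and ctx.line alongside ctx.message when ctx.code is not ERROR_SUCCESS.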

2. Exception Simulation Mechanism

/**
 * Exception simulation mechanism
 * Implements exception handling using setjmp/longjmp
 */

#include <setjmp.h>

typedef struct {
    jmp_buf jump_buffer;
    int error_code;
    const char *error_message;
} exception_context_t;

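// Limitations of this sketch: TRY blocks cannot be nested (END_TRY resets the
// context to NULL instead of restoring the outer one), THROW outside a TRY is
// silently ignored, and locals modified inside TRY should be declared volatile.
static __thread exception_context_t *current_exception = NULL;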

#define TRY \
    do { \
        exception_context_t __exception_ctx; \
        __exception_ctx.error_code = 0; \
        if (setjmp(__exception_ctx.jump_buffer) == 0) { \
            current_exception = &__exception_ctx;

#define CATCH(error_var) \
        } else { \
            error_var = current_exception->error_code;

#define END_TRY \
        } \
        current_exception = NULL; \
    } while(0);

#define THROW(code, message) \
    do { \
        if (current_exception) { \
            current_exception->error_code = code; \
            current_exception->error_message = message; \
            longjmp(current_exception->jump_buffer, 1); \
        } \
    } while(0)
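
To show how the macros above fit together, here is a small usage sketch. The macros are repeated so the block compiles on its own. The functions checked_divide and divide_or_error are hypothetical examples, and __thread is assumed to be available (GCC or Clang).

```c
#include <setjmp.h>
#include <stddef.h>

typedef struct {
    jmp_buf jump_buffer;
    int error_code;
    const char *error_message;
} exception_context_t;

/* __thread is a GCC/Clang extension, as in the macros above. */
static __thread exception_context_t *current_exception = NULL;

#define TRY \
    do { \
        exception_context_t __exception_ctx; \
        __exception_ctx.error_code = 0; \
        if (setjmp(__exception_ctx.jump_buffer) == 0) { \
            current_exception = &__exception_ctx;

#define CATCH(error_var) \
        } else { \
            error_var = current_exception->error_code;

#define END_TRY \
        } \
        current_exception = NULL; \
    } while(0);

#define THROW(code, message) \
    do { \
        if (current_exception) { \
            current_exception->error_code = code; \
            current_exception->error_message = message; \
            longjmp(current_exception->jump_buffer, 1); \
        } \
    } while(0)

/* Hypothetical helper that "throws" on invalid input. */
static int checked_divide(int numerator, int denominator) {
    if (denominator == 0) {
        THROW(-42, "division by zero");
        return 0;  /* Reached only when no TRY block is active. */
    }
    return numerator / denominator;
}

/* Returns the quotient, or the thrown error code on failure. */
int divide_or_error(int numerator, int denominator) {
    int err = 0;
    volatile int result = 0;  /* volatile: assigned inside the TRY block */
    TRY
        result = checked_divide(numerator, denominator);
    CATCH(err)
        result = err;
    END_TRY
    return result;
}
```

Note that longjmp skips any cleanup between THROW and the TRY block, so resources acquired in between must be released by other means.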

3. Resource Management (RAII)

/**
 * RAII resource management
 * Ensures automatic resource cleanup
 */

typedef struct {
    void *resource;
    void (*cleanup)(void*);
} raii_guard_t;

#define RAII_VAR(type, name, init, cleanup_func) \
    type name = init; \
    raii_guard_t __guard_##name = {&name, (void(*)(void*))cleanup_func}; \
    __attribute__((cleanup(raii_cleanup))) raii_guard_t *__raii_##name = &__guard_##name;

static void raii_cleanup(raii_guard_t **guard) {
    if ((*guard)->resource && (*guard)->cleanup) {
        (*guard)->cleanup((*guard)->resource);
    }
}

Performance Optimization Techniques

1. Cache-Friendly Data Structures

/**
 * Cache-friendly data structures
 * Optimizes memory layout for better cache hit rates
 */

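// Struct packing: removes padding to shrink the struct, but members may end up
// misaligned, which is slower (or faults) on some CPUs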
struct __attribute__((packed)) packed_struct {
    char a;
    int b;
    short c;
};

// Cache line alignment
#define CACHE_LINE_SIZE 64
struct __attribute__((aligned(CACHE_LINE_SIZE))) cache_aligned_struct {
    int data[16];
};

2. Branch Prediction Optimization

/**
 * Branch prediction optimization
 * Uses compiler hints to improve execution efficiency
 */

// Static branch prediction
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

void optimized_function(int *array, size_t size) {
    if (unlikely(size == 0)) return;
    
    for (size_t i = 0; likely(i < size); i++) {
        process_element(array[i]);
    }
}

3. Inline Assembly Optimization

/**
 * Inline assembly optimization
 * Directly uses CPU instructions for performance gains
 */

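// Read Time-Stamp Counter (x86/x86-64 only; requires <stdint.h>)
static inline uint64_t rdtsc(void) {
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

// Compiler barrier: stops the compiler from reordering memory accesses across
// this point; it emits no CPU fence instruction
#define MEMORY_BARRIER() __asm__ __volatile__("" ::: "memory")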

4. SIMD Optimization

/**
 * SIMD optimization
 * Leverages vector instructions for parallel data processing
 */

#ifdef __SSE2__
#include <emmintrin.h>

void vector_add(float *a, float *b, float *result, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        // Unaligned loads/stores: the float* arguments are not guaranteed to be 16-byte aligned
        __m128 va = _mm_loadu_ps(&a[i]);
        __m128 vb = _mm_loadu_ps(&b[i]);
        __m128 vr = _mm_add_ps(va, vb);
        _mm_storeu_ps(&result[i], vr);
    }
    // Process remaining elements
    for (; i < n; i++) {
        result[i] = a[i] + b[i];
    }
}
#endif

Debugging and Testing Techniques

1. Debug Macros

/**
 * Debugging utility macros
 * Provides convenient debugging and profiling capabilities
 */

#ifdef DEBUG
    #define DBG_PRINT(fmt, ...) \
        fprintf(stderr, "[DEBUG] %s:%d: " fmt "\n", __FILE__, __LINE__, ##__VA_ARGS__)
    #define ASSERT(condition) \
        do { \
            if (!(condition)) { \
                fprintf(stderr, "Assertion failed: %s at %s:%d\n", \
                        #condition, __FILE__, __LINE__); \
                abort(); \
            } \
        } while(0)
#else
    #define DBG_PRINT(fmt, ...) do {} while(0)
    #define ASSERT(condition) do {} while(0)
#endif

// Performance timing (requires <time.h>; clock() measures CPU time, not wall-clock time)
#define TIME_IT(code, result_var) \
    do { \
        clock_t start = clock(); \
        code; \
        result_var = ((double)(clock() - start)) / CLOCKS_PER_SEC; \
    } while(0)

2. Unit Testing Framework

/**
 * Unit testing framework
 * Provides structured testing support
 */

typedef struct {
    const char *name;
    void (*test_func)(void);
    int passed;
    int failed;
} test_case_t;

#define TEST_CASE(name) \
    static void test_##name(void); \
    static test_case_t test_case_##name = {#name, test_##name, 0, 0}; \
    static void test_##name(void)

#define ASSERT_EQ(expected, actual) \
    do { \
        if ((expected) != (actual)) { \
            fprintf(stderr, "Assertion failed: %s != %s at %s:%d\n", \
                    #expected, #actual, __FILE__, __LINE__); \
            current_test->failed++; \
        } else { \
            current_test->passed++; \
        } \
    } while(0)
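
ASSERT_EQ above updates a current_test pointer that the excerpt never declares. A minimal sketch of the missing pieces might look like the following. The global current_test, the run_tests function, and the sample test are assumptions added for illustration.

```c
#include <stdio.h>

/* Same layout as the test_case_t shown above. */
typedef struct {
    const char *name;
    void (*test_func)(void);
    int passed;
    int failed;
} test_case_t;

/* Assumed global: ASSERT_EQ updates whichever test is currently running. */
static test_case_t *current_test = NULL;

#define ASSERT_EQ(expected, actual) \
    do { \
        if ((expected) != (actual)) { \
            fprintf(stderr, "Assertion failed: %s != %s at %s:%d\n", \
                    #expected, #actual, __FILE__, __LINE__); \
            current_test->failed++; \
        } else { \
            current_test->passed++; \
        } \
    } while(0)

/* Hypothetical runner: executes each test and returns the total failure count. */
int run_tests(test_case_t *tests, size_t count) {
    int failures = 0;
    for (size_t i = 0; i < count; i++) {
        current_test = &tests[i];
        tests[i].test_func();
        printf("%s: %d passed, %d failed\n",
               tests[i].name, tests[i].passed, tests[i].failed);
        failures += tests[i].failed;
    }
    current_test = NULL;
    return failures;
}

/* Example test with one passing and one deliberately failing check. */
void test_arithmetic(void) {
    ASSERT_EQ(4, 2 + 2);
    ASSERT_EQ(5, 2 + 2);
}
```

Returning the failure count lets a main function use it directly as the process exit status.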

3. Memory Leak Detection

/**
 * Memory leak detection
 * Tracks memory allocations and deallocations
 */

#ifdef DEBUG_MEMORY
#include <stddef.h>  // max_align_t

// The header is sizeof(max_align_t) bytes so the pointer handed to the caller
// keeps malloc's alignment guarantee (a bare size_t header would break it).
#define DEBUG_HEADER_SIZE sizeof(max_align_t)

// Note: these counters are not thread-safe
static size_t total_allocated = 0;
static size_t allocation_count = 0;

void* debug_malloc(size_t size, const char *file, int line) {
    void *ptr = malloc(size + DEBUG_HEADER_SIZE);
    if (ptr) {
        *(size_t*)ptr = size;
        total_allocated += size;
        allocation_count++;
        printf("ALLOC: %zu bytes at %s:%d\n", size, file, line);
        return (char*)ptr + DEBUG_HEADER_SIZE;
    }
    return NULL;
}

void debug_free(void *ptr, const char *file, int line) {
    if (ptr) {
        size_t *size_ptr = (size_t*)((char*)ptr - DEBUG_HEADER_SIZE);
        total_allocated -= *size_ptr;
        allocation_count--;
        printf("FREE: %zu bytes at %s:%d\n", *size_ptr, file, line);
        free(size_ptr);
    }
}

#define malloc(size) debug_malloc(size, __FILE__, __LINE__)
#define free(ptr) debug_free(ptr, __FILE__, __LINE__)
#endif

Cross-Platform Programming

1. Platform Abstraction Layer

/**
 * Platform abstraction layer
 * Provides unified cross-platform interfaces
 */

// Thread abstraction
#ifdef _WIN32
    #include <windows.h>
    typedef HANDLE thread_t;
    typedef CRITICAL_SECTION mutex_t;
    #define THREAD_CREATE(thread, func, arg) \
        (thread = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)func, arg, 0, NULL))
    #define THREAD_JOIN(thread) WaitForSingleObject(thread, INFINITE)
    #define MUTEX_INIT(mutex) InitializeCriticalSection(mutex)
    #define MUTEX_LOCK(mutex) EnterCriticalSection(mutex)
    #define MUTEX_UNLOCK(mutex) LeaveCriticalSection(mutex)
#else
    #include <pthread.h>
    typedef pthread_t thread_t;
    typedef pthread_mutex_t mutex_t;
    #define THREAD_CREATE(thread, func, arg) pthread_create(&thread, NULL, func, arg)
    #define THREAD_JOIN(thread) pthread_join(thread, NULL)
    #define MUTEX_INIT(mutex) pthread_mutex_init(mutex, NULL)
    #define MUTEX_LOCK(mutex) pthread_mutex_lock(mutex)
    #define MUTEX_UNLOCK(mutex) pthread_mutex_unlock(mutex)
#endif

2. File Path Handling

/**
 * File path handling
 * Provides cross-platform path operations
 */

#ifdef _WIN32
    #define PATH_SEPARATOR '\\'
    #define PATH_SEPARATOR_STR "\\"
#else
    #define PATH_SEPARATOR '/'
    #define PATH_SEPARATOR_STR "/"
#endif

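char* join_path(const char *dir, const char *file) {
    size_t dir_len = strlen(dir);
    size_t file_len = strlen(file);
    // +2: one byte for a possible separator, one for the terminating NUL
    char *result = malloc(dir_len + file_len + 2);
    if (!result) return NULL;
    
    strcpy(result, dir);
    // Check dir_len > 0 so an empty dir does not read dir[-1]
    if (dir_len > 0 && dir[dir_len - 1] != PATH_SEPARATOR) {
        strcat(result, PATH_SEPARATOR_STR);
    }
    strcat(result, file);
    return result;  // Caller must free()
}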

3. Endianness Handling

/**
 * Endianness handling
 * Ensures data correctness during network transmission
 */

// Network byte order conversion
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    #define IS_BIG_ENDIAN 1
#else
    #define IS_BIG_ENDIAN 0
#endif

static inline uint32_t swap_endian_32(uint32_t val) {
    return ((val & 0x000000FF) << 24) |
           ((val & 0x0000FF00) << 8)  |
           ((val & 0x00FF0000) >> 8)  |
           ((val & 0xFF000000) >> 24);
}

#define hton32(x) (IS_BIG_ENDIAN ? (x) : swap_endian_32(x))
#define ntoh32(x) hton32(x)
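
An alternative to detecting host endianness is to serialize values byte by byte with shifts, which gives the same result on any host. The sketch below uses two hypothetical helpers, put_be32 and get_be32, that are not part of the original article.

```c
#include <stdint.h>

/* Hypothetical helper: store v in big-endian (network) byte order. */
void put_be32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Hypothetical helper: read a big-endian 32-bit value. */
uint32_t get_be32(const uint8_t in[4]) {
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

This approach also avoids unaligned 32-bit accesses when the value sits at an arbitrary offset in a packet buffer.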

Secure Programming Practices

1. Buffer Overflow Protection

/**
 * Buffer overflow protection
 * Provides safe string manipulation functions
 */

// Safe string operations
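size_t safe_strncpy(char *dest, size_t dest_size, const char *src, size_t count) {
    if (dest_size == 0) return 0;
    
    size_t limit = (count < dest_size - 1) ? count : dest_size - 1;
    size_t copy_len = 0;
    // Stop at src's terminator so we never read past the end of src
    while (copy_len < limit && src[copy_len] != '\0') {
        dest[copy_len] = src[copy_len];
        copy_len++;
    }
    dest[copy_len] = '\0';
    return copy_len;
}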

// Formatted string safety checks
#define SAFE_PRINTF(buffer, size, format, ...) \
    do { \
        int __result = snprintf(buffer, size, format, ##__VA_ARGS__); \
        if (__result < 0 || (size_t)__result >= size) { \
            /* Handle overflow */ \
            buffer[size - 1] = '\0'; \
        } \
    } while(0)

2. Input Validation

/**
 * Input validation
 * Prevents security issues caused by malicious input
 */

// Integer overflow check
static inline int safe_add(int a, int b, int *result) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return -1; // Overflow
    }
    *result = a + b;
    return 0;
}

// Pointer validation
#define VALIDATE_PTR(ptr) \
    do { \
        if (!(ptr)) { \
            return ERROR_INVALID_PARAM; \
        } \
    } while(0)

3. Secure Random Numbers

/**
 * Secure random number generation
 * Provides cryptographically secure random numbers
 */

#include <time.h>
#include <stdlib.h>

// Cryptographically secure random numbers (requires platform support)
#ifdef __linux__
    #include <sys/random.h>
    int secure_random_bytes(void *buf, size_t len) {
        return getrandom(buf, len, 0) == (ssize_t)len ? 0 : -1;
    }
#else
    // Fallback: simple LCG pseudo-random generator.
    // NOT cryptographically secure; do not use it for keys, tokens, or nonces.
    static unsigned long long rand_state = 1;
    
    void srand64(unsigned long long seed) {
        rand_state = seed;
    }
    
    unsigned long long rand64(void) {
        rand_state = rand_state * 6364136223846793005ULL + 1;
        return rand_state;
    }
#endif

Comprehensive Demonstration Examples

Event-Driven Architecture Demo

/**
 * Event-Driven Architecture - Event Type Definitions
 * Used for building flexible event handling systems
 */
typedef enum {
    EVENT_NONE = 0,
    EVENT_TIMER,
    EVENT_NETWORK,
    EVENT_USER,
    EVENT_SYSTEM
} event_type_t;

/**
 * Event structure
 * Contains event type, timestamp, and user data
 */
typedef struct {
    event_type_t type;
    uint64_t timestamp;
    void *data;
    size_t data_size;
} event_t;

/**
 * Event handler function pointer type
 * @param event Event pointer
 * @param context User context
 * @return Processing result
 */
typedef int (*event_handler_t)(event_t *event, void *context);

/**
 * Event listener structure
 * Stores event type and corresponding handler
 */
typedef struct {
    event_type_t type;
    event_handler_t handler;
    void *context;
    int priority;  // Processing priority
} event_listener_t;

/**
 * Event loop structure
 * Manages event queue and listeners
 */
typedef struct {
    event_listener_t *listeners;
    size_t listener_count;
    size_t listener_capacity;
    
    event_t *event_queue;
    size_t queue_head;
    size_t queue_tail;
    size_t queue_size;
    size_t queue_capacity;
    
    int running;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
} event_loop_t;

// [Implementation code continues...]

Lock-Free Queue Implementation

/**
 * Lock-free queue implementation
 * Implements thread-safe lock-free queue using CAS operations
 */

#include <stdatomic.h>

/**
 * Queue node structure
 */
typedef struct queue_node {
    void *data;
    _Atomic(struct queue_node*) next;
} queue_node_t;

/**
 * Lock-free queue structure
 */
typedef struct {
    _Atomic(queue_node_t*) head;
    _Atomic(queue_node_t*) tail;
    atomic_size_t size;
} lockfree_queue_t;

// [Implementation code continues...]

Cache-Friendly Data Structures

/**
 * Cache-friendly data structure implementation
 * Optimizes memory layout to improve cache hit rates
 */

/**
 * Cache line size definition
 */
#define CACHE_LINE_SIZE 64

/**
 * Cache alignment macro
 */
#define CACHE_ALIGNED __attribute__((aligned(CACHE_LINE_SIZE)))

/**
 * SoA (Structure of Arrays) vector structure
 * Separates related data storage to improve cache efficiency
 */
typedef struct {
    float *x;           // X-coordinate array
    float *y;           // Y-coordinate array
    float *z;           // Z-coordinate array
    int *id;            // ID array
    size_t capacity;    // Capacity
    size_t size;        // Current size
    char padding[CACHE_LINE_SIZE - sizeof(size_t)*2 - sizeof(char*)*4]; // Padding to cache line boundary
} soa_vector_t;

// [Implementation code continues...]

Secure String Operations Library

/**
 * Secure string operations library
 * Provides buffer-overflow-safe string functions
 */

#include <string.h>
#include <stdio.h>
#include <stdarg.h>

/**
 * Secure string structure
 * Contains length information to prevent overflows
 */
typedef struct {
    char *data;
    size_t length;
    size_t capacity;
    int is_secure;  // Whether security checks are enabled
} secure_string_t;

// [Implementation code continues...]

Comprehensive Demo Function

/**
 * Event-Driven Architecture Demo Example
 */
void demo_event_driven_architecture() {
    printf("=== Event-Driven Architecture Demo ===\n");
    
    // Create event loop
    event_loop_t *loop = event_loop_create(100);
    if (!loop) {
        printf("Failed to create event loop\n");
        return;
    }
    
    // Add event listeners
    event_loop_add_listener(loop, EVENT_TIMER, timer_handler, NULL, 0);
    event_loop_add_listener(loop, EVENT_NETWORK, network_handler, NULL, 0);
    event_loop_add_listener(loop, EVENT_USER, user_handler, NULL, 0);
    
    // Start event loop thread
    pthread_t loop_thread;
    pthread_create(&loop_thread, NULL, (void*(*)(void*))event_loop_run, loop);
    
    // Post some test events
    for (int i = 0; i < 5; i++) {
        event_t event = {0};
        event.timestamp = time(NULL);
        
        // Post different event types
        switch (i % 3) {
            case 0:
                event.type = EVENT_TIMER;
                printf("Posting timer event %d\n", i);
                break;
            case 1: {
                event.type = EVENT_NETWORK;
                const char *msg = "Hello Network!";
                event.data = strdup(msg);
                event.data_size = strlen(msg);
                printf("Posting network event %d\n", i);
                break;
            }
            case 2:
                event.type = EVENT_USER;
                event.data = malloc(sizeof(int));
                *(int*)event.data = i;
                event.data_size = sizeof(int);
                printf("Posting user event %d\n", i);
                break;
        }
        
        event_loop_post(loop, &event);
        sleep(1);
    }
    
    // Give the loop time to process any events still queued
    // (this demo never frees event.data itself)
    sleep(2);
    
    // Stop and destroy event loop
    event_loop_stop(loop);
    pthread_join(loop_thread, NULL);
    event_loop_destroy(loop);
    
    printf("=== Demo Complete ===\n\n");
}

// [Other demo functions continue...]

Appendix: Best Practices Summary

Coding Standards

  1. Naming Conventions: Use clear names, avoid abbreviations
  2. Comment Style: Use Doxygen-style comments
  3. Error Handling: Always check return values
  4. Memory Management: Follow RAII principles
  5. Thread Safety: Clearly identify thread-safe functions

Performance Optimization Principles

  1. Measure Before Optimizing: Use profiling tools
  2. Algorithm First: Choose appropriate data structures and algorithms
  3. Avoid Premature Optimization: Maintain code readability
  4. Cache-Friendly: Consider data locality
  5. Compiler Optimization: Use compiler optimization flags appropriately

Secure Coding Principles

  1. Input Validation: Never trust external input
  2. Bounds Checking: Prevent buffer overflows
  3. Principle of Least Privilege: Use minimum necessary privileges
  4. Safe Functions: Use secure string functions
  5. Code Reviews: Conduct regular security code reviews

This guide to advanced C programming techniques covers topics ranging from macros to concurrent programming, with code examples and best practices intended to help developers write high-quality, high-performance, and secure C code.


Advanced C Programming Techniques and Best Practices – Complete Edition


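Table of Contents

  1. Macro Definitions and Preprocessor Techniques
  2. Advanced Memory Management Techniques
  3. Function Pointers and Callback Mechanisms
  4. Data Structure Design
  5. Concurrency and Multithreading
  6. Error Handling and Exception Mechanisms
  7. Performance Optimization Techniques
  8. Debugging and Testing Techniques
  9. Cross-Platform Programming
  10. Secure Programming Practices
  11. Comprehensive Demonstration Examples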

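Macro Definitions and Preprocessor Techniques

1. Conditional Compilation and Platform Detection

/**
 * Platform and compiler detection
 * Used for conditional compilation and cross-platform compatibility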
 */
#if defined(_WIN32) || defined(_WIN64)
    #define PLATFORM_WINDOWS
#elif defined(__linux__)
    #define PLATFORM_LINUX
#elif defined(__APPLE__)
    #define PLATFORM_MACOS
#endif

// Compiler detection
#if defined(__GNUC__)
    #define COMPILER_GCC
#elif defined(_MSC_VER)
    #define COMPILER_MSVC
#endif

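// Version detection (guarded for compilers that leave __STDC_VERSION__ undefined)
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L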
    #define C11_SUPPORTED
#endif

2. Powerful Macro Techniques

/**
 * Collection of advanced macros
 * Provides type-safe and convenient macro utilities
 */

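// Stringification and token pasting
#define STRINGIFY(x) #x
#define TOSTRING(x) STRINGIFY(x)
#define CONCAT(a, b) a##b

// Array length (valid for true arrays only, not pointers)
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

// container_of macro (recovers the containing struct from a member pointer)
// Requires <stddef.h> for offsetof; uses a GNU statement expression
#define container_of(ptr, type, member) ({          \
    char *__mptr = (char *)(ptr);                    \
    ((type *)(__mptr - offsetof(type, member))); })

// Maximum and minimum (arguments are evaluated twice; avoid side effects)
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

// Swap variables (uses a temporary; typeof is a GNU extension, standard in C23)
#define SWAP(a, b) do { typeof(a) temp = (a); (a) = (b); (b) = temp; } while(0)

// Compile-time assertion
#define STATIC_ASSERT(condition, message) \
    typedef char static_assertion_##message[(condition) ? 1 : -1]

// Variadic macro
#define DEBUG_PRINT(fmt, ...) \
    fprintf(stderr, "[DEBUG] %s:%d: " fmt "\n", __FILE__, __LINE__, ##__VA_ARGS__)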

3. Modern C Features

/**
 * Using modern C11 features
 * Leverages the newer standard to improve code quality
 */

// Generic selection
#define generic_max(a, b) _Generic((a), \
    int: max_int, \
    float: max_float, \
    double: max_double \
)(a, b)

// Static assertion (C11)
_Static_assert(sizeof(int) >= 4, "int must be at least 4 bytes");

// Thread-local storage (C11)
_Thread_local int thread_var;

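Advanced Memory Management Techniques

1. Memory Pool Design

/**
 * Memory pool implementation
 * Provides efficient memory allocation and reclamation
 */
typedef struct {
    void *memory;
    size_t size;
    size_t used;
    size_t block_size;
} memory_pool_t;

memory_pool_t* create_pool(size_t size, size_t block_size) {
    memory_pool_t *pool = malloc(sizeof(memory_pool_t));
    if (!pool) return NULL;
    pool->memory = malloc(size);
    if (!pool->memory) {
        free(pool);
        return NULL;
    }
    pool->size = size;
    pool->used = 0;
    pool->block_size = block_size;
    return pool;
}

// Bump allocator: individual blocks cannot be freed, and returned
// pointers are only as aligned as the requested sizes make them
void* pool_alloc(memory_pool_t *pool, size_t size) {
    if (size > pool->size - pool->used) return NULL;  // Overflow-safe bounds check
    void *ptr = (char*)pool->memory + pool->used;
    pool->used += size;
    return ptr;
}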

2. Smart Pointer Simulation

/**
 * Smart pointer simulation
 * Implements automatic memory management via reference counting
 */

#include <stdatomic.h>

typedef struct {
    void *ptr;
    void (*deleter)(void*);
    atomic_int *ref_count;
} smart_ptr_t;

smart_ptr_t make_smart_ptr(void *ptr, void (*deleter)(void*)) {
    smart_ptr_t sp = {ptr, deleter, malloc(sizeof(atomic_int))};
    atomic_init(sp.ref_count, 1);
    return sp;
}

smart_ptr_t smart_ptr_copy(smart_ptr_t sp) {
    atomic_fetch_add(sp.ref_count, 1);
    return sp;
}

void smart_ptr_free(smart_ptr_t *sp) {
    if (atomic_fetch_sub(sp->ref_count, 1) == 1) {
        sp->deleter(sp->ptr);
        free(sp->ref_count);
    }
}

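3. Memory Alignment

/**
 * Memory alignment utilities
 * Ensures data structures are correctly aligned in memory
 */

// C11 alignment
_Alignas(16) char aligned_buffer[256];

// Manual alignment (align must be a power of two)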
#define ALIGN_UP(x, align) (((x) + (align) - 1) & ~((align) - 1))
#define IS_ALIGNED(x, align) (((x) & ((align) - 1)) == 0)

void* aligned_malloc(size_t size, size_t alignment) {
    void *ptr = malloc(size + alignment - 1 + sizeof(void*));
    if (!ptr) return NULL;
    
    void **aligned_ptr = (void**)(((uintptr_t)ptr + sizeof(void*) + alignment - 1) & ~(alignment - 1));
    aligned_ptr[-1] = ptr;
    return aligned_ptr;
}
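
aligned_malloc stores the original malloc pointer just before the aligned address, so its result must not be passed to free() directly. A matching release function could look like the sketch below. The name aligned_free is an assumption, and alignment is assumed to be a power of two no smaller than sizeof(void*).

```c
#include <stdint.h>
#include <stdlib.h>

/* Same allocator as shown above. */
void* aligned_malloc(size_t size, size_t alignment) {
    void *ptr = malloc(size + alignment - 1 + sizeof(void*));
    if (!ptr) return NULL;

    void **aligned_ptr = (void**)(((uintptr_t)ptr + sizeof(void*) + alignment - 1) & ~(alignment - 1));
    aligned_ptr[-1] = ptr;  /* Remember the original pointer for the release step. */
    return aligned_ptr;
}

/* Hypothetical counterpart: recover the original pointer and free it. */
void aligned_free(void *aligned_ptr) {
    if (aligned_ptr) {
        free(((void**)aligned_ptr)[-1]);
    }
}
```

Where C11 is available, aligned_alloc from <stdlib.h> provides the same service and pairs with plain free().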

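Function Pointers and Callback Mechanisms

1. Object-Oriented Style Programming

/**
 * Virtual function table simulation
 * Implements object-oriented programming in C
 */

// Virtual function table
typedef struct {
    void (*destroy)(void *self);
    void (*print)(void *self);
    int (*compare)(void *self, void *other);
} vtable_t;

typedef struct {
    vtable_t *vtable;
    // Concrete data
} object_t;

// Polymorphic call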
#define CALL_METHOD(obj, method, ...) \
    ((obj)->vtable->method((obj), ##__VA_ARGS__))

2. State Machine Implementation

/**
 * State machine implementation
 * Provides a flexible state management mechanism
 */

typedef enum {
    STATE_IDLE,
    STATE_RUNNING,
    STATE_PAUSED,
    STATE_STOPPED
} state_t;

typedef struct {
    state_t current_state;
    int (*handlers[4])(void *context, int event);
} state_machine_t;

int handle_idle(void *context, int event) {
    switch (event) {
        case EVENT_START:
            return STATE_RUNNING;
        default:
            return STATE_IDLE;
    }
}

3. Plugin System Design

/**
 * Plugin system design
 * Supports dynamic loading and extensibility
 */

typedef struct {
    const char *name;
    int version;
    int (*init)(void);
    void (*cleanup)(void);
    void* (*create_instance)(void);
} plugin_interface_t;

// Dynamic plugin loading
#ifdef _WIN32
    #include <windows.h>
    #define LOAD_PLUGIN(name) LoadLibrary(name)
    #define GET_SYMBOL(handle, name) GetProcAddress(handle, name)
#else
    #include <dlfcn.h>
    #define LOAD_PLUGIN(name) dlopen(name, RTLD_LAZY)
    #define GET_SYMBOL(handle, name) dlsym(handle, name)
#endif

Data Structure Design

1. Linked List Implementation

/**
 * Doubly linked list implementation
 * Provides efficient insertion and deletion
 */

typedef struct list_node {
    void *data;
    struct list_node *next;
    struct list_node *prev;
} list_node_t;

typedef struct {
    list_node_t head;
    size_t size;
    void (*destructor)(void*);
} list_t;

// Doubly linked list operations
void list_insert_after(list_t *list, list_node_t *node, void *data) {
    list_node_t *new_node = malloc(sizeof(list_node_t));
    if (!new_node) return;  // Allocation failed; leave the list unchanged
    new_node->data = data;
    new_node->next = node->next;
    new_node->prev = node;
    
    if (node->next) node->next->prev = new_node;
    node->next = new_node;
    list->size++;
}

2. Hash Table Implementation

/**
 * Hash table implementation
 * Provides fast key-value storage and retrieval
 */

typedef struct hash_entry {
    char *key;
    void *value;
    struct hash_entry *next;
} hash_entry_t;

typedef struct {
    hash_entry_t **buckets;
    size_t bucket_count;
    size_t size;
    unsigned int (*hash_func)(const char*);
} hash_table_t;

unsigned int djb2_hash(const char *str) {
    unsigned int hash = 5381;
    int c;
    while ((c = *str++)) hash = ((hash << 5) + hash) + c;
    return hash;
}

3. Ring Buffer

/**
 * Ring buffer implementation
 * Suitable for producer-consumer patterns
 */

typedef struct {
    char *buffer;
    size_t size;
    size_t read_pos;
    size_t write_pos;
    int full;
} ring_buffer_t;

int ring_buffer_write(ring_buffer_t *rb, const char *data, size_t len) {
    size_t available = rb->size - ring_buffer_size(rb);
    if (len > available) return -1;
    
    for (size_t i = 0; i < len; i++) {
        rb->buffer[rb->write_pos] = data[i];
        rb->write_pos = (rb->write_pos + 1) % rb->size;
        if (rb->write_pos == rb->read_pos) rb->full = 1;
    }
    return len;
}

Concurrency and Multithreading

1. Thread-Safe Data Structures

/**
 * Thread-safe counter
 * Provides atomic operations and conditional waiting
 */

#include <pthread.h>

typedef struct {
    int value;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
} thread_safe_counter_t;

void counter_increment(thread_safe_counter_t *counter) {
    pthread_mutex_lock(&counter->mutex);
    counter->value++;
    // Broadcast: waiters may be waiting for different targets, so wake them all
    pthread_cond_broadcast(&counter->cond);
    pthread_mutex_unlock(&counter->mutex);
}

int counter_wait_for(thread_safe_counter_t *counter, int target) {
    pthread_mutex_lock(&counter->mutex);
    while (counter->value < target) {
        pthread_cond_wait(&counter->cond, &counter->mutex);
    }
    int value = counter->value;  // Read while still holding the lock
    pthread_mutex_unlock(&counter->mutex);
    return value;
}

2. Read-Write Lock Implementation

/**
 * Read-write lock implementation
 * Supports multiple readers/single writer concurrency control
 */

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t read_cond;
    pthread_cond_t write_cond;
    int readers;
    int writers;
    int waiting_writers;
} rwlock_t;

void rwlock_rdlock(rwlock_t *rwlock) {
    pthread_mutex_lock(&rwlock->mutex);
    while (rwlock->writers > 0 || rwlock->waiting_writers > 0) {
        pthread_cond_wait(&rwlock->read_cond, &rwlock->mutex);
    }
    rwlock->readers++;
    pthread_mutex_unlock(&rwlock->mutex);
}

3. Lock-Free Programming

/**
 * Lock-free programming tools
 * Implements high-performance concurrency using atomic operations
 */

#include <stdatomic.h>

typedef struct {
    atomic_int value;
} atomic_counter_t;

void atomic_counter_increment(atomic_counter_t *counter) {
    atomic_fetch_add(&counter->value, 1);
}

int atomic_counter_get(atomic_counter_t *counter) {
    return atomic_load(&counter->value);
}

Error Handling and Exception Mechanisms

1. Error Code System

/**
 * Error code system
 * Provides structured error handling mechanism
 */

typedef enum {
    ERROR_SUCCESS = 0,
    ERROR_INVALID_PARAM = -1,
    ERROR_OUT_OF_MEMORY = -2,
    ERROR_FILE_NOT_FOUND = -3,
    ERROR_PERMISSION_DENIED = -4
} error_code_t;

#define RETURN_ON_ERROR(expr) do { \
    error_code_t err = (expr); \
    if (err != ERROR_SUCCESS) return err; \
} while(0)

// Context-aware error handling
typedef struct {
    error_code_t code;
    const char *message;
    const char *file;
    int line;
} error_context_t;

2. Exception Simulation Mechanism

/**
 * Exception simulation mechanism
 * Implements exception handling using setjmp/longjmp
 */

#include <setjmp.h>

typedef struct {
    jmp_buf jump_buffer;
    int error_code;
    const char *error_message;
} exception_context_t;

static __thread exception_context_t *current_exception = NULL;

#define TRY \
    do { \
        exception_context_t __exception_ctx; \
        __exception_ctx.error_code = 0; \
        if (setjmp(__exception_ctx.jump_buffer) == 0) { \
            current_exception = &__exception_ctx;

#define CATCH(error_var) \
        } else { \
            error_var = current_exception->error_code;

#define END_TRY \
        } \
        current_exception = NULL; \
    } while(0);

#define THROW(code, message) \
    do { \
        if (current_exception) { \
            current_exception->error_code = code; \
            current_exception->error_message = message; \
            longjmp(current_exception->jump_buffer, 1); \
        } \
    } while(0)

3. Resource Management (RAII)

/**
 * RAII resource management
 * Ensures automatic resource cleanup
 */

typedef struct {
    void *resource;
    void (*cleanup)(void*);
} raii_guard_t;

#define RAII_VAR(type, name, init, cleanup_func) \
    type name = init; \
    raii_guard_t __guard_##name = {&name, (void(*)(void*))cleanup_func}; \
    __attribute__((cleanup(raii_cleanup))) raii_guard_t *__raii_##name = &__guard_##name;

static void raii_cleanup(raii_guard_t **guard) {
    if ((*guard)->resource && (*guard)->cleanup) {
        (*guard)->cleanup((*guard)->resource);
    }
}

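Performance Optimization Techniques

1. Cache-Friendly Data Structures

/**
 * Cache-friendly data structures
 * Optimizes memory layout for better cache hit rates
 */

// Struct packing: removes padding to shrink the struct, but members may end up
// misaligned, which is slower (or faults) on some CPUs
struct __attribute__((packed)) packed_struct {
    char a;
    int b;
    short c;
};

// Cache line alignment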
#define CACHE_LINE_SIZE 64
struct __attribute__((aligned(CACHE_LINE_SIZE))) cache_aligned_struct {
    int data[16];
};

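2. Branch Prediction Optimization

/**
 * Branch prediction optimization
 * Uses compiler hints to improve execution efficiency
 */

// Static branch prediction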
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

void optimized_function(int *array, size_t size) {
    if (unlikely(size == 0)) return;
    
    for (size_t i = 0; likely(i < size); i++) {
        process_element(array[i]);
    }
}

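3. Inline Assembly Optimization

/**
 * Inline assembly optimization
 * Directly uses CPU instructions for performance gains
 */

// Read Time-Stamp Counter (x86/x86-64 only; requires <stdint.h>)
static inline uint64_t rdtsc(void) {
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

// Compiler barrier: stops the compiler from reordering memory accesses across
// this point; it emits no CPU fence instruction
#define MEMORY_BARRIER() __asm__ __volatile__("" ::: "memory")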

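4. SIMD Optimization

/**
 * SIMD optimization
 * Leverages vector instructions for parallel data processing
 */

#ifdef __SSE2__
#include <emmintrin.h>

void vector_add(float *a, float *b, float *result, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        // Unaligned loads/stores: the float* arguments are not guaranteed to be 16-byte aligned
        __m128 va = _mm_loadu_ps(&a[i]);
        __m128 vb = _mm_loadu_ps(&b[i]);
        __m128 vr = _mm_add_ps(va, vb);
        _mm_storeu_ps(&result[i], vr);
    }
    // Process remaining elements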
    for (; i < n; i++) {
        result[i] = a[i] + b[i];
    }
}
#endif

Debugging and Testing Techniques

1. Debug Macros

/**
 * Debugging utility macros
 * Provides convenient debugging and profiling capabilities
 */

#ifdef DEBUG
    #define DBG_PRINT(fmt, ...) \
        fprintf(stderr, "[DEBUG] %s:%d: " fmt "\n", __FILE__, __LINE__, ##__VA_ARGS__)
    #define ASSERT(condition) \
        do { \
            if (!(condition)) { \
                fprintf(stderr, "Assertion failed: %s at %s:%d\n", \
                        #condition, __FILE__, __LINE__); \
                abort(); \
            } \
        } while(0)
#else
    #define DBG_PRINT(fmt, ...) do {} while(0)
    #define ASSERT(condition) do {} while(0)
#endif

// Performance timing (requires <time.h>; clock() measures CPU time, not wall-clock time)
#define TIME_IT(code, result_var) \
    do { \
        clock_t start = clock(); \
        code; \
        result_var = ((double)(clock() - start)) / CLOCKS_PER_SEC; \
    } while(0)

2. Unit Testing Framework

/**
 * Unit testing framework
 * Provides structured testing support
 */

typedef struct {
    const char *name;
    void (*test_func)(void);
    int passed;
    int failed;
} test_case_t;

#define TEST_CASE(name) \
    static void test_##name(void); \
    static test_case_t test_case_##name = {#name, test_##name, 0, 0}; \
    static void test_##name(void)

#define ASSERT_EQ(expected, actual) \
    do { \
        if ((expected) != (actual)) { \
            fprintf(stderr, "Assertion failed: %s != %s at %s:%d\n", \
                    #expected, #actual, __FILE__, __LINE__); \
            current_test->failed++; \
        } else { \
            current_test->passed++; \
        } \
    } while(0)

3. Memory Leak Detection

/**
 * Memory leak detection
 * Tracks memory allocations and deallocations
 */

#ifdef DEBUG_MEMORY
static size_t total_allocated = 0;
static size_t allocation_count = 0;

void* debug_malloc(size_t size, const char *file, int line) {
    void *ptr = malloc(size + sizeof(size_t));
    if (ptr) {
        *(size_t*)ptr = size;
        total_allocated += size;
        allocation_count++;
        printf("ALLOC: %zu bytes at %s:%d\n", size, file, line);
        return (char*)ptr + sizeof(size_t);
    }
    return NULL;
}

void debug_free(void *ptr, const char *file, int line) {
    if (ptr) {
        size_t *size_ptr = (size_t*)((char*)ptr - sizeof(size_t));
        total_allocated -= *size_ptr;
        allocation_count--;
        printf("FREE: %zu bytes at %s:%d\n", *size_ptr, file, line);
        free(size_ptr);
    }
}

#define malloc(size) debug_malloc(size, __FILE__, __LINE__)
#define free(ptr) debug_free(ptr, __FILE__, __LINE__)
#endif

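Cross-platform programming

1. Platform abstraction layer

/**
 * Platform abstraction layer
 * Provides a uniform cross-platform interface
 */

// Thread abstraction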
#ifdef _WIN32
    #include <windows.h>
    typedef HANDLE thread_t;
    typedef CRITICAL_SECTION mutex_t;
    #define THREAD_CREATE(thread, func, arg) \
        (thread = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)func, arg, 0, NULL))
    #define THREAD_JOIN(thread) WaitForSingleObject(thread, INFINITE)
    #define MUTEX_INIT(mutex) InitializeCriticalSection(mutex)
    #define MUTEX_LOCK(mutex) EnterCriticalSection(mutex)
    #define MUTEX_UNLOCK(mutex) LeaveCriticalSection(mutex)
#else
    #include <pthread.h>
    typedef pthread_t thread_t;
    typedef pthread_mutex_t mutex_t;
    #define THREAD_CREATE(thread, func, arg) pthread_create(&thread, NULL, func, arg)
    #define THREAD_JOIN(thread) pthread_join(thread, NULL)
    #define MUTEX_INIT(mutex) pthread_mutex_init(mutex, NULL)
    #define MUTEX_LOCK(mutex) pthread_mutex_lock(mutex)
    #define MUTEX_UNLOCK(mutex) pthread_mutex_unlock(mutex)
#endif

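2. File path handling

/**
 * File path handling
 * Provides cross-platform path operations
 */

#include <stdlib.h>
#include <string.h>

#ifdef _WIN32
    #define PATH_SEPARATOR '\\'
    #define PATH_SEPARATOR_STR "\\"
#else
    #define PATH_SEPARATOR '/'
    #define PATH_SEPARATOR_STR "/"
#endif

// Joins dir and file with exactly one separator.
// Returns a heap-allocated string the caller must free, or NULL on failure.
char* join_path(const char *dir, const char *file) {
    if (!dir || !file) return NULL;

    size_t dir_len = strlen(dir);
    size_t file_len = strlen(file);
    // +2: room for one separator and the terminating NUL
    char *result = malloc(dir_len + file_len + 2);
    if (!result) return NULL;

    strcpy(result, dir);
    // Add a separator unless dir is empty or already ends with one
    if (dir_len > 0 && dir[dir_len - 1] != PATH_SEPARATOR) {
        strcat(result, PATH_SEPARATOR_STR);
    }
    strcat(result, file);
    return result;
}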

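3. Byte order handling

/**
 * Byte order handling
 * Keeps multi-byte values correct when they are sent over the network
 */

// Network byte order conversion.
// __BYTE_ORDER__ is predefined by GCC and Clang; on compilers without it this
// falls back to assuming a little-endian host.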
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    #define IS_BIG_ENDIAN 1
#else
    #define IS_BIG_ENDIAN 0
#endif

static inline uint32_t swap_endian_32(uint32_t val) {
    return ((val & 0x000000FF) << 24) |
           ((val & 0x0000FF00) << 8)  |
           ((val & 0x00FF0000) >> 8)  |
           ((val & 0xFF000000) >> 24);
}

#define hton32(x) (IS_BIG_ENDIAN ? (x) : swap_endian_32(x))
#define ntoh32(x) hton32(x)

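Secure programming practices

1. Buffer overflow protection

/**
 * Buffer overflow protection
 * Provides safe string helpers
 */

// Safe string copy: copies at most count characters of src, never writes more
// than dest_size bytes, and always NUL-terminates dest.
// Returns the number of characters copied (excluding the terminator).
size_t safe_strncpy(char *dest, size_t dest_size, const char *src, size_t count) {
    if (!dest || dest_size == 0) return 0;
    if (!src) {
        dest[0] = '\0';
        return 0;
    }

    size_t limit = (count < dest_size - 1) ? count : dest_size - 1;
    size_t copy_len = 0;
    // Stop at the end of src so that we never read past its terminator
    while (copy_len < limit && src[copy_len] != '\0') {
        dest[copy_len] = src[copy_len];
        copy_len++;
    }
    dest[copy_len] = '\0';
    return copy_len;
}

// Checked formatted output.
// Note: ##__VA_ARGS__ is a GNU extension supported by GCC and Clang.
#define SAFE_PRINTF(buffer, size, format, ...) \
    do { \
        int safe_printf_result_ = snprintf(buffer, size, format, ##__VA_ARGS__); \
        if ((size) > 0 && (safe_printf_result_ < 0 || (size_t)safe_printf_result_ >= (size))) { \
            /* Error or truncation: make sure the buffer is terminated */ \
            (buffer)[(size) - 1] = '\0'; \
        } \
    } while(0)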

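2. Input validation

/**
 * Input validation
 * Guards against security problems caused by malicious input
 */

#include <limits.h>  // INT_MAX, INT_MIN

// Integer overflow check: stores a + b in *result and returns 0,
// or returns -1 without touching *result if the sum would overflow
static inline int safe_add(int a, int b, int *result) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return -1; // overflow
    }
    *result = a + b;
    return 0;
}

// Pointer validation. The macro returns from the enclosing function, so it is
// only usable in functions whose return type can carry ERROR_INVALID_PARAM
// (an error code the surrounding project is expected to define).
#define VALIDATE_PTR(ptr) \
    do { \
        if (!(ptr)) { \
            return ERROR_INVALID_PARAM; \
        } \
    } while(0)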
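
As an illustration of how safe_add might be used, the sketch below sums an array and stops at the first overflow. safe_sum is a hypothetical helper introduced only for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Same check as in the text, repeated so the sketch compiles on its own. */
static inline int safe_add(int a, int b, int *result) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return -1;
    }
    *result = a + b;
    return 0;
}

/* Hypothetical helper: sums values[0..count-1] into *total.
 * Returns 0 on success, or -1 on overflow, leaving *total untouched. */
static int safe_sum(const int *values, size_t count, int *total) {
    assert(values != NULL && total != NULL);
    int acc = 0;
    for (size_t i = 0; i < count; i++) {
        if (safe_add(acc, values[i], &acc) != 0) {
            return -1;
        }
    }
    *total = acc;
    return 0;
}
```

The overflow test runs before the addition, which matters because signed overflow itself is undefined behavior in C and cannot be detected reliably after the fact.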

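3. Secure random numbers

/**
 * Secure random number generation
 * Cryptographically secure random bytes where the platform supports them
 */

#include <time.h>
#include <stdlib.h>

// Cryptographically secure random bytes (needs platform support)
#ifdef __linux__
    #include <errno.h>
    #include <sys/random.h>
    int secure_random_bytes(void *buf, size_t len) {
        unsigned char *p = buf;
        // getrandom may return fewer bytes than requested or be interrupted
        // by a signal, so loop until the buffer is full
        while (len > 0) {
            ssize_t n = getrandom(p, len, 0);
            if (n < 0) {
                if (errno == EINTR) continue;
                return -1;
            }
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }
#else
    // WARNING: this fallback is a plain linear congruential generator.
    // It is NOT cryptographically secure and must not be used for keys,
    // tokens, or anything else an attacker should not be able to predict.
    static unsigned long long rand_state = 1;
    
    void srand64(unsigned long long seed) {
        rand_state = seed;
    }
    
    unsigned long long rand64(void) {
        rand_state = rand_state * 6364136223846793005ULL + 1;
        return rand_state;
    }
#endif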

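Comprehensive demo examples

Event-driven architecture demo

/**
 * Event-driven architecture - event type definitions
 * Used to build a flexible event handling system
 */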
typedef enum {
    EVENT_NONE = 0,
    EVENT_TIMER,
    EVENT_NETWORK,
    EVENT_USER,
    EVENT_SYSTEM
} event_type_t;

/**
 * 事件结构体
 * 包含事件类型、时间戳和用户数据
 */
typedef struct {
    event_type_t type;
    uint64_t timestamp;
    void *data;
    size_t data_size;
} event_t;

/**
 * 事件处理器函数指针类型
 * @param event 事件指针
 * @param context 用户上下文
 * @return 处理结果
 */
typedef int (*event_handler_t)(event_t *event, void *context);

/**
 * 事件监听器结构体
 * 存储事件类型和对应的处理器
 */
typedef struct {
    event_type_t type;
    event_handler_t handler;
    void *context;
    int priority;  // 处理优先级
} event_listener_t;

/**
 * 事件循环结构体
 * 管理事件队列和监听器
 */
typedef struct {
    event_listener_t *listeners;
    size_t listener_count;
    size_t listener_capacity;
    
    event_t *event_queue;
    size_t queue_head;
    size_t queue_tail;
    size_t queue_size;
    size_t queue_capacity;
    
    int running;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
} event_loop_t;

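/**
 * Create an event loop
 * @param queue_capacity number of slots in the event queue (at least 2;
 *        one slot always stays empty, so capacity - 1 events can be queued)
 * @return event loop pointer, or NULL on failure
 */
event_loop_t* event_loop_create(size_t queue_capacity) {
    if (queue_capacity < 2) return NULL;
    
    event_loop_t *loop = calloc(1, sizeof(event_loop_t));
    if (!loop) return NULL;
    
    loop->queue_capacity = queue_capacity;
    loop->event_queue = calloc(queue_capacity, sizeof(event_t));
    if (!loop->event_queue) {
        free(loop);
        return NULL;
    }
    
    loop->listeners = calloc(16, sizeof(event_listener_t));  // initial listener capacity
    if (!loop->listeners) {
        free(loop->event_queue);
        free(loop);
        return NULL;
    }
    loop->listener_capacity = 16;
    
    pthread_mutex_init(&loop->mutex, NULL);
    pthread_cond_init(&loop->cond, NULL);
    
    return loop;
}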

/**
 * 添加事件监听器
 * @param loop 事件循环
 * @param type 事件类型
 * @param handler 事件处理器
 * @param context 用户上下文
 * @param priority 处理优先级
 * @return 0成功,-1失败
 */
int event_loop_add_listener(event_loop_t *loop, event_type_t type,
                           event_handler_t handler, void *context, int priority) {
    if (!loop || !handler) return -1;
    
    pthread_mutex_lock(&loop->mutex);
    
    // 扩展监听器数组
    if (loop->listener_count >= loop->listener_capacity) {
        size_t new_capacity = loop->listener_capacity * 2;
        event_listener_t *new_listeners = realloc(loop->listeners, 
                                                 new_capacity * sizeof(event_listener_t));
        if (!new_listeners) {
            pthread_mutex_unlock(&loop->mutex);
            return -1;
        }
        loop->listeners = new_listeners;
        loop->listener_capacity = new_capacity;
    }
    
    // 添加新监听器
    event_listener_t *listener = &loop->listeners[loop->listener_count++];
    listener->type = type;
    listener->handler = handler;
    listener->context = context;
    listener->priority = priority;
    
    pthread_mutex_unlock(&loop->mutex);
    return 0;
}

/**
 * 发布事件
 * @param loop 事件循环
 * @param event 事件指针
 * @return 0成功,-1失败
 */
int event_loop_post(event_loop_t *loop, event_t *event) {
    if (!loop || !event) return -1;
    
    pthread_mutex_lock(&loop->mutex);
    
    // 检查队列是否已满
    if ((loop->queue_tail + 1) % loop->queue_capacity == loop->queue_head) {
        pthread_mutex_unlock(&loop->mutex);
        return -1;  // 队列已满
    }
    
    // 复制事件到队列
    event_t *queue_event = &loop->event_queue[loop->queue_tail];
    queue_event->type = event->type;
    queue_event->timestamp = event->timestamp;
    
    if (event->data && event->data_size > 0) {
        queue_event->data = malloc(event->data_size);
        if (queue_event->data) {
            memcpy(queue_event->data, event->data, event->data_size);
            queue_event->data_size = event->data_size;
        } else {
            queue_event->data_size = 0;
        }
    } else {
        queue_event->data = NULL;
        queue_event->data_size = 0;
    }
    
    loop->queue_tail = (loop->queue_tail + 1) % loop->queue_capacity;
    
    pthread_cond_signal(&loop->cond);
    pthread_mutex_unlock(&loop->mutex);
    
    return 0;
}

/**
 * Event loop main function
 * Listeners are read without the lock while dispatching, so register all
 * listeners before calling this function.
 * @param loop event loop
 */
void event_loop_run(event_loop_t *loop) {
    if (!loop) return;
    
    pthread_mutex_lock(&loop->mutex);
    loop->running = 1;
    pthread_mutex_unlock(&loop->mutex);
    
    for (;;) {
        pthread_mutex_lock(&loop->mutex);
        
        // Wait for an event (running is only read while holding the mutex)
        while (loop->queue_head == loop->queue_tail && loop->running) {
            pthread_cond_wait(&loop->cond, &loop->mutex);
        }
        
        if (!loop->running) {
            pthread_mutex_unlock(&loop->mutex);
            break;
        }
        
        // Take the next event
        event_t event = loop->event_queue[loop->queue_head];
        loop->queue_head = (loop->queue_head + 1) % loop->queue_capacity;
        
        pthread_mutex_unlock(&loop->mutex);
        
        // Dispatch to matching listeners
        for (size_t i = 0; i < loop->listener_count; i++) {
            if (loop->listeners[i].type == event.type || 
                loop->listeners[i].type == EVENT_NONE) {  // EVENT_NONE listens to every event
                loop->listeners[i].handler(&event, loop->listeners[i].context);
            }
        }
        
        // Release the event payload (free(NULL) is a no-op)
        free(event.data);
    }
}

/**
 * 停止事件循环
 * @param loop 事件循环
 */
void event_loop_stop(event_loop_t *loop) {
    if (!loop) return;
    
    pthread_mutex_lock(&loop->mutex);
    loop->running = 0;
    pthread_cond_signal(&loop->cond);
    pthread_mutex_unlock(&loop->mutex);
}

/**
 * 销毁事件循环
 * @param loop 事件循环
 */
void event_loop_destroy(event_loop_t *loop) {
    if (!loop) return;
    
    event_loop_stop(loop);
    
    if (loop->listeners) {
        free(loop->listeners);
    }
    
    // 清理事件队列中剩余的事件
    while (loop->queue_head != loop->queue_tail) {
        event_t *event = &loop->event_queue[loop->queue_head];
        if (event->data) {
            free(event->data);
        }
        loop->queue_head = (loop->queue_head + 1) % loop->queue_capacity;
    }
    
    if (loop->event_queue) {
        free(loop->event_queue);
    }
    
    pthread_mutex_destroy(&loop->mutex);
    pthread_cond_destroy(&loop->cond);
    
    free(loop);
}
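
The queue in event_loop_post treats head == tail as empty and (tail + 1) % capacity == head as full, which means one slot always stays unused. The index arithmetic can be sketched in isolation as follows; ring_index_t and the ring_ functions are hypothetical names, and the sketch tracks indices only, not event data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the queue indices. */
typedef struct {
    size_t head;      /* next slot to read */
    size_t tail;      /* next slot to write */
    size_t capacity;  /* number of slots, must be >= 2 */
} ring_index_t;

static int ring_is_empty(const ring_index_t *r) {
    return r->head == r->tail;
}

static int ring_is_full(const ring_index_t *r) {
    return (r->tail + 1) % r->capacity == r->head;
}

/* Number of occupied slots. */
static size_t ring_count(const ring_index_t *r) {
    return (r->tail + r->capacity - r->head) % r->capacity;
}

/* Claims the next write slot; returns 0 and its index in *slot, or -1 when full. */
static int ring_push(ring_index_t *r, size_t *slot) {
    assert(r != NULL && slot != NULL && r->capacity >= 2);
    if (ring_is_full(r)) return -1;
    *slot = r->tail;
    r->tail = (r->tail + 1) % r->capacity;
    return 0;
}

/* Releases the oldest slot; returns 0 and its index in *slot, or -1 when empty. */
static int ring_pop(ring_index_t *r, size_t *slot) {
    assert(r != NULL && slot != NULL && r->capacity >= 2);
    if (ring_is_empty(r)) return -1;
    *slot = r->head;
    r->head = (r->head + 1) % r->capacity;
    return 0;
}
```

With this convention a queue created with capacity 100 holds at most 99 pending events. Keeping a separate element count would allow all slots to be used, at the cost of one more field to keep consistent.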

Lock-free queue implementation

/**
 * Lock-free queue implementation
 * A thread-safe queue built on compare-and-swap (CAS) operations
 */

#include <stdatomic.h>

/**
 * 队列节点结构
 */
typedef struct queue_node {
    void *data;
    _Atomic(struct queue_node*) next;
} queue_node_t;

/**
 * 无锁队列结构
 */
typedef struct {
    _Atomic(queue_node_t*) head;
    _Atomic(queue_node_t*) tail;
    atomic_size_t size;
} lockfree_queue_t;

/**
 * 创建无锁队列节点
 * @param data 节点数据
 * @return 节点指针
 */
static queue_node_t* create_queue_node(void *data) {
    queue_node_t *node = malloc(sizeof(queue_node_t));
    if (node) {
        node->data = data;
        atomic_init(&node->next, NULL);
    }
    return node;
}

/**
 * 创建无锁队列
 * @return 队列指针
 */
lockfree_queue_t* lockfree_queue_create() {
    lockfree_queue_t *queue = malloc(sizeof(lockfree_queue_t));
    if (!queue) return NULL;
    
    // 创建哨兵节点
    queue_node_t *dummy = create_queue_node(NULL);
    if (!dummy) {
        free(queue);
        return NULL;
    }
    
    atomic_init(&queue->head, dummy);
    atomic_init(&queue->tail, dummy);
    atomic_init(&queue->size, 0);
    
    return queue;
}

/**
 * 入队操作
 * @param queue 队列
 * @param data 数据
 * @return 0成功,-1失败
 */
int lockfree_queue_enqueue(lockfree_queue_t *queue, void *data) {
    if (!queue) return -1;
    
    queue_node_t *node = create_queue_node(data);
    if (!node) return -1;
    
    queue_node_t *prev_tail = NULL;
    queue_node_t *prev_tail_next = NULL;
    
    while (1) {
        prev_tail = atomic_load(&queue->tail);
        prev_tail_next = atomic_load(&prev_tail->next);
        
        // 检查tail是否一致
        if (prev_tail == atomic_load(&queue->tail)) {
            if (prev_tail_next == NULL) {
                // tail是最后一个节点,尝试链接新节点
                if (atomic_compare_exchange_weak(&prev_tail->next, &prev_tail_next, node)) {
                    break;  // 成功
                }
            } else {
                // tail不是最后一个节点,尝试推进tail
                atomic_compare_exchange_weak(&queue->tail, &prev_tail, prev_tail_next);
            }
        }
    }
    
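    // Advance tail. Use the strong CAS here: a spurious failure of the weak form
    // would leave tail lagging for no reason. A genuine failure is fine, because
    // it means another thread has already advanced tail for us.
    atomic_compare_exchange_strong(&queue->tail, &prev_tail, node);
    atomic_fetch_add(&queue->size, 1);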
    
    return 0;
}

/**
 * 出队操作
 * @param queue 队列
 * @param data 输出参数:出队数据
 * @return 0成功,-1队列为空
 */
int lockfree_queue_dequeue(lockfree_queue_t *queue, void **data) {
    if (!queue || !data) return -1;
    
    queue_node_t *head = NULL;
    queue_node_t *tail = NULL;
    queue_node_t *next = NULL;
    
    while (1) {
        head = atomic_load(&queue->head);
        tail = atomic_load(&queue->tail);
        next = atomic_load(&head->next);
        
        // 检查head是否一致
        if (head == atomic_load(&queue->head)) {
            if (head == tail) {
                // 队列为空或只有一个哨兵节点
                if (next == NULL) {
                    *data = NULL;
                    return -1;  // 队列为空
                }
                // 队列正在变化,推进tail
                atomic_compare_exchange_weak(&queue->tail, &tail, next);
            } else {
                // 读取数据
                *data = next->data;
                // 尝试推进head
                if (atomic_compare_exchange_weak(&queue->head, &head, next)) {
                    atomic_fetch_sub(&queue->size, 1);
                    break;
                }
            }
        }
    }
    
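    // Release the old head node. Caution: other threads (a concurrent dequeuer,
    // or an enqueuer holding a stale tail) may still dereference this node, so
    // freeing it immediately can be a use-after-free under contention.
    // Production code needs a safe memory reclamation scheme such as hazard
    // pointers or epoch-based reclamation.
    free(head);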
    return 0;
}

/**
 * 获取队列大小
 * @param queue 队列
 * @return 队列大小
 */
size_t lockfree_queue_size(lockfree_queue_t *queue) {
    return atomic_load(&queue->size);
}

/**
 * 销毁无锁队列
 * @param queue 队列
 */
void lockfree_queue_destroy(lockfree_queue_t *queue) {
    if (!queue) return;
    
    // 清空队列
    void *data;
    while (lockfree_queue_dequeue(queue, &data) == 0) {
        // 数据由调用者负责释放
    }
    
    // 释放哨兵节点
    queue_node_t *head = atomic_load(&queue->head);
    if (head) {
        free(head);
    }
    
    free(queue);
}

Cache-friendly data structures

/**
 * Cache-friendly data structure implementations
 * Optimize memory layout to improve the cache hit rate
 */

/**
 * 缓存行大小定义
 */
#define CACHE_LINE_SIZE 64

/**
 * 缓存对齐宏
 */
#define CACHE_ALIGNED __attribute__((aligned(CACHE_LINE_SIZE)))

/**
 * SoA (Structure of Arrays) 向量结构
 * 将相关数据分离存储以提高缓存效率
 */
typedef struct {
    float *x;           // X坐标数组
    float *y;           // Y坐标数组
    float *z;           // Z坐标数组
    int *id;            // ID数组
    size_t capacity;    // 容量
    size_t size;        // 当前大小
    char padding[CACHE_LINE_SIZE - sizeof(size_t)*2 - sizeof(char*)*4]; // 填充到缓存行边界
} soa_vector_t;

/**
 * AoS (Array of Structures) 向量结构
 * 传统结构体数组方式
 */
typedef struct {
    float x, y, z;
    int id;
} aos_point_t;

typedef struct {
    aos_point_t *points;
    size_t capacity;
    size_t size;
    char padding[CACHE_LINE_SIZE - sizeof(size_t)*2 - sizeof(void*)]; // 填充
} aos_vector_t;

// Forward declaration: soa_vector_create uses it on its error path,
// but the definition appears further down.
void soa_vector_destroy(soa_vector_t *vec);

/**
 * Create an SoA vector
 * @param initial_capacity initial capacity
 * @return SoA vector pointer, or NULL on failure
 */
soa_vector_t* soa_vector_create(size_t initial_capacity) {
    soa_vector_t *vec = calloc(1, sizeof(soa_vector_t));
    if (!vec) return NULL;
    
    vec->capacity = initial_capacity;
    vec->x = malloc(sizeof(float) * initial_capacity);
    vec->y = malloc(sizeof(float) * initial_capacity);
    vec->z = malloc(sizeof(float) * initial_capacity);
    vec->id = malloc(sizeof(int) * initial_capacity);
    
    if (!vec->x || !vec->y || !vec->z || !vec->id) {
        soa_vector_destroy(vec);
        return NULL;
    }
    
    return vec;
}

/**
 * 创建AoS向量
 * @param initial_capacity 初始容量
 * @return AoS向量指针
 */
aos_vector_t* aos_vector_create(size_t initial_capacity) {
    aos_vector_t *vec = calloc(1, sizeof(aos_vector_t));
    if (!vec) return NULL;
    
    vec->capacity = initial_capacity;
    vec->points = malloc(sizeof(aos_point_t) * initial_capacity);
    if (!vec->points) {
        free(vec);
        return NULL;
    }
    
    return vec;
}

/**
 * SoA向量添加元素
 * @param vec SoA向量
 * @param x X坐标
 * @param y Y坐标
 * @param z Z坐标
 * @param id ID
 * @return 0成功,-1失败
 */
int soa_vector_push(soa_vector_t *vec, float x, float y, float z, int id) {
    if (!vec) return -1;
    
    // Grow if needed
    if (vec->size >= vec->capacity) {
        size_t new_capacity = vec->capacity ? vec->capacity * 2 : 16;
        void *tmp;
        
        // Store each pointer as soon as its realloc succeeds: realloc may move
        // the block, so the old pointer must not be kept if a later call fails.
        // On failure capacity is left unchanged, which is still consistent.
        if (!(tmp = realloc(vec->x, sizeof(float) * new_capacity))) return -1;
        vec->x = tmp;
        if (!(tmp = realloc(vec->y, sizeof(float) * new_capacity))) return -1;
        vec->y = tmp;
        if (!(tmp = realloc(vec->z, sizeof(float) * new_capacity))) return -1;
        vec->z = tmp;
        if (!(tmp = realloc(vec->id, sizeof(int) * new_capacity))) return -1;
        vec->id = tmp;
        
        vec->capacity = new_capacity;
    }
    
    size_t index = vec->size++;
    vec->x[index] = x;
    vec->y[index] = y;
    vec->z[index] = z;
    vec->id[index] = id;
    
    return 0;
}

/**
 * AoS向量添加元素
 * @param vec AoS向量
 * @param x X坐标
 * @param y Y坐标
 * @param z Z坐标
 * @param id ID
 * @return 0成功,-1失败
 */
int aos_vector_push(aos_vector_t *vec, float x, float y, float z, int id) {
    if (!vec) return -1;
    
    // 检查是否需要扩容
    if (vec->size >= vec->capacity) {
        size_t new_capacity = vec->capacity * 2;
        aos_point_t *new_points = realloc(vec->points, sizeof(aos_point_t) * new_capacity);
        if (!new_points) return -1;
        
        vec->points = new_points;
        vec->capacity = new_capacity;
    }
    
    aos_point_t *point = &vec->points[vec->size++];
    point->x = x;
    point->y = y;
    point->z = z;
    point->id = id;
    
    return 0;
}

/**
 * SoA向量批量处理(缓存友好)
 * @param vec SoA向量
 * @param processor 处理函数
 * @param context 用户上下文
 */
void soa_vector_process(soa_vector_t *vec, 
                       void (*processor)(float x, float y, float z, int id, void *context),
                       void *context) {
    if (!vec || !processor) return;
    
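    // Note: this loop reads all four arrays for every element, so it gains little
    // from the SoA layout; SoA pays off when a pass touches only some of the fields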
    for (size_t i = 0; i < vec->size; i++) {
        processor(vec->x[i], vec->y[i], vec->z[i], vec->id[i], context);
    }
}

/**
 * AoS向量批量处理
 * @param vec AoS向量
 * @param processor 处理函数
 * @param context 用户上下文
 */
void aos_vector_process(aos_vector_t *vec,
                       void (*processor)(float x, float y, float z, int id, void *context),
                       void *context) {
    if (!vec || !processor) return;
    
    // 处理结构体数组
    for (size_t i = 0; i < vec->size; i++) {
        aos_point_t *point = &vec->points[i];
        processor(point->x, point->y, point->z, point->id, context);
    }
}

/**
 * 销毁SoA向量
 * @param vec SoA向量
 */
void soa_vector_destroy(soa_vector_t *vec) {
    if (!vec) return;
    
    if (vec->x) free(vec->x);
    if (vec->y) free(vec->y);
    if (vec->z) free(vec->z);
    if (vec->id) free(vec->id);
    free(vec);
}

/**
 * 销毁AoS向量
 * @param vec AoS向量
 */
void aos_vector_destroy(aos_vector_t *vec) {
    if (!vec) return;
    
    if (vec->points) free(vec->points);
    free(vec);
}

/**
 * 性能测试结构
 */
typedef struct {
    double soa_time;
    double aos_time;
    size_t elements_processed;
} performance_result_t;
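
The layout difference matters most when a pass reads only one field. The sketch below shows such a pass for both layouts. It is not the test_performance function used by the demo, which the text does not define; sum_x_aos and sum_x_soa are hypothetical names, and the point type is repeated so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    float x, y, z;
    int id;
} aos_point_t;

/* AoS: consecutive x values are sizeof(aos_point_t) bytes apart, so y, z and id
 * travel through the cache even though the loop never reads them. */
static double sum_x_aos(const aos_point_t *points, size_t count) {
    assert(points != NULL || count == 0);
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += points[i].x;
    }
    return total;
}

/* SoA: the x values are contiguous, so every fetched cache line holds only
 * data the loop actually uses. */
static double sum_x_soa(const float *x, size_t count) {
    assert(x != NULL || count == 0);
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += x[i];
    }
    return total;
}
```

Both functions compute the same result; any speed difference comes from how many bytes must be moved per useful value, and it should be measured on the target machine rather than assumed.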

Safe string library

/**
 * Safe string library
 * String functions that guard against buffer overflows
 */

#include <string.h>
#include <stdio.h>
#include <stdarg.h>

/**
 * 安全字符串结构
 * 包含长度信息以防止溢出
 */
typedef struct {
    char *data;
    size_t length;
    size_t capacity;
    int is_secure;  // 是否启用安全检查
} secure_string_t;

/**
 * 创建安全字符串
 * @param initial_capacity 初始容量
 * @param enable_security 是否启用安全检查
 * @return 安全字符串指针
 */
secure_string_t* secure_string_create(size_t initial_capacity, int enable_security) {
    secure_string_t *str = calloc(1, sizeof(secure_string_t));
    if (!str) return NULL;
    
    str->data = malloc(initial_capacity + 1);  // +1 for null terminator
    if (!str->data) {
        free(str);
        return NULL;
    }
    
    str->data[0] = '\0';
    str->capacity = initial_capacity;
    str->is_secure = enable_security;
    
    return str;
}

/**
 * 从C字符串创建安全字符串
 * @param c_str C字符串
 * @param enable_security 是否启用安全检查
 * @return 安全字符串指针
 */
secure_string_t* secure_string_from_cstr(const char *c_str, int enable_security) {
    if (!c_str) return NULL;
    
    size_t len = strlen(c_str);
    secure_string_t *str = secure_string_create(len, enable_security);
    if (str) {
        strncpy(str->data, c_str, str->capacity);
        str->data[str->capacity] = '\0';
        str->length = strlen(str->data);
    }
    
    return str;
}

/**
 * 安全字符串追加
 * @param str 目标字符串
 * @param append_str 要追加的字符串
 * @return 0成功,-1失败
 */
int secure_string_append(secure_string_t *str, const char *append_str) {
    if (!str || !append_str) return -1;
    
    size_t append_len = strlen(append_str);
    size_t new_length = str->length + append_len;
    
    // Grow if the result would not fit (the buffer holds capacity characters plus the NUL)
    if (new_length > str->capacity) {
        if (str->is_secure) {
            // 安全模式:拒绝超出容量的操作
            return -1;
        } else {
            // 非安全模式:自动扩容
            size_t new_capacity = (new_length + 1) * 2;
            char *new_data = realloc(str->data, new_capacity + 1);
            if (!new_data) return -1;
            
            str->data = new_data;
            str->capacity = new_capacity;
        }
    }
    
    // 执行追加
    strncat(str->data, append_str, str->capacity - str->length);
    str->length = strlen(str->data);
    
    return 0;
}

/**
 * 安全格式化字符串
 * @param str 目标字符串
 * @param format 格式字符串
 * @param ... 可变参数
 * @return 写入的字符数,-1失败
 */
int secure_string_printf(secure_string_t *str, const char *format, ...) {
    if (!str || !format) return -1;
    
    va_list args;
    va_start(args, format);
    
    // 首先计算需要的空间
    va_list args_copy;
    va_copy(args_copy, args);
    int needed = vsnprintf(NULL, 0, format, args_copy);
    va_end(args_copy);
    
    if (needed < 0) {
        va_end(args);
        return -1;
    }
    
    // 检查容量
    if ((size_t)needed >= str->capacity - str->length) {
        if (str->is_secure) {
            va_end(args);
            return -1;  // 容量不足
        } else {
            // 自动扩容
            size_t new_capacity = str->length + needed + 1;
            char *new_data = realloc(str->data, new_capacity + 1);
            if (!new_data) {
                va_end(args);
                return -1;
            }
            
            str->data = new_data;
            str->capacity = new_capacity;
        }
    }
    
    // 执行格式化
    int written = vsnprintf(str->data + str->length, 
                           str->capacity - str->length, 
                           format, args);
    
    if (written >= 0) {
        str->length += written;
    }
    
    va_end(args);
    return written;
}

/**
 * 安全字符串比较
 * @param str1 字符串1
 * @param str2 字符串2
 * @return 比较结果
 */
int secure_string_compare(const secure_string_t *str1, const secure_string_t *str2) {
    if (!str1 && !str2) return 0;
    if (!str1) return -1;
    if (!str2) return 1;
    
    return strcmp(str1->data, str2->data);
}

/**
 * 获取C字符串
 * @param str 安全字符串
 * @return C字符串指针
 */
const char* secure_string_cstr(const secure_string_t *str) {
    return str ? str->data : NULL;
}

/**
 * 获取字符串长度
 * @param str 安全字符串
 * @return 字符串长度
 */
size_t secure_string_length(const secure_string_t *str) {
    return str ? str->length : 0;
}

/**
 * 清空字符串
 * @param str 安全字符串
 */
void secure_string_clear(secure_string_t *str) {
    if (str && str->data) {
        str->data[0] = '\0';
        str->length = 0;
    }
}

/**
 * 销毁安全字符串
 * @param str 安全字符串
 */
void secure_string_destroy(secure_string_t *str) {
    if (!str) return;
    
    if (str->data) {
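        // Wipe the buffer, including the terminator byte, before releasing it.
        // Note: a compiler may drop a memset that directly precedes free(); use
        // explicit_bzero or memset_s where available if the wipe must be guaranteed.
        memset(str->data, 0, str->capacity + 1);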
        free(str->data);
    }
    
    free(str);
}

/**
 * 安全字符串池
 * 管理多个安全字符串以提高性能
 */
typedef struct {
    secure_string_t **strings;
    size_t count;
    size_t capacity;
    pthread_mutex_t mutex;
} string_pool_t;

/**
 * 创建字符串池
 * @param initial_capacity 初始容量
 * @return 字符串池指针
 */
string_pool_t* string_pool_create(size_t initial_capacity) {
    string_pool_t *pool = calloc(1, sizeof(string_pool_t));
    if (!pool) return NULL;
    
    pool->strings = calloc(initial_capacity, sizeof(secure_string_t*));
    if (!pool->strings) {
        free(pool);
        return NULL;
    }
    
    pool->capacity = initial_capacity;
    pthread_mutex_init(&pool->mutex, NULL);
    
    return pool;
}

Demo functions

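/**
 * Thread entry point for the event loop.
 * pthread_create expects a void *(*)(void *); calling event_loop_run through a
 * cast to that type is undefined behavior, so wrap it instead.
 */
static void* event_loop_thread(void *arg) {
    event_loop_run(arg);
    return NULL;
}

/**
 * Event-driven architecture demo
 * (timer_handler, network_handler and user_handler are event_handler_t
 * callbacks that must be defined elsewhere.)
 */
void demo_event_driven_architecture() {
    printf("=== 事件驱动架构演示 ===\n");
    
    // Create the event loop
    event_loop_t *loop = event_loop_create(100);
    if (!loop) {
        printf("Failed to create event loop\n");
        return;
    }
    
    // Register event listeners
    event_loop_add_listener(loop, EVENT_TIMER, timer_handler, NULL, 0);
    event_loop_add_listener(loop, EVENT_NETWORK, network_handler, NULL, 0);
    event_loop_add_listener(loop, EVENT_USER, user_handler, NULL, 0);
    
    // Start the event loop thread
    pthread_t loop_thread;
    pthread_create(&loop_thread, NULL, event_loop_thread, loop);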
    
    // 发布一些测试事件
    for (int i = 0; i < 5; i++) {
        event_t event = {0};
        event.timestamp = time(NULL);
        
        // 发布不同类型的事件
        switch (i % 3) {
            case 0:
                event.type = EVENT_TIMER;
                printf("Posting timer event %d\n", i);
                break;
            case 1: {
                event.type = EVENT_NETWORK;
                const char *msg = "Hello Network!";
                event.data = strdup(msg);
                event.data_size = strlen(msg) + 1;  // include the NUL so handlers can treat the copy as a C string
                printf("Posting network event %d\n", i);
                break;
            }
            case 2:
                event.type = EVENT_USER;
                event.data = malloc(sizeof(int));
                *(int*)event.data = i;
                event.data_size = sizeof(int);
                printf("Posting user event %d\n", i);
                break;
        }
        
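        event_loop_post(loop, &event);
        // event_loop_post copies the payload, so release our own copy
        // (free(NULL) is a no-op for events without data)
        free(event.data);
        sleep(1);
    }
    
    // Give the loop time to drain the remaining events
    sleep(2);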
    
    // 停止并销毁事件循环
    event_loop_stop(loop);
    pthread_join(loop_thread, NULL);
    event_loop_destroy(loop);
    
    printf("=== 演示完成 ===\n\n");
}

/**
 * 无锁队列演示示例
 */
void demo_lockfree_queue() {
    printf("=== 无锁队列演示 ===\n");
    
    // 初始化
    queue = lockfree_queue_create();
    atomic_init(&items_produced, 0);
    atomic_init(&items_consumed, 0);
    
    if (!queue) {
        printf("Failed to create queue\n");
        return;
    }
    
    // 创建生产者和消费者线程
    pthread_t producers[NUM_PRODUCERS];
    pthread_t consumers[NUM_CONSUMERS];
    int producer_ids[NUM_PRODUCERS];
    int consumer_ids[NUM_CONSUMERS];
    
    // 启动生产者线程
    for (int i = 0; i < NUM_PRODUCERS; i++) {
        producer_ids[i] = i;
        pthread_create(&producers[i], NULL, producer_thread, &producer_ids[i]);
    }
    
    // 启动消费者线程
    for (int i = 0; i < NUM_CONSUMERS; i++) {
        consumer_ids[i] = i;
        pthread_create(&consumers[i], NULL, consumer_thread, &consumer_ids[i]);
    }
    
    // 等待所有线程完成
    for (int i = 0; i < NUM_PRODUCERS; i++) {
        pthread_join(producers[i], NULL);
    }
    
    for (int i = 0; i < NUM_CONSUMERS; i++) {
        pthread_join(consumers[i], NULL);
    }
    
    // 显示结果
    printf("Total produced: %d\n", atomic_load(&items_produced));
    printf("Total consumed: %d\n", atomic_load(&items_consumed));
    printf("Queue size: %zu\n", lockfree_queue_size(queue));
    
    // 清理
    lockfree_queue_destroy(queue);
    
    printf("=== 演示完成 ===\n\n");
}

/**
 * 缓存友好数据结构演示示例
 */
void demo_cache_friendly_structures() {
    printf("=== 缓存友好数据结构演示 ===\n");
    
    // 测试不同规模的数据
    size_t test_sizes[] = {1000, 10000, 100000, 1000000};
    int num_tests = sizeof(test_sizes) / sizeof(test_sizes[0]);
    
    printf("%-10s %-12s %-12s %-10s\n", "Elements", "SoA Time(s)", "AoS Time(s)", "Speedup");
    printf("------------------------------------------------\n");
    
    for (int i = 0; i < num_tests; i++) {
        performance_result_t result = test_performance(test_sizes[i]);
        
        double speedup = result.aos_time / result.soa_time;
        printf("%-10zu %-12.6f %-12.6f %-10.2fx\n", 
               result.elements_processed,
               result.soa_time,
               result.aos_time,
               speedup);
    }
    
    printf("=== 演示完成 ===\n\n");
}

/**
 * 安全字符串操作演示示例
 */
void demo_secure_strings() {
    printf("=== 安全字符串操作演示 ===\n");
    
    // 创建安全字符串(启用安全检查)
    secure_string_t *str1 = secure_string_create(20, 1);  // 安全模式
    secure_string_t *str2 = secure_string_from_cstr("Hello", 1);
    
    if (!str1 || !str2) {
        printf("Failed to create secure strings\n");
        return;
    }
    
    printf("Initial strings:\n");
    printf("str1: '%s' (length: %zu)\n", secure_string_cstr(str1), secure_string_length(str1));
    printf("str2: '%s' (length: %zu)\n", secure_string_cstr(str2), secure_string_length(str2));
    
    // 安全追加
    if (secure_string_append(str2, " World!") == 0) {
        printf("After append: '%s'\n", secure_string_cstr(str2));
    } else {
        printf("Append failed (security check)\n");
    }
    
    // 安全格式化
    if (secure_string_printf(str1, "Number: %d, String: %s", 42, "test") >= 0) {
        printf("Formatted string: '%s'\n", secure_string_cstr(str1));
    } else {
        printf("Format failed (security check)\n");
    }
    
    // 尝试超出容量的操作(在安全模式下会失败)
    printf("\nTesting security checks:\n");
    if (secure_string_append(str1, "This is a very long string that exceeds capacity") == -1) {
        printf("Security check prevented buffer overflow!\n");
    }
    
    // 字符串比较
    secure_string_t *str3 = secure_string_from_cstr("Hello World!", 1);
    printf("Comparison result: %d\n", secure_string_compare(str2, str3));
    
    // 清理
    secure_string_destroy(str1);
    secure_string_destroy(str2);
    secure_string_destroy(str3);
    
    printf("=== 演示完成 ===\n\n");
}

// 综合演示函数
void run_all_demos() {
    printf("C语言高级编程技巧演示\n");
    printf("=====================\n\n");
    
    // 运行所有演示
    demo_event_driven_architecture();
    demo_lockfree_queue();
    demo_cache_friendly_structures();
    demo_secure_strings();
    
    printf("所有演示完成!\n");
}

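Appendix: Best practices summary

Coding conventions

  1. Naming: use clear names and avoid abbreviations
  2. Comment style: use Doxygen-style comments
  3. Error handling: always check return values
  4. Memory management: C has no RAII, so pair every acquisition with exactly one release and keep the cleanup on a single, predictable path
  5. Thread safety: state explicitly which functions are thread-safe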
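
One common way to combine the error handling and memory management points in C is a single cleanup label that every failure path jumps to. The sketch below shows that pattern under simple assumptions; copy_string and duplicate_pair are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL on allocation failure. */
static char *copy_string(const char *s) {
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Duplicates both strings or neither.
 * Returns 0 on success (caller frees *out_a and *out_b), -1 on failure. */
static int duplicate_pair(const char *a, const char *b, char **out_a, char **out_b) {
    assert(out_a != NULL && out_b != NULL);
    int rc = -1;
    char *copy_a = NULL;
    char *copy_b = NULL;

    if (!a || !b) goto cleanup;
    copy_a = copy_string(a);
    if (!copy_a) goto cleanup;
    copy_b = copy_string(b);
    if (!copy_b) goto cleanup;

    /* Success: hand ownership to the caller and clear the locals so that
     * the cleanup code below does not free what was just handed over. */
    *out_a = copy_a;
    *out_b = copy_b;
    copy_a = NULL;
    copy_b = NULL;
    rc = 0;

cleanup:
    free(copy_a);  /* free(NULL) is a no-op */
    free(copy_b);
    return rc;
}
```

Every early exit goes through the same label, so adding a third resource later means adding one allocation check and one free rather than revisiting each return statement.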

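Performance optimization principles

  1. Measure before optimizing: use profiling tools
  2. Algorithms first: choose suitable data structures and algorithms
  3. Avoid premature optimization: keep the code readable
  4. Cache friendliness: think about data locality
  5. Compiler optimization: use compiler optimization options sensibly

Secure coding principles

  1. Input validation: never trust external input
  2. Bounds checking: prevent buffer overflows
  3. Least privilege: run with the minimum privileges required
  4. Safe functions: use bounds-checked string functions
  5. Code review: hold security code reviews regularly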

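This guide has walked through C techniques ranging from basic macro definitions to concurrent programming, with code examples and best practices intended to help developers write robust, fast, and secure C code.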


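Python binary file encoding detection tool

Background
This is a binary file analysis program written in Python on top of the cchardet library. It reads a file in segments according to preset parameters and runs cchardet's text encoding detection on each segment. The script can skip the first n bytes of the file, split the file into chunks of m bytes, and, within each chunk, retry detection at up to 4 successive byte offsets.
For each chunk the output shows its sequence number, start offset, the byte offset within the chunk at which the encoding was recognized, chunk size, encoding, confidence, and a high-confidence hint field.

How to use the script:

# 1. Basic usage: analyze the whole file
python encoding_detector.py myfile.bin

# 2. Specify the chunk size
python encoding_detector.py -s 512 myfile.bin

# 3. Skip the first 10 bytes of each chunk
python encoding_detector.py -s 100 -h 10 myfile.bin

# 4. Start the analysis at file offset 1116
python encoding_detector.py -s 100 -o 1116 ../ftp-pcap/ftp-utf8-long.pcap

# 5. Combined: start at offset 1000, 256 bytes per chunk, skip the first 20 bytes of each chunk
python encoding_detector.py -s 256 -h 20 -o 1000 myfile.bin

# 6. Read input from a pipe
cat myfile.bin | python encoding_detector.py -s 512

Python script implementation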

#!/usr/bin/env python3
import cchardet
import sys
import os

def print_hex(data, width=16):
    """以十六进制和ASCII形式打印字节数据"""
    for i in range(0, len(data), width):
        # 十六进制部分
        hex_part = ' '.join(f'{byte:02x}' for byte in data[i:i+width])
        # ASCII部分 (可打印字符或'.')
        ascii_part = ''.join(chr(byte) if 32 <= byte <= 126 else '.' for byte in data[i:i+width])
        # 打印地址偏移、十六进制和ASCII
        print(f'{i:08x}: {hex_part:<{width*3}} |{ascii_part}|')

def detect_chunks_from_file(filename, chunk_size=1024, from_head_bytes=0, from_file_offset=0):
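    """
    Split the file into chunks of the given size and run encoding detection on each chunk.
    If the detection confidence is 0, retry with the data shifted by 1 to 4 bytes.
    from_file_offset: byte offset in the file at which reading starts.
    """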
    if not os.path.exists(filename):
        print(f"Error: File '{filename}' does not exist.", file=sys.stderr)
        return

    try:
        file_size = os.path.getsize(filename)
        print(f"Analyzing file: {filename} (Total size: {file_size} bytes)")
        print(f"Chunk size: {chunk_size} bytes")
        if from_head_bytes > 0:
            print(f"Skipping first {from_head_bytes} bytes of each chunk for detection.")
        if from_file_offset > 0:
            print(f"Starting analysis from file offset: {from_file_offset}")
        print("-" * 50)

        with open(filename, 'rb') as f:
            # Seek to the starting offset in the file
            if from_file_offset > 0:
                f.seek(from_file_offset)
            
            chunk_number = 0
            while True:
                chunk_data = f.read(chunk_size)
                if not chunk_data:
                    break

                # Base offset of the current chunk within the original file
                offset = from_file_offset + chunk_number * chunk_size

                # Trim the data used for detection (skip the head bytes)
                detection_data = chunk_data[from_head_bytes:] if len(chunk_data) > from_head_bytes else b''

                # --- Initial detection ---
                encoding = None
                confidence = 0.0
                offset_by_used = 0  # Shift (in bytes) that produced the final result

                if len(detection_data) > 0:
                    try:
                        result = cchardet.detect(detection_data)
                        if isinstance(result, dict):
                            encoding = result.get('encoding')
                            temp_confidence = result.get('confidence')
                            if temp_confidence is None:
                                confidence = 0.0
                            else:
                                confidence = temp_confidence
                            
                            if encoding is not None and not isinstance(encoding, str):
                                print(f"Warning: Unexpected encoding type in chunk {chunk_number}: {type(encoding)}", file=sys.stderr)
                                encoding = str(encoding) if encoding is not None else None
                        else:
                            print(f"Warning: cchardet returned unexpected type in chunk {chunk_number}: {type(result)}", file=sys.stderr)
                    except Exception as e:
                        print(f"Warning: cchardet failed on chunk {chunk_number}: {e}", file=sys.stderr)
                        encoding = "Error"
                        confidence = 0.0

                # --- Shift-and-retry logic ---
                max_offset_attempts = 4
                if confidence == 0.0 and len(detection_data) > max_offset_attempts:
                    for offset_by in range(1, max_offset_attempts + 1):
                        if len(detection_data) > offset_by:
                            adjusted_detection_data = detection_data[offset_by:]
                            if len(adjusted_detection_data) > 0:
                                try:
                                    adjusted_result = cchardet.detect(adjusted_detection_data)
                                    if isinstance(adjusted_result, dict):
                                        adjusted_confidence = adjusted_result.get('confidence')
                                        if adjusted_confidence is None:
                                            adjusted_confidence = 0.0
                                        
                                        if adjusted_confidence > confidence:
                                            encoding = adjusted_result.get('encoding')
                                            confidence = adjusted_confidence
                                            offset_by_used = offset_by  # Record the shift that was used
                                            
                                            if confidence > 0.0:
                                                break
                                except Exception:
                                    pass
                        else:
                            break

                # --- Format the output ---
                encoding_display = encoding if encoding is not None else "N/A"
                output_line = (f"Chunk {chunk_number:4d} | Offset {offset:8d} | "
                               f"offset_by {offset_by_used:2d} | "
                               f"Size {len(chunk_data):4d} | "
                               f"Encoding: {encoding_display:>12} | "
                               f"Confidence: {confidence:6.4f}")
                
                # High- and low-confidence results are printed the same way for now;
                # a check such as confidence >= 0.75 could be added here to highlight strong matches.
                print(output_line)

                # If the confidence is 0, the chunk contents can optionally be dumped (currently commented out)
                # if confidence == 0.0 and len(chunk_data) > 0:
                #     print ("\n")
                #     print_hex(chunk_data)
                #     print ("\n")
                    
                chunk_number += 1

            # Check after the read loop has finished
            # f.tell() returns the absolute position, even after a seek
            absolute_tell = f.tell()
            if absolute_tell < file_size:
                 print(f"Warning: Stopped reading before end of file '{filename}'. "
                      f"Read up to file offset {absolute_tell} bytes out of {file_size} bytes.", file=sys.stderr)

    except IOError as e:
        print(f"Error reading file '{filename}': {e}", file=sys.stderr)
    except Exception as e:
        print(f"An unexpected error occurred while processing '{filename}': {e}", file=sys.stderr)
    
    print("-" * 50 + f" Analysis of '{filename}' finished. " + "-" * 10 + "\n")


def detect_chunks_from_bytes(data, source_name="Byte Input", chunk_size=1024, from_head_bytes=0):
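    """
    Split the byte data into chunks of the given size and run encoding detection on each chunk.
    If the detected confidence is 0, retry detection after shifting the start by 1-3 bytes.
    """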
    data_len = len(data)
    print(f"Analyzing data from: {source_name} (Total size: {data_len} bytes)")
    print(f"Chunk size: {chunk_size} bytes")
    if from_head_bytes > 0:
        print(f"Skipping first {from_head_bytes} bytes of each chunk for detection.")
    print("-" * 50)

    if data_len == 0:
        print("Input data is empty.")
        return

    chunk_number = 0
    for i in range(0, data_len, chunk_size):
        chunk_data = data[i:i + chunk_size]
        if not chunk_data:
            break

        offset = i
        detection_data = chunk_data[from_head_bytes:] if len(chunk_data) > from_head_bytes else b''

        encoding = None
        confidence = 0.0
        
        if len(detection_data) > 0:
            try:
                result = cchardet.detect(detection_data)
                if isinstance(result, dict):
                    encoding = result.get('encoding')
                    temp_confidence = result.get('confidence')
                    if temp_confidence is None:
                        confidence = 0.0
                    else:
                        confidence = temp_confidence
                    
                    if encoding is not None and not isinstance(encoding, str):
                        print(f"Warning: Unexpected encoding type in chunk {chunk_number}: {type(encoding)}", file=sys.stderr)
                        encoding = str(encoding) if encoding is not None else None
                else:
                    print(f"Warning: cchardet returned unexpected type in chunk {chunk_number}: {type(result)}", file=sys.stderr)
            except Exception as e:
                print(f"Warning: cchardet failed on chunk {chunk_number}: {e}", file=sys.stderr)
                encoding = "Error"
                confidence = 0.0

        # --- Shift-and-retry logic (for bytes input) ---
        max_offset_attempts = 3
        offset_by_used = 0
        if confidence == 0.0 and len(detection_data) > max_offset_attempts:
            for offset_by in range(1, max_offset_attempts + 1):
                if len(detection_data) > offset_by:
                    adjusted_detection_data = detection_data[offset_by:]
                    if len(adjusted_detection_data) > 0:
                        try:
                            adjusted_result = cchardet.detect(adjusted_detection_data)
                            if isinstance(adjusted_result, dict):
                                adjusted_confidence = adjusted_result.get('confidence')
                                if adjusted_confidence is None:
                                    adjusted_confidence = 0.0
                                
                                if adjusted_confidence > confidence:
                                    encoding = adjusted_result.get('encoding')
                                    confidence = adjusted_confidence
                                    offset_by_used = offset_by
                                    
                                    if confidence > 0.0:
                                        break
                        except Exception:
                            pass
                else:
                    break

        # Format the output (bytes input also shows offset_by)
        encoding_display = encoding if encoding is not None else "N/A"
        print(f"Chunk {chunk_number:4d} | Offset {offset:8d} | "
              f"offset_by {offset_by_used:2d} | "
              f"Size {len(chunk_data):4d} | "
              f"Encoding: {encoding_display:>12} | "
              f"Confidence: {confidence:6.4f}")

        # If the confidence is 0, print the chunk contents (currently commented out)
        # if confidence == 0.0 and len(chunk_data) > 0:
        #     print ("\n")
        #     print_hex(chunk_data)
        #     print ("\n")

        chunk_number += 1

    print("-" * 50 + f" Analysis of '{source_name}' finished. " + "-" * 10 + "\n")


def main():
    """
    Entry point: parse the command-line arguments and call the matching detection function.
    """
    if len(sys.argv) < 2:
        print("No filename provided. Reading binary data from STDIN...", file=sys.stderr)
        try:
            data = sys.stdin.buffer.read()
            detect_chunks_from_bytes(data, source_name="STDIN", chunk_size=1024)
        except KeyboardInterrupt:
            print("\nInterrupted by user.", file=sys.stderr)
        except Exception as e:
            print(f"Error reading from STDIN: {e}", file=sys.stderr)
        sys.exit(0)

    # Default parameters
    chunk_size = 1024
    from_head_bytes = 0
    from_file_offset = 0
    filenames = []

    # Parse the command-line arguments
    i = 1
    while i < len(sys.argv):
        if sys.argv[i] == '-s':
            if i + 1 < len(sys.argv):
                try:
                    chunk_size = int(sys.argv[i + 1])
                    if chunk_size <= 0:
                        raise ValueError("Chunk size must be positive.")
                    i += 2
                except ValueError as e:
                    print(f"Error: Invalid chunk size '-s {sys.argv[i + 1]}': {e}", file=sys.stderr)
                    sys.exit(1)
            else:
                print("Error: Option '-s' requires an argument.", file=sys.stderr)
                sys.exit(1)
        elif sys.argv[i] == '-h':
            if i + 1 < len(sys.argv):
                try:
                    from_head_bytes = int(sys.argv[i + 1])
                    if from_head_bytes < 0:
                        raise ValueError("Head bytes to skip must be non-negative.")
                    i += 2
                except ValueError as e:
                    print(f"Error: Invalid head bytes '-h {sys.argv[i + 1]}': {e}", file=sys.stderr)
                    sys.exit(1)
            else:
                print("Error: Option '-h' requires an argument.", file=sys.stderr)
                sys.exit(1)
        # --- Parse the -o option (starting file offset) ---
        elif sys.argv[i] == '-o':
            if i + 1 < len(sys.argv):
                try:
                    from_file_offset = int(sys.argv[i + 1])
                    if from_file_offset < 0:
                        raise ValueError("File offset must be non-negative.")
                    i += 2
                except ValueError as e:
                    print(f"Error: Invalid file offset '-o {sys.argv[i + 1]}': {e}", file=sys.stderr)
                    sys.exit(1)
            else:
                print("Error: Option '-o' requires an argument.", file=sys.stderr)
                sys.exit(1)
        else:
            filenames.append(sys.argv[i])
            i += 1

    if not filenames:
        print("Error: No filename provided.", file=sys.stderr)
        sys.exit(1)

    # Process each file given on the command line
    for filename in filenames:
        detect_chunks_from_file(filename, chunk_size, from_head_bytes, from_file_offset)


if __name__ == "__main__":
    main()

Text Encoding Design: A Complex and Historically Rich Process

The design of text encoding is a complex and historically rich process aimed at representing the characters of the world’s diverse languages using limited digital units (typically 8-bit bytes).

Core Design Principles
Text encoding design revolves around the following key goals:

  • Expressiveness: Capable of representing all characters in a target language or character set.
  • Compatibility: Maximizing compatibility with existing standards, especially ASCII.
  • Efficiency: Optimizing storage, transmission, and processing speed.
  • Standardization: Requiring widely accepted and implemented standards.

Major Encoding Types and Their Byte Designs

  1. Single-Byte Character Sets (SBCS)
    • Design: Each character uses one byte (8 bits).
    • Capacity: One byte offers 256 possible values (2⁸). Values 0x00–0x7F are typically reserved for ASCII (0–127), while 0x80–0xFF represent extended characters.
    • Examples:
      • ASCII: Basic encoding for English, digits, punctuation, and control characters (0x00–0x7F).
      • ISO/IEC 8859 Series (Latin-1, Latin-2, …): Extends ASCII to support Western/Central European characters (0x80–0xFF).
      • Windows-1252: Microsoft’s extension of Latin-1, which assigns printable characters to most of 0x80–0x9F, a range that Latin-1 reserves for control characters.
    • Advantages:
      • Simple and efficient: Fixed one-byte-per-character storage; fast processing.
      • Backward compatibility with ASCII.
    • Disadvantages:
      • Extremely limited expressiveness: Only 256 characters are possible, which is insufficient for languages like Chinese, Japanese, or Korean (which require thousands).
  2. Multi-Byte Character Sets (MBCS)
    To address SBCS limitations, MBCS uses variable-length byte sequences.
    A. Double-Byte Character Sets (DBCS)
    • Design: Primarily two bytes per character; sometimes one byte for ASCII.
    • Examples:
      • Shift JIS (SJIS) (Japanese): Lead bytes (0x81–0x9F, 0xE0–0xFC); trail bytes (0x40–0x7E, 0x80–0xFC).
      • GBK/GB2312 (Simplified Chinese): Lead bytes (0x81–0xFE); trail bytes (0x40–0x7E, 0x80–0xFE). These are the GBK ranges; GB2312 in its usual EUC-CN form uses the narrower subset 0xA1–0xF7 for lead bytes and 0xA1–0xFE for trail bytes.
      • Big5 (Traditional Chinese): Lead bytes (0x81–0xFE); trail bytes (0x40–0x7E, 0xA1–0xFE).
    • Advantages:
      • Expanded expressiveness: Supports tens of thousands of characters.
      • Backward compatible with ASCII.
    • Disadvantages:
      • Context-dependent parsing: Complexity increases because the parser must track whether the previous byte was ASCII or a lead byte.
      • Synchronization issues: A missing or inserted byte corrupts subsequent characters until the parser realigns, typically at the next ASCII byte.
    B. Truly Variable-Length Multi-Byte Encodings
    • Design: Characters use a variable number of bytes (1 to 4 in UTF-8), with self-synchronization: any byte’s value indicates whether it starts a new character or continues an existing one.
    • Examples:
      • UTF-8 (most successful):
        • 1-byte: 0xxxxxxx (0x00–0x7F), giving full ASCII compatibility.
        • 2-byte: 110xxxxx 10xxxxxx (Latin/Greek/Cyrillic supplements).
        • 3-byte: 1110xxxx 10xxxxxx 10xxxxxx (most CJK characters).
        • 4-byte: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (emoji, rarer CJK).
    • Advantages:
      • Self-synchronization: Robust parsing from any position.
      • Perfect ASCII compatibility.
      • Massive expressiveness: Covers all Unicode characters (1,114,112 code points).
      • Efficiency: Matches ASCII storage for English-dominated text.
    • Disadvantages:
      • Lower storage efficiency for non-ASCII text: e.g., a Chinese character in the Basic Multilingual Plane takes 3 bytes in UTF-8 vs. 2 in UTF-16.
  3. Fixed-Width Multi-Byte Encodings
    • Design: A fixed number of bytes per character (nominally 2 or 4).
    • Examples:
      • UTF-16:
        • Basic Multilingual Plane (BMP) characters: 2 bytes (covers most CJK).
        • Supplementary Planes: 4 bytes (via surrogate pairs).
      • UTF-32: All characters use 4 bytes.
    • Advantages:
      • UTF-32: Simple fixed-width processing (one character = one integer).
    • Disadvantages:
      • UTF-16: Not truly fixed-width; ASCII doubles in size (2 bytes).
      • UTF-32: Low storage efficiency (ASCII uses 4× more space than UTF-8).
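
The self-synchronization property described for UTF-8 can be sketched in a few lines. This is an illustrative example only: `is_continuation` and `next_boundary` are hypothetical helper names, and the code relies solely on the bit patterns listed above (continuation bytes look like 10xxxxxx).

```python
def is_continuation(byte: int) -> bool:
    """True if the byte matches the UTF-8 continuation pattern 10xxxxxx."""
    return (byte & 0b1100_0000) == 0b1000_0000


def next_boundary(data: bytes, pos: int) -> int:
    """Starting at an arbitrary position, skip continuation bytes and return
    the index of the next byte that begins a character (or len(data))."""
    while pos < len(data) and is_continuation(data[pos]):
        pos += 1
    return pos


# 'a' is 1 byte, 'é' 2 bytes, '中' 3 bytes, '😀' 4 bytes: 10 bytes in total.
data = "aé中😀".encode("utf-8")

# Position 4 lands inside '中'; the parser resynchronizes at the start of '😀'.
start = next_boundary(data, 4)
print(start, data[start:].decode("utf-8"))
```

No state from earlier bytes is needed, which is the practical difference from the double-byte encodings in section 2.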

Summary

Encoding Type | Bytes/Char | Design | Advantages | Disadvantages
Single-Byte (ASCII) | 1 | Fixed | Simple, efficient, good compatibility | Extremely limited expressiveness
Single-Byte Extended (Latin-1) | 1 | Fixed | Simple, efficient, ASCII-compatible | Limited expressiveness; language conflicts
Double-Byte (SJIS, GBK) | 1 or 2 | Variable (mostly 2) | High expressiveness, ASCII-compatible | Complex parsing; sync issues
Variable Multi-Byte (UTF-8) | 1–4 | Variable, self-synchronizing | Self-syncing, ASCII-compatible, universal | Suboptimal for non-ASCII storage
Fixed Multi-Byte (UTF-16) | 2 or 4 | Variable (mostly 2) | High BMP efficiency | Not fixed-width; ASCII-inefficient
Fixed Multi-Byte (UTF-32) | 4 | Fixed | Processing simplicity | Very low storage efficiency
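
The byte counts in the summary can be checked with Python's built-in codecs. The sketch below is only a quick sanity check; `encoded_sizes` is a hypothetical helper, and the little-endian codec names are used so that no byte-order mark is added to the counts.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes one character occupies in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


# One sample per UTF-8 length class: ASCII, Latin supplement, CJK, emoji.
for ch in ("A", "é", "中", "😀"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```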

Modern Adoption
UTF-8 dominates modern text processing (especially in internationalized software and the web) because it balances compatibility, expressiveness, and efficiency well. UTF-16 is common in systems like Windows, Java, and .NET. UTF-32’s inefficiency limits its use. Legacy SBCS/DBCS encodings persist in older systems.



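Design Approach to Text Encoding

Designing a text encoding is a complex process with a long history. The aim is to represent the characters of the world's languages using a limited set of numeric units (usually 8-bit bytes).

Core design ideas

The design of a text encoding revolves around the following goals:

  1. Expressiveness: it can represent every character in the target language or character set.
  2. Compatibility: it stays as compatible as possible with existing standards, ASCII in particular.
  3. Efficiency: storage and transmission efficiency, as well as processing speed.
  4. Standardization: it needs to be a standard that is widely accepted and implemented.

Main encoding types and their byte designs

1. Single-byte encodings (Single-Byte Character Sets – SBCS)

  • Design: each character is represented by one byte (8 bits).
  • Capacity: a byte has 256 (2^8) possible values. 0x00-0x7F is usually reserved for the ASCII characters (0-127), leaving 128 values (0x80-0xFF) for other characters.
  • Examples:
    • ASCII: the most basic encoding; it uses only 0x00-0x7F for English letters, digits, punctuation, and control characters.
    • ISO/IEC 8859 series (Latin-1, Latin-2, …): extends ASCII, using 0x80-0xFF for characters of Western Europe, Central Europe, and other regions.
    • Windows-1252: the Windows extension of Latin-1; it fills positions in 0x80-0x9F that Latin-1 leaves undefined or defines as control characters.
  • Advantages:
    • Simple and efficient: characters map one-to-one to bytes, so processing is fast and storage size is predictable.
    • Backward compatible with ASCII: every ASCII text is also valid Latin-1/Windows-1252 text.
  • Disadvantages:
    • Limited expressiveness: at most 256 characters can be represented, far too few for languages such as Chinese, Japanese, or Korean, which have thousands of characters.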
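
The conflict between single-byte code pages is easy to observe with Python's built-in codecs: the same byte above 0x7F maps to different characters depending on which table is assumed. This is a small sketch; `decode_under` is a hypothetical helper name, while the codec names are standard Python ones.

```python
def decode_under(raw: bytes, codecs=("latin-1", "cp1252", "iso8859_2")) -> dict:
    """Decode the same bytes under several single-byte code pages."""
    return {name: raw.decode(name) for name in codecs}


# ASCII bytes mean the same thing in every table.
print(decode_under(b"A"))

# 0xA3 is the pound sign in Latin-1 and Windows-1252, but 'Ł' in Latin-2.
print(decode_under(b"\xa3"))

# 0x80 is the euro sign in Windows-1252 but a control character in the ISO tables.
print({name: repr(ch) for name, ch in decode_under(b"\x80").items()})
```

Nothing in the byte stream itself says which table applies, which is why tools such as cchardet have to guess.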

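2. Multi-byte encodings (Multi-Byte Character Sets – MBCS)

Multi-byte encodings emerged to overcome the limited expressiveness of single-byte encodings. Their core idea is to represent different characters with byte sequences of varying length.

A. Double-byte encodings (Double-Byte Character Sets – DBCS)

  • Design: mostly two bytes per character, with single bytes sometimes used for ASCII characters.
  • Examples:
    • Shift JIS (SJIS): used for Japanese. The first byte (lead byte) falls in specific ranges (such as 0x81-0x9F and 0xE0-0xFC), and the second byte (trail byte) in 0x40-0x7E or 0x80-0xFC.
    • GBK / GB2312: used for Simplified Chinese. The first byte is in 0x81-0xFE, and the second byte in 0x40-0x7E or 0x80-0xFE.
    • Big5: used for Traditional Chinese. The first byte is in 0x81-0xFE, and the second byte in 0x40-0x7E or 0xA1-0xFE.
  • Advantages:
    • Much greater expressiveness: tens of thousands of characters can be represented.
    • Backward compatible with ASCII: a single ASCII byte (0x00-0x7F) still represents an ASCII character.
  • Disadvantages:
    • Context dependence: the parser has to remember whether the previous byte was ASCII or the start of a multi-byte sequence, which makes parsing more complex and error-prone.
    • Synchronization problems: if a byte is lost or inserted in transit, the characters that follow are parsed incorrectly until the parser meets the next ASCII character and can resynchronize.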
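
The synchronization problem can be reproduced by dropping one byte from an encoded string. The sketch below compares GBK with UTF-8 using Python's built-in codecs; `drop_first_byte` is a hypothetical helper, and the exact characters GBK produces for the damaged bytes are deliberately not spelled out here.

```python
def drop_first_byte(text: str, encoding: str) -> str:
    """Encode the text, lose the first byte, and decode what is left.
    Undecodable bytes are replaced with U+FFFD."""
    raw = text.encode(encoding)
    return raw[1:].decode(encoding, errors="replace")


original = "中文abc"

# GBK: the remaining bytes pair up differently, so the damage spreads past
# the broken character ('文' is lost as well, and a following ASCII byte can
# be consumed as a trail byte).
print(repr(drop_first_byte(original, "gbk")))

# UTF-8: only the broken character is lost; '文' and everything after survive.
print(repr(drop_first_byte(original, "utf-8")))
```

The UTF-8 result shows two replacement characters for the orphaned continuation bytes, followed by the intact remainder of the string.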

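B. Truly variable-length multi-byte encodings

  • Design: a character may take 1, 2, 3, or more bytes. The key point is that the encoding rules are self-synchronizing: a parser can start at any byte and tell from that byte's value whether it begins a new character or continues the previous one.
  • Examples:
    • UTF-8 (the most typical and most successful):
      • 1-byte characters: 0xxxxxxx (0x00-0x7F), fully compatible with ASCII.
      • 2-byte characters: 110xxxxx 10xxxxxx, used for Latin supplements, Greek, Cyrillic, and so on.
      • 3-byte characters: 1110xxxx 10xxxxxx 10xxxxxx, used for most common Chinese, Japanese, and Korean characters.
      • 4-byte characters: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx, used for characters in the supplementary Unicode planes (such as some rare Han characters and emoji).
      • Advantages:
        • Self-synchronizing: by checking the leading bits of a byte (its bit pattern), the parser can always tell whether the byte starts a new character or continues one.
        • Fully ASCII-compatible: every ASCII text is valid UTF-8 text.
        • Very high expressiveness: it can represent every character in the Unicode standard (more than one million code points).
        • Efficient: for text that is mostly ASCII (such as English or program code), storage efficiency is the same as ASCII.
      • Disadvantages:
        • For non-ASCII characters, storage efficiency can be lower than that of wider fixed-unit encodings: for example, a Chinese character needs 3 bytes in UTF-8, but only 2 bytes in UTF-16 when it lies in the Basic Multilingual Plane (4 bytes in both encodings when it lies in a supplementary plane).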
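
The bit patterns listed for UTF-8 translate directly into code. The following is a minimal sketch of an encoder written from those patterns; `utf8_encode` is a hypothetical function shown for illustration, and in practice `str.encode("utf-8")` should be used instead.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode a single Unicode code point using the UTF-8 bit patterns."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:                       # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                      # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:                    # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


# Compare against the built-in codec for one character of each length.
for ch in "Aé中😀":
    print(ch, utf8_encode(ord(ch)).hex(), ch.encode("utf-8").hex())
```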

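3. Fixed-width multi-byte encodings

  • Design: every character uses a fixed number of bytes, for example 2 bytes or 4 bytes per character.
  • Examples:
    • UTF-16
      • Basic Multilingual Plane (BMP) characters: represented with 2 bytes (16 bits), which covers most commonly used characters.
      • Supplementary-plane characters: represented with 4 bytes (using a surrogate pair).
      • Characteristics: for characters inside the BMP (including most Chinese, Japanese, and Korean characters) it is fixed-width. It is not fully fixed-width, however, because surrogate pairs are needed for supplementary-plane characters.
    • UTF-32
      • All characters: represented with 4 bytes (32 bits).
      • Advantages: truly fixed-width; one character is one integer, which makes processing very simple.
      • Disadvantages: low storage efficiency; even an ASCII character takes 4 bytes. For mostly-ASCII text, storage is 4 times that of UTF-8.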
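
The surrogate-pair mechanism that keeps UTF-16 from being truly fixed-width can be sketched as follows. `utf16_units` is a hypothetical helper that applies the standard surrogate arithmetic (subtract 0x10000, then split the remaining 20 bits into two 10-bit halves); it performs no validation of its input.

```python
def utf16_units(cp: int) -> list:
    """Return the UTF-16 code units (16-bit integers) for one code point."""
    if cp < 0x10000:
        return [cp]                      # BMP: a single 16-bit unit
    v = cp - 0x10000                     # 20 bits left to distribute
    high = 0xD800 | (v >> 10)            # high surrogate carries the top 10 bits
    low = 0xDC00 | (v & 0x3FF)           # low surrogate carries the bottom 10 bits
    return [high, low]


def to_utf16_be(cp: int) -> bytes:
    """Serialize the code units in big-endian byte order."""
    return b"".join(unit.to_bytes(2, "big") for unit in utf16_units(cp))


print([hex(u) for u in utf16_units(ord("中"))])   # one unit: 0x4e2d
print([hex(u) for u in utf16_units(0x1F600)])     # two units: 0xd83d, 0xde00
```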

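Summary

Encoding type | Bytes per character | Design | Advantages | Disadvantages
Single-byte (ASCII) | 1 | Fixed | Simple and efficient, good compatibility | Very low expressiveness
Single-byte extended (Latin-1) | 1 | Fixed | Simple and efficient, ASCII-compatible | Low expressiveness; code pages for different languages are mutually incompatible
Double-byte (SJIS, GBK) | 1 or 2 | Variable (mostly 2) | High expressiveness, ASCII-compatible | Complex parsing; loses synchronization easily
Variable multi-byte (UTF-8) | 1, 2, 3, or 4 | Variable, self-synchronizing | Self-synchronizing, ASCII-compatible, very high expressiveness, efficient for English | Storage for non-ASCII characters can be less efficient
Fixed multi-byte (UTF-16) | 2 or 4 | Variable (mostly 2) | Efficient for BMP characters | Not fully fixed-width; inefficient for ASCII
Fixed multi-byte (UTF-32) | 4 | Fixed | Simplest to process | Low storage efficiency

Modern text processing (especially internationalized software and the Web) has largely settled on UTF-8, because it balances compatibility, expressiveness, and efficiency well. UTF-16 is also common in some systems (such as Windows internals, Java, and .NET). UTF-32 is used less often because of its storage cost. Traditional SBCS and DBCS encodings remain in use in some legacy systems.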
