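Binary data - Bun Chinese Documentation

This page is an introduction to working with binary data in JavaScript. Bun implements a number of data types and utilities for working with binary data, most of which are Web-standard. Any Bun-specific APIs are noted as such. Below is a quick "cheat sheet" that doubles as a table of contents. Click an item in the left column to jump to that section.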

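Class	Description
`TypedArray`	A family of classes that provide an Array-like interface for interacting with binary data. Includes `Uint8Array`, `Uint16Array`, `Int8Array`, and more.
`Buffer`	A subclass of `Uint8Array` that implements a wide range of convenience methods. Unlike the other entries in this table, this is a Node.js API (which Bun also implements). It is not available in the browser.
`DataView`	A class that provides a `get/set` API for reading and writing some number of bytes in an `ArrayBuffer` at a particular byte offset. Often used for reading or writing binary protocols.
`Blob`	A read-only blob of binary data, usually representing a file. Has a MIME `type`, a `size`, and methods for converting to `ArrayBuffer`, `ReadableStream`, and string.
`File`	A subclass of `Blob` that represents a file. Has a `name` and a `lastModified` timestamp. Experimentally supported in Node.js 20.
`BunFile`	Bun only. A subclass of `Blob` that represents a lazily loaded file on disk. Created with `Bun.file(path)`.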

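`ArrayBuffer` and views

For a long time, JavaScript had no language-native way to store and manipulate binary data. Typed arrays, standardized in ECMAScript 2015 (ES6), introduced a range of new mechanisms for this. The most fundamental building block is ArrayBuffer, a simple data structure that represents a sequence of bytes in memory.

// this buffer can store 8 bytes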
const buf = new ArrayBuffer(8);

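Despite the name, it isn't an array and supports none of the array methods and operators you might expect. In fact, there is no way to read or write values directly on an ArrayBuffer. About all you can do is check its size and create "slices" from it.

const buf = new ArrayBuffer(8);
buf.byteLength; // => 8

const slice = buf.slice(0, 4); // returns a new ArrayBuffer
slice.byteLength; // => 4

To do anything interesting, we need a construct known as a "view": a class that wraps an ArrayBuffer instance and lets you read and manipulate the underlying data. There are two kinds of views: typed arrays and DataView.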

`DataView`

The DataView class is a lower-level interface for reading and manipulating the data in an ArrayBuffer. Below we create a new DataView and set the first byte to 3.

const buf = new ArrayBuffer(4);
// [0b00000000, 0b00000000, 0b00000000, 0b00000000]

const dv = new DataView(buf);
dv.setUint8(0, 3); // write value 3 at byte offset 0
dv.getUint8(0); // => 3
// [0b00000011, 0b00000000, 0b00000000, 0b00000000]

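Now let's write a Uint16 at byte offset 1. This requires two bytes. We use the value 513, which is 2 * 256 + 1; in bytes, that's 00000010 00000001. DataView writes multi-byte values in big-endian order by default, so the high byte (2) is stored first.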

dv.setUint16(1, 513);
// [0b00000011, 0b00000010, 0b00000001, 0b00000000]

console.log(dv.getUint16(1)); // => 513

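We've now assigned a value to the first three bytes of the underlying ArrayBuffer. Even though the second and third bytes were written with setUint16(), we can still read each of them individually with getUint8().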

console.log(dv.getUint8(1)); // => 2
console.log(dv.getUint8(2)); // => 1
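
The byte order seen above follows from DataView defaulting to big-endian. As a small sketch (the variable names are illustrative only), the multi-byte getters and setters also accept an optional littleEndian flag as their last argument:

```javascript
// A 4-byte buffer viewed through a DataView.
const endianBuf = new ArrayBuffer(4);
const endianView = new DataView(endianBuf);

// Default (big-endian): the high byte of 513 (0x0201) is stored first.
endianView.setUint16(0, 513);
const bigEndianBytes = [endianView.getUint8(0), endianView.getUint8(1)]; // [2, 1]

// Passing `true` as the last argument selects little-endian: low byte first.
endianView.setUint16(2, 513, true);
const littleEndianBytes = [endianView.getUint8(2), endianView.getUint8(3)]; // [1, 2]

// Reading with the other byte order reinterprets the same two bytes.
const misread = endianView.getUint16(2); // 1 * 256 + 2 = 258
```

When parsing a binary protocol, the byte order is fixed by the format, so it is usually safest to pass the flag explicitly on every call.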

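Attempting to write a value that requires more space than is available in the underlying ArrayBuffer throws an error. Below we attempt to write a Float64 (which requires 8 bytes) at byte offset 0, but the buffer is only 4 bytes long.

dv.setFloat64(0, 3.1415);
// ^ RangeError: Out of bounds access

DataView provides the following methods:

Getters	Setters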
`getBigInt64()`	`setBigInt64()`
`getBigUint64()`	`setBigUint64()`
`getFloat32()`	`setFloat32()`
`getFloat64()`	`setFloat64()`
`getInt16()`	`setInt16()`
`getInt32()`	`setInt32()`
`getInt8()`	`setInt8()`
`getUint16()`	`setUint16()`
`getUint32()`	`setUint32()`
`getUint8()`	`setUint8()`

`TypedArray`

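Typed arrays are a family of classes that provide an Array-like interface for interacting with the data in an ArrayBuffer. Whereas a DataView lets you write numbers of varying size at a particular offset, a TypedArray interprets the underlying bytes as an array of numbers, each of a fixed size.

It's common to refer to this family of classes collectively by their shared superclass, TypedArray. This class is internal to JavaScript; you can't create instances of it directly, and TypedArray is not defined in the global scope. Think of it as an interface or an abstract class.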

const buffer = new ArrayBuffer(3);
const arr = new Uint8Array(buffer);

// contents are initialized to zero
console.log(arr); // Uint8Array(3) [0, 0, 0]

// assign values like an array
arr[0] = 0;
arr[1] = 10;
arr[2] = 255;
arr[3] = 255; // no-op, out of bounds

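While an ArrayBuffer is a generic sequence of bytes, these typed array classes interpret the bytes as an array of numbers of a given byte size. The table below lists the typed array classes and how each one interprets the bytes in an ArrayBuffer.

Class	Description
`Uint8Array`	Every one (1) byte is interpreted as an unsigned 8-bit integer. Range 0 to 255.
`Uint16Array`	Every two (2) bytes are interpreted as an unsigned 16-bit integer. Range 0 to 65535.
`Uint32Array`	Every four (4) bytes are interpreted as an unsigned 32-bit integer. Range 0 to 4294967295.
`Int8Array`	Every one (1) byte is interpreted as a signed 8-bit integer. Range -128 to 127.
`Int16Array`	Every two (2) bytes are interpreted as a signed 16-bit integer. Range -32768 to 32767.
`Int32Array`	Every four (4) bytes are interpreted as a signed 32-bit integer. Range -2147483648 to 2147483647.
`Float16Array`	Every two (2) bytes are interpreted as a 16-bit floating point number. Range approximately -6.55e4 to 6.55e4.
`Float32Array`	Every four (4) bytes are interpreted as a 32-bit floating point number. Range approximately -3.4e38 to 3.4e38.
`Float64Array`	Every eight (8) bytes are interpreted as a 64-bit floating point number. Range approximately -1.8e308 to 1.8e308.
`BigInt64Array`	Every eight (8) bytes are interpreted as a signed `BigInt`. Range -9223372036854775808 to 9223372036854775807 (though `BigInt` itself can represent larger numbers).
`BigUint64Array`	Every eight (8) bytes are interpreted as an unsigned `BigInt`. Range 0 to 18446744073709551615 (though `BigInt` itself can represent larger numbers).
`Uint8ClampedArray`	Same as `Uint8Array`, but automatically "clamps" values to the range 0-255 when assigning to an element.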
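
To make the difference between the last row and a plain Uint8Array concrete, here is a minimal sketch of what happens when out-of-range values are assigned. The variable names are illustrative only.

```javascript
const wrapped = new Uint8Array(3);
const clamped = new Uint8ClampedArray(3);

// Uint8Array keeps only the low 8 bits of the value (modulo 256).
wrapped[0] = 256; // stored as 0
wrapped[1] = 300; // stored as 44
wrapped[2] = -1; // stored as 255

// Uint8ClampedArray pins out-of-range values to the nearest bound.
clamped[0] = 256; // stored as 255
clamped[1] = 300; // stored as 255
clamped[2] = -1; // stored as 0
```

Clamping is the behavior wanted for pixel data, where an overflowing color channel should saturate rather than wrap around.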

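The table below shows how the bytes in an ArrayBuffer are interpreted when viewed through different typed array classes. The multi-byte values assume a little-endian platform, which covers nearly all hardware in current use.

	Byte 0	Byte 1	Byte 2	Byte 3	Byte 4	Byte 5	Byte 6	Byte 7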
`ArrayBuffer`	`00000000`	`00000001`	`00000010`	`00000011`	`00000100`	`00000101`	`00000110`	`00000111`
`Uint8Array`	0	1	2	3	4	5	6	7
`Uint16Array`	256 (`1*256 + 0`)		770 (`3*256 + 2`)		1284 (`5*256 + 4`)		1798 (`7*256 + 6`)
`Uint32Array`	50462976				117835012
`BigUint64Array`	506097522914230528n
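
The values in this table can be reproduced independently of the host platform by reading the same eight bytes through a DataView with an explicit little-endian flag. This is only a sketch, and the variable names are illustrative.

```javascript
// Bytes 0 through 7, as in the table above.
const tableBytes = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7]);
const tableView = new DataView(tableBytes.buffer);

// `true` requests little-endian, matching what typed arrays do on most hardware.
const u16 = tableView.getUint16(0, true); // 256
const u32 = tableView.getUint32(0, true); // 50462976
const u64 = tableView.getBigUint64(0, true); // 506097522914230528n

// Read as big-endian, the first two bytes give 0 * 256 + 1 instead.
const u16BigEndian = tableView.getUint16(0, false); // 1
```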

To create a typed array from an existing ArrayBuffer:

// create a typed array from an ArrayBuffer
const buf = new ArrayBuffer(10);
const arr = new Uint8Array(buf);

arr[0] = 30;
arr[1] = 60;

// the remaining elements are still initialized to 0
console.log(arr); // => Uint8Array(10) [ 30, 60, 0, 0, 0, 0, 0, 0, 0, 0 ];

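If we try to create a Uint32Array from this same ArrayBuffer, we get an error.

const buf = new ArrayBuffer(10);
const arr = new Uint32Array(buf);
//          ^  RangeError: ArrayBuffer length minus the byteOffset
//             is not a multiple of the element size

A Uint32 value requires four bytes (32 bits). Because the ArrayBuffer is 10 bytes long, there's no way to divide its contents cleanly into 4-byte chunks. To fix this, create a typed array over a particular "slice" of the ArrayBuffer. The Uint32Array below only "views" the first 8 bytes of the underlying ArrayBuffer. We specify a byteOffset of 0 and a length of 2, which is the number of Uint32 elements we want the array to hold.

// create a typed array from a slice of an ArrayBuffer
const buf = new ArrayBuffer(10);
const arr = new Uint32Array(buf, 0, 2);

/*
  buf    _ _ _ _ _ _ _ _ _ _    10 bytes
  arr   [_______,_______]       2 4-byte elements
*/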

arr.byteOffset; // 0
arr.length; // 2
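
The byteOffset has a constraint of its own: it must be a multiple of the element size. The sketch below, with illustrative variable names, shows an aligned offset working and a misaligned one throwing.

```javascript
const alignedBuf = new ArrayBuffer(10);

// Offset 4 is a multiple of 4, so this view covers bytes 4 through 7.
const aligned = new Uint32Array(alignedBuf, 4, 1);
aligned[0] = 0xffffffff;

// The write is visible through any other view of the same buffer.
const asBytes = Array.from(new Uint8Array(alignedBuf));
// => [0, 0, 0, 0, 255, 255, 255, 255, 0, 0]

// Offset 2 is not a multiple of 4, so the constructor throws.
let misalignedError = null;
try {
  new Uint32Array(alignedBuf, 2, 1);
} catch (err) {
  misalignedError = err; // a RangeError
}
```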

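You don't need to explicitly create an ArrayBuffer; you can instead specify a length directly in the typed array constructor:

const arr2 = new Uint8Array(5);

// all elements are initialized to zero
// => Uint8Array(5) [0, 0, 0, 0, 0]

Typed arrays can also be created directly from an array of numbers, or from another typed array:

// from an array of numbers
const arr1 = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7]);
arr1[0]; // => 0;
arr1[7]; // => 7;

// from another typed array (copies the values)
const arr2 = new Uint8Array(arr1);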
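
Whether the constructor copies or shares memory depends on what is passed in. A minimal sketch of the distinction, using illustrative variable names:

```javascript
const source = new Uint8Array([1, 2, 3]);

// Passing a typed array copies its values into a new buffer.
const copied = new Uint8Array(source);

// Passing the underlying ArrayBuffer creates another view of the same memory.
const shared = new Uint8Array(source.buffer);

source[0] = 99;

copied[0]; // => 1 (independent copy)
shared[0]; // => 99 (same bytes)
```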

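Broadly speaking, typed arrays provide the same methods as regular arrays, with a few exceptions. For example, push and pop are not available, because they would require resizing the underlying ArrayBuffer.

const arr = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7]);

// supports common array methods
arr.filter(n => n > 4); // Uint8Array(3) [5, 6, 7]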
arr.map(n => n * 2); // Uint8Array(8) [0, 2, 4, 6, 8, 10, 12, 14]
arr.reduce((acc, n) => acc + n, 0); // 28
arr.forEach(n => console.log(n)); // 0 1 2 3 4 5 6 7
arr.every(n => n < 10); // true
arr.find(n => n > 5); // 6
arr.includes(5); // true
arr.indexOf(5); // 5
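
Two differences from regular arrays are easy to trip over: methods such as map return a typed array of the same type, so results are coerced to the element type, and subarray returns a view rather than a copy. A short sketch with illustrative variable names:

```javascript
const small = new Uint8Array([1, 2, 3]);

// map() on a typed array returns the same typed array type, so 300 wraps to 44.
const mappedTyped = small.map(n => n * 100); // Uint8Array(3) [100, 200, 44]

// Array.from() with a mapping function yields a plain array of numbers.
const mappedPlain = Array.from(small, n => n * 100); // [100, 200, 300]

// subarray() returns a view onto the same memory; slice() returns a copy.
const viewOfSmall = small.subarray(0, 2);
const copyOfSmall = small.slice(0, 2);
small[0] = 9;
viewOfSmall[0]; // => 9
copyOfSmall[0]; // => 1
```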

For more information on the properties and methods of typed arrays, refer to the MDN documentation.

`Uint8Array`

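It's worth specifically highlighting Uint8Array, as it represents a classic "byte array": a sequence of unsigned 8-bit integers between 0 and 255. This is the most common typed array you'll encounter in JavaScript. In Bun (and possibly in other JavaScript engines in the future), it has methods for converting between byte arrays and their serialized base64 or hex string representations.

new Uint8Array([1, 2, 3, 4, 5]).toBase64(); // "AQIDBAU="
Uint8Array.fromBase64("AQIDBAU="); // Uint8Array(5) [1, 2, 3, 4, 5]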

new Uint8Array([255, 254, 253, 252, 251]).toHex(); // "fffefdfcfb"
Uint8Array.fromHex("fffefdfcfb"); // Uint8Array(5) [255, 254, 253, 252, 251]
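
In a runtime where toHex and fromHex are not available, the same hex round trip can be approximated by hand. The helpers below are hypothetical (bytesToHex and hexToBytes are names made up for this sketch, not part of Bun or any standard library) and use only long-standing language features.

```javascript
// Hypothetical helpers; not part of Bun or any standard library.
function bytesToHex(bytes) {
  // Two lowercase hex digits per byte, zero-padded.
  return Array.from(bytes, b => b.toString(16).padStart(2, "0")).join("");
}

function hexToBytes(hex) {
  if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
    throw new SyntaxError("expected an even-length hex string");
  }
  const out = new Uint8Array(hex.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

const hex = bytesToHex(new Uint8Array([255, 254, 253, 252, 251])); // "fffefdfcfb"
const roundTrip = hexToBytes(hex); // Uint8Array(5) [255, 254, 253, 252, 251]
```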

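It is the return type of TextEncoder#encode and the input type of TextDecoder#decode, two utility classes designed to convert between strings and binary encodings. TextEncoder always produces UTF-8, while TextDecoder defaults to "utf-8" and accepts other encoding labels.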

const encoder = new TextEncoder();
const bytes = encoder.encode("hello world");
// => Uint8Array(11) [ 104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100 ]

const decoder = new TextDecoder();
const text = decoder.decode(bytes);
// => hello world
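
Because the encoding is UTF-8, the number of bytes does not always equal the number of characters. A minimal sketch with a non-ASCII character (the variable names are illustrative):

```javascript
const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder();

const word = "h\u00e9llo"; // "héllo"
const wordBytes = utf8Encoder.encode(word);

word.length; // => 5 (UTF-16 code units)
wordBytes.length; // => 6 ("é" takes two bytes: 0xC3 0xA9)

const decodedWord = utf8Decoder.decode(wordBytes); // => "héllo"
```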

`Buffer`

Bun implements Buffer, a Node.js API for working with binary data that predates the introduction of typed arrays in the JavaScript spec. It has since been reimplemented as a subclass of Uint8Array. It provides a wide range of methods, including several Array-like and DataView-like methods.

const buf = Buffer.from("hello world");
// => Buffer(11) [ 104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100 ]

buf.length; // => 11
buf[0]; // => 104, ASCII for 'h'
buf.writeUInt8(72, 0); // => ASCII for 'H'

console.log(buf.toString());
// => Hello world
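
As a sketch of the DataView-like side of Buffer, the multi-byte read and write methods carry the byte order in their names (BE for big-endian, LE for little-endian). The variable names are illustrative, and the snippet assumes a runtime where Buffer is a global, such as Bun or Node.js.

```javascript
const packet = Buffer.alloc(4); // zero-filled, 4 bytes

// Like DataView, but the byte order is part of the method name.
packet.writeUInt16BE(513, 0); // bytes 0-1 become [2, 1]
packet.writeUInt16LE(513, 2); // bytes 2-3 become [1, 2]

const readBack = packet.readUInt16BE(0); // 513
const otherOrder = packet.readUInt16BE(2); // 258 (bytes [1, 2] read big-endian)

// Array-like access still works because Buffer is a Uint8Array.
const packetBytes = Array.from(packet); // [2, 1, 1, 2]
const isTypedArray = packet instanceof Uint8Array; // true
```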

For complete documentation, refer to the Node.js documentation.

`Blob`

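Blob is a Web API commonly used for representing files. Blob was initially implemented in browsers (unlike ArrayBuffer, which is part of JavaScript itself), but it is now supported in Node and Bun. It isn't common to create Blob instances directly. More often, you'll receive them from an external source (such as an <input type="file"> element in the browser) or from a library. That said, it is possible to create a Blob from one or more string or binary "blob parts".

const blob = new Blob(["<html>Hello</html>"], {
  type: "text/html",
});

blob.type; // => text/html
blob.size; // => 18

These parts can be string, ArrayBuffer, TypedArray, DataView, or other Blob instances. The blob parts are concatenated together in the order they are provided.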

const blob = new Blob([
  "<html>",
  new Blob(["<body>"]),
  new Uint8Array([104, 101, 108, 108, 111]), // "hello" in binary
  "</body></html>",
]);
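
One quick way to check how the parts were concatenated is the synchronous size property, which reports the total byte count. The sketch below rebuilds the same kind of Blob under an illustrative name and assumes a runtime with a global Blob.

```javascript
const page = new Blob([
  "<html>", // 6 bytes
  new Blob(["<body>"]), // 6 bytes
  new Uint8Array([104, 101, 108, 108, 111]), // 5 bytes
  "</body></html>", // 14 bytes
]);

const pageSize = page.size; // 31

// Strings are encoded as UTF-8, so size counts bytes rather than characters.
const accentSize = new Blob(["\u00e9"]).size; // 2
```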

The contents of a Blob can be read asynchronously in several formats.

await blob.text(); // => <html><body>hello</body></html>
await blob.bytes(); // => Uint8Array (copies contents)
await blob.arrayBuffer(); // => ArrayBuffer (copies contents)
blob.stream(); // => ReadableStream (returned synchronously)

`BunFile`

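BunFile is a subclass of Blob used to represent a lazily loaded file on disk. Like File, it adds name and lastModified properties. Unlike File, it does not require the file to be loaded into memory.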

const file = Bun.file("index.txt");
// => BunFile

`File`

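Originally a browser API. Node.js 20 supports it experimentally, and Bun implements it as well.

File is a subclass of Blob that adds name and lastModified properties. It's commonly used in the browser to represent files uploaded via an <input type="file"> element.

// in the browser
// <input type="file" id="file" />

const files = document.getElementById("file").files;
// => FileList (an array-like collection of File objects)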

const file = new File(["<html>Hello</html>"], "index.html", {
  type: "text/html",
});

For complete documentation, refer to the MDN documentation.

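Streams

Streams are an important abstraction for working with binary data without loading it all into memory at once. They are commonly used for reading and writing files, sending and receiving network requests, and processing large amounts of data. Bun implements the Web APIs ReadableStream and WritableStream.

Bun also implements the node:stream module, including Readable, Writable, and Duplex. For complete documentation, refer to the Node.js docs.

To create a simple readable stream: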

const stream = new ReadableStream({
  start(controller) {
    controller.enqueue("hello");
    controller.enqueue("world");
    controller.close();
  },
});

The contents of this stream can be read chunk by chunk with for await syntax.

for await (const chunk of stream) {
  console.log(chunk);
}

// => "hello"
// => "world"

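For a more complete discussion of streams in Bun, see API > Streams.

Conversion

Converting from one binary format to another is a common task. This section is intended as a reference.

From `ArrayBuffer`

Since ArrayBuffer stores the data that underlies other binary structures like TypedArray, the snippets below are not converting from ArrayBuffer to another format. Instead, they are creating a new instance using the data stored in the underlying buffer.

To `TypedArray`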

new Uint8Array(buf);

To `DataView`

new DataView(buf);

To `Buffer`

// create a Buffer over the entire ArrayBuffer
Buffer.from(buf);

// create a Buffer over a slice of the ArrayBuffer
Buffer.from(buf, 0, 10);
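
Neither form copies the data: the resulting Buffer is a view over the same memory as the ArrayBuffer. A minimal sketch with illustrative variable names, assuming a runtime where Buffer is a global:

```javascript
const backing = new ArrayBuffer(10);

const whole = Buffer.from(backing); // covers all 10 bytes
const part = Buffer.from(backing, 2, 4); // covers bytes 2 through 5

// No copy is made: writes through the Buffer show up in the ArrayBuffer.
part[0] = 7;

const seenInBacking = new Uint8Array(backing)[2]; // 7
const seenInWhole = whole[2]; // 7
const partLength = part.length; // 4
```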

To string

As UTF-8:

new TextDecoder().decode(buf);

To array of numbers (`number[]`)

Array.from(new Uint8Array(buf));

To `Blob`

new Blob([buf], { type: "text/plain" });

To `ReadableStream`

The snippet below creates a ReadableStream and enqueues the entire ArrayBuffer as a single chunk.

new ReadableStream({
  start(controller) {
    controller.enqueue(buf);
    controller.close();
  },
});

With chunking

To stream the ArrayBuffer in chunks, use a Uint8Array view and enqueue each chunk.

const view = new Uint8Array(buf);
const chunkSize = 1024;

new ReadableStream({
  start(controller) {
    for (let i = 0; i < view.length; i += chunkSize) {
      controller.enqueue(view.slice(i, i + chunkSize));
    }
    controller.close();
  },
});
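
The chunking loop above calls slice, which copies each chunk. Where copies are not needed, subarray produces views over the same memory instead. The helper below is a hypothetical sketch (chunkViews is a name made up for this example) that isolates just the splitting step so it can be checked without a stream.

```javascript
// Hypothetical helper: split a Uint8Array into views of at most `size` bytes.
function chunkViews(bytes, size) {
  const chunks = [];
  for (let i = 0; i < bytes.length; i += size) {
    // subarray() creates a view; no bytes are copied
    chunks.push(bytes.subarray(i, i + size));
  }
  return chunks;
}

const data = new Uint8Array(2500);
const pieces = chunkViews(data, 1024);

const pieceLengths = pieces.map(p => p.length); // [1024, 1024, 452]
const sharesMemory = pieces[0].buffer === data.buffer; // true
```

The trade-off is that views stay tied to the original buffer: if the source is modified after a chunk is enqueued, the consumer sees the modified bytes, so copying with slice remains the safer default.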

From `TypedArray`

To `ArrayBuffer`

This retrieves the underlying ArrayBuffer. Note that a TypedArray can be a view of only part of the underlying buffer, so the sizes may differ.

arr.buffer;
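
When the sizes do differ, an ArrayBuffer holding exactly the bytes the view covers can be obtained by slicing with the view's byteOffset and byteLength. A minimal sketch, with illustrative variable names:

```javascript
const tenBytes = new ArrayBuffer(10);
const window4 = new Uint8Array(tenBytes, 2, 4); // views bytes 2 through 5

const viewSize = window4.byteLength; // 4
const backingSize = window4.buffer.byteLength; // 10 (the whole backing buffer)

// Copy out exactly the bytes the view covers.
const exact = window4.buffer.slice(
  window4.byteOffset,
  window4.byteOffset + window4.byteLength,
);
const exactSize = exact.byteLength; // 4
```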

To `DataView`

To create a DataView over the same byte range as the TypedArray:

new DataView(arr.buffer, arr.byteOffset, arr.byteLength);

To `Buffer`

Buffer.from(arr);
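
One caveat applies when the typed array's elements are wider than one byte. As documented for Node.js, Buffer.from copies element values and truncates each to a single byte; the sketch below assumes Bun matches that behavior. To get the raw bytes instead, create the Buffer over the underlying ArrayBuffer. The variable names are illustrative.

```javascript
const words = new Uint16Array([256, 513]);

// Copies element values, keeping only the low 8 bits of each.
const truncated = Buffer.from(words);
const truncatedBytes = Array.from(truncated); // [0, 1]

// Views the underlying bytes instead (no copy, shares memory with `words`).
const rawBytes = Buffer.from(words.buffer, words.byteOffset, words.byteLength);
const rawLength = rawBytes.length; // 4
```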

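To string

As UTF-8:

new TextDecoder().decode(arr);

To array of numbers (`number[]`)

Array.from(arr);

To `Blob`

// only valid when arr is a view of its entire backing buffer;
// otherwise pass the typed array itself: new Blob([arr])
new Blob([arr.buffer], { type: "text/plain" });

To `ReadableStream`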

new ReadableStream({
  start(controller) {
    controller.enqueue(arr);
    controller.close();
  },
});

With chunking

To stream the TypedArray in chunks, split it into pieces and enqueue each one individually.

new ReadableStream({
  start(controller) {
    for (let i = 0; i < arr.length; i += chunkSize) {
      controller.enqueue(arr.slice(i, i + chunkSize));
    }
    controller.close();
  },
});

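From `DataView`

To `ArrayBuffer`

view.buffer;

To `TypedArray`

This only works if the byteLength of the DataView is a multiple of the TypedArray's element size (BYTES_PER_ELEMENT) and its byteOffset is aligned to that size.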

new Uint8Array(view.buffer, view.byteOffset, view.byteLength);
new Uint16Array(view.buffer, view.byteOffset, view.byteLength / 2);
new Uint32Array(view.buffer, view.byteOffset, view.byteLength / 4);
// etc...
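
If the DataView starts at a byteOffset that is not a multiple of the element size, the constructors above throw. One possible workaround, sketched here with illustrative variable names, is to copy the byte range into a fresh ArrayBuffer first so that the offset becomes 0.

```javascript
const raw = new ArrayBuffer(8);
const offsetView = new DataView(raw, 1, 4); // bytes 1 through 4

// byteOffset 1 is not a multiple of 2, so a direct Uint16Array view fails.
let alignmentError = null;
try {
  new Uint16Array(offsetView.buffer, offsetView.byteOffset, offsetView.byteLength / 2);
} catch (err) {
  alignmentError = err; // a RangeError
}

// Copying the byte range into a fresh ArrayBuffer resets the offset to 0.
const copiedRange = offsetView.buffer.slice(
  offsetView.byteOffset,
  offsetView.byteOffset + offsetView.byteLength,
);
const u16s = new Uint16Array(copiedRange);
const u16Count = u16s.length; // 2
```

The copy no longer shares memory with the original buffer, so writes to it are not reflected in the DataView.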

To `Buffer`

Buffer.from(view.buffer, view.byteOffset, view.byteLength);

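To string

As UTF-8:

new TextDecoder().decode(view);

To array of numbers (`number[]`)

// DataView is not iterable, so read its bytes through a Uint8Array view
Array.from(new Uint8Array(view.buffer, view.byteOffset, view.byteLength));

To `Blob`

// only valid when view spans its entire backing buffer
new Blob([view.buffer], { type: "text/plain" });

To `ReadableStream`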

new ReadableStream({
  start(controller) {
    controller.enqueue(view.buffer);
    controller.close();
  },
});

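With chunking

To stream the DataView in chunks, split its byte range into pieces and enqueue each one individually.

new ReadableStream({
  start(controller) {
    // start at view.byteOffset so a partial view is chunked correctly
    const end = view.byteOffset + view.byteLength;
    for (let i = view.byteOffset; i < end; i += chunkSize) {
      controller.enqueue(view.buffer.slice(i, Math.min(i + chunkSize, end)));
    }
    controller.close();
  },
});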

From `Buffer`

To `ArrayBuffer`

buf.buffer;
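
As with any typed array, the underlying ArrayBuffer of a Buffer can be larger than the Buffer itself. Node.js documents that small Buffers may be allocated from a shared pool; whether Bun pools in the same way is an assumption here. A defensive sketch, with illustrative variable names, copies out only the Buffer's own bytes:

```javascript
const greeting = Buffer.from("hello");

// The backing ArrayBuffer may be a larger shared pool, so its size can exceed
// the Buffer's own length.
const backingIsAtLeastAsLarge = greeting.buffer.byteLength >= greeting.length; // true

// Copy out exactly the bytes that belong to this Buffer.
const ownBytes = greeting.buffer.slice(
  greeting.byteOffset,
  greeting.byteOffset + greeting.byteLength,
);

const ownSize = ownBytes.byteLength; // 5
const ownText = new TextDecoder().decode(ownBytes); // "hello"
```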

To `TypedArray`

new Uint8Array(buf);

To `DataView`

new DataView(buf.buffer, buf.byteOffset, buf.byteLength);

To string

As UTF-8:

buf.toString();

As base64:

buf.toString("base64");

As hex:

buf.toString("hex");

To array of numbers (`number[]`)

Array.from(buf);

To `Blob`

new Blob([buf], { type: "text/plain" });

To `ReadableStream`

new ReadableStream({
  start(controller) {
    controller.enqueue(buf);
    controller.close();
  },
});

With chunking

To stream the Buffer in chunks, split it into pieces and enqueue each one individually.

new ReadableStream({
  start(controller) {
    for (let i = 0; i < buf.length; i += chunkSize) {
      controller.enqueue(buf.slice(i, i + chunkSize));
    }
    controller.close();
  },
});

From `Blob`

To `ArrayBuffer`

The Blob class provides a convenience method for this purpose.

await blob.arrayBuffer();

To `TypedArray`

await blob.bytes();

To `DataView`

new DataView(await blob.arrayBuffer());

To `Buffer`

Buffer.from(await blob.arrayBuffer());

To string

As UTF-8:

await blob.text();

To array of numbers (`number[]`)

Array.from(await blob.bytes());

To `ReadableStream`

blob.stream();

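From `ReadableStream`

It's common to use Response as a convenient intermediate representation that makes it easier to convert a ReadableStream to other formats.

stream; // ReadableStream

const buffer = await new Response(stream).arrayBuffer();

However, this approach is verbose and adds overhead that slows down overall performance unnecessarily. Bun implements a set of optimized convenience functions for converting ReadableStream to various binary formats.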

To `ArrayBuffer`

// with Response
await new Response(stream).arrayBuffer();

// with Bun function
await Bun.readableStreamToArrayBuffer(stream);

To `Uint8Array`

// with Response
await new Response(stream).bytes();

// with Bun function
await Bun.readableStreamToBytes(stream);

To `TypedArray`

// with Response
const buf = await new Response(stream).arrayBuffer();
new Int8Array(buf);

// with Bun function
new Int8Array(await Bun.readableStreamToArrayBuffer(stream));

To `DataView`

// with Response
const buf = await new Response(stream).arrayBuffer();
new DataView(buf);

// with Bun function
new DataView(await Bun.readableStreamToArrayBuffer(stream));

To `Buffer`

// with Response
const buf = await new Response(stream).arrayBuffer();
Buffer.from(buf);

// with Bun function
Buffer.from(await Bun.readableStreamToArrayBuffer(stream));

To string

As UTF-8:

// with Response
await new Response(stream).text();

// with Bun function
await Bun.readableStreamToText(stream);

To array of numbers (`number[]`)

// with Response
const arr = await new Response(stream).bytes();
Array.from(arr);

// with Bun function
Array.from(new Uint8Array(await Bun.readableStreamToArrayBuffer(stream)));

Bun provides a utility for resolving a ReadableStream to an array of its chunks. Each chunk may be a string, typed array, or ArrayBuffer.

// with Bun function
await Bun.readableStreamToArray(stream);

To `Blob`

await new Response(stream).blob();

To `ReadableStream`

To split a ReadableStream into two streams that can be consumed independently:

const [a, b] = stream.tee();
