update faq.md

Gildas, 2 years ago
Parent
Commit
f5b17a0b3d
1 changed file with 2 additions and 2 deletions
+ 2 - 2
faq.md

@@ -21,9 +21,9 @@ By default, Chrome extensions are not allowed to access to pages stored on the f
 ## How does the self-extracting ZIP format work?
 The self-extracting ZIP files created by SingleFile are essentially regular ZIP files. They take advantage of the flexibility in the ZIP specification, which allows for additional data to be included before and after the ZIP payload. In the case of SingleFile, this feature is used to make the ZIP file appear as an HTML file. In this way, the data before the ZIP payload represents the start of an HTML file, and the data after the payload represents the end of this HTML file. The resulting HTML page is technically invalid because it contains binary data (i.e. the ZIP payload), but it's within the bounds of the HTML specification to allow for such cases. The ZIP payload is wrapped in tags appropriate to its content. By default, it's stored in a comment node (i.e. wrapped by `<!--` and `-->`). However, if the payload contains the closing tag (i.e. `-->`), then the content is wrapped in another pairs of tags (i.e. tags of `noscript` or `script` or `xmp` or `plaintext` elements) whose closing tag does not conflict with the payload. Within this HTML page, there is also a script weighting approximately 50KB designed to extract and display the ZIP payload when the page is opened in a web browser.
 
-The purpose of the embedded script is to interpret the ZIP payload as binary data, extract it, and then display the extracted page with its resources. Initially, the script can use the `window.fetch()` method to read the HTML page in binary form and retrieve the ZIP payload. However, this API doesn't work in Chromium-based and WebKit-based browsers when the page is accessed from the local file system due to security restrictions. To circumvent this, the page is encoded in windows-1251, and binary data is directly retrieved from the Document Object Model (DOM) when using the "universal" self-extracting ZIP format. The choice to use windows-1251 encoding, rather than UTF-8, was made because, in UTF-8, any invalid bytes are converted into the `U+FFFD REPLACEMENT CHARACTER`, making it impractical for this specific purpose due to a resulting significant data loss. With windows-1251 encoding, all bytes can be successfully recovered. 
+The purpose of the embedded script is to interpret the ZIP payload as binary data, extract it, and then display the extracted page with its resources. Initially, the script can use the `window.fetch()` method to read the HTML page in binary form and retrieve the ZIP payload. However, this API doesn't work in Chromium-based and WebKit-based browsers when the page is accessed from the local file system due to security restrictions. To circumvent this, the page is encoded in `windows-1251`, and binary data is directly retrieved from the Document Object Model (DOM) when using the "universal" self-extracting ZIP format. The choice to use `windows-1251` encoding, rather than `UTF-8`, was made because, in `UTF-8`, any invalid bytes are converted into the `U+FFFD REPLACEMENT CHARACTER`, making it impractical for this specific purpose due to a resulting significant data loss. With windows-1251 encoding, all bytes can be successfully recovered. 
 
-Regardless of page encoding, all instances of CR (Carriage Return) and CR+LF (Carriage Return and Line Feed) bytes are replaced with LF (Line Feed) bytes when read from the DOM. As a consequence, additional data needs also to be incorporated into the page to restore this data loss. This task is accomplished by the `<sfz-extra-data>` tag, which contains both the necessary data and the offset specifying the start of the ZIP payload encoded in base64. The data in this tag is read by the embedded script before extracting the ZIP payload in order to restore CR (Carriage Return) and CR+LF (Carriage Return and Line Feed) bytes. Finally, because the zip specification tolerates no more than 64KB of random data after the ZIP payload, this tag is positioned at the end or beginning of the HTML page (i.e. when it weighs more than 64KB).
+Regardless of page encoding, all instances of `CR` (Carriage Return) and `CR+LF` (Carriage Return and Line Feed) bytes are replaced with LF (Line Feed) bytes when read from the DOM. As a consequence, additional data needs also to be incorporated into the page to restore this data loss. This task is accomplished by the `sfz-extra-data` element, which contains both the necessary data and the offset specifying the start of the ZIP payload encoded in base64. The data in this tag is read by the embedded script before extracting the ZIP payload in order to restore `CR` (Carriage Return) and `CR+LF` (Carriage Return and Line Feed) bytes. Finally, because the zip specification tolerates no more than 64KB of random data after the ZIP payload, this tag is positioned at the end or beginning of the HTML page (i.e. when it weighs more than 64KB).
 
 ## What are the permissions requested by SingleFile for?
 The permissions requested by SingleFile are defined in the [manifest.json](https://github.com/gildas-lormeau/SingleFile/blob/master/manifest.json) file. Below are the reasons why they are necessary.
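
A note on the encoding paragraph changed in the hunk above: the difference between `UTF-8` and `windows-1251` can be illustrated outside the browser. The Python sketch below is not SingleFile's code; it only mimics how a decoder treats arbitrary bytes. The helper names (`build_1251_table`, `decode_1251`, `encode_1251`) are hypothetical, and the fallback for any byte that Python's `cp1251` codec leaves undefined (mapping it to the code point of the same value) is an assumption meant to approximate what browsers do.

```python
def build_1251_table() -> list[str]:
    """Map each byte value 0..255 to one character, windows-1251 style."""
    table = []
    for b in range(256):
        try:
            table.append(bytes([b]).decode("cp1251"))
        except UnicodeDecodeError:
            # Assumption: treat a byte the codec leaves undefined as the
            # code point with the same value, so no byte is ever dropped.
            table.append(chr(b))
    return table


TABLE = build_1251_table()
REVERSE = {ch: b for b, ch in enumerate(TABLE)}


def decode_1251(data: bytes) -> str:
    """Bytes to text, one character per byte."""
    return "".join(TABLE[b] for b in data)


def encode_1251(text: str) -> bytes:
    """Text back to the original bytes."""
    return bytes(REVERSE[ch] for ch in text)


# A ZIP-like header followed by bytes that are not valid UTF-8.
payload = bytes([0x50, 0x4B, 0x03, 0x04, 0xFF, 0xFE, 0x98, 0x80])

# UTF-8 decoding substitutes U+FFFD for invalid bytes; the originals are gone.
as_utf8 = payload.decode("utf-8", errors="replace")
utf8_round_trip = as_utf8.encode("utf-8")

# The single-byte mapping is reversible for every byte value.
single_byte_round_trip = encode_1251(decode_1251(payload))
```

Because each byte maps to a distinct character, the text read back can be converted to the exact original bytes, which is the property the FAQ relies on.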
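
The newline paragraph in the same hunk describes a lossy step (`CR` and `CR+LF` become `LF`) that is undone with side data. A minimal Python sketch of that idea follows. The layout of the side data here, a list of `(position, original_sequence)` pairs, is purely hypothetical; the diff does not describe how `sfz-extra-data` actually stores it, and the function names are illustrative only.

```python
CR, LF = 0x0D, 0x0A


def normalize(data: bytes) -> tuple[bytes, list[tuple[int, bytes]]]:
    """Replace CR and CR+LF with LF, recording what was replaced.

    Returns the normalized bytes plus a list of (position, original)
    pairs, where position indexes the LF in the normalized output.
    """
    out = bytearray()
    fixes: list[tuple[int, bytes]] = []
    i = 0
    while i < len(data):
        if data[i] == CR:
            if i + 1 < len(data) and data[i + 1] == LF:
                fixes.append((len(out), b"\r\n"))
                i += 2
            else:
                fixes.append((len(out), b"\r"))
                i += 1
            out.append(LF)
        else:
            out.append(data[i])
            i += 1
    return bytes(out), fixes


def restore(normalized: bytes, fixes: list[tuple[int, bytes]]) -> bytes:
    """Undo normalize() using the recorded (position, original) pairs."""
    out = bytearray()
    prev = 0
    for pos, original in fixes:
        out += normalized[prev:pos]  # bytes that were left untouched
        out += original              # put back CR or CR+LF
        prev = pos + 1               # skip the substituted LF
    out += normalized[prev:]
    return bytes(out)


payload = b"PK\r\n\x00\r\x01\n\r\r\n"
normalized, fixes = normalize(payload)
restored = restore(normalized, fixes)
```

Plain `LF` bytes in the payload are never recorded, so only the positions that were actually altered need to be carried in the side data.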