update faq.md

Gildas 2 years ago
parent
commit
51bf0f62e9
1 changed file with 1 addition and 1 deletion

+ 1 - 1
faq.md

@@ -21,7 +21,7 @@ By default, Chrome extensions are not allowed to access to pages stored on the f
 ## How does the self-extracting ZIP format work?
 The self-extracting ZIP files created by SingleFile are essentially regular ZIP files. They take advantage of the flexibility in the ZIP specification, which allows for additional data to be included before and after the ZIP payload. In the case of SingleFile, this feature is used to make the ZIP file appear as an HTML file. In this way, the data before the ZIP payload represents the start of an HTML file, and the data after the payload represents the end of this HTML file. As a result, this HTML page is technically invalid because it contains binary data (i.e. the ZIP payload), but it's within the bounds of the HTML specification to allow for such cases. Within this HTML page, there is also a script weighting approximately 50KB designed to extract and display the ZIP payload when the page is opened in a web browser.
 
-The purpose of the embedded script is to interpret the ZIP payload as binary data, extract it, and then display the extracted page with its resources. Initially, the script can use the `window.fetch()` method to read the HTML page in binary form and retrieve the ZIP payload. However, this API doesn't work in Chromium-based and WebKit-based browsers when the page is accessed from the local file system due to security restrictions. To circumvent this and when using the universal self-extracting ZIP format, the page is encoded in windows-1251, and binary data is directly retrieved from the Document Object Model (DOM). The choice to use windows-1251 encoding, rather than UTF-8, was made because, in UTF-8, any invalid characters are converted into the "U+FFFD REPLACEMENT CHARACTER," making it impractical for this specific purpose due to a resulting significant data loss. With windows-1251 encoding, all bytes can be successfully recovered. In any case though, all instances of CR (Carriage Return) and CR+LF (Carriage Return Line Feed) bytes are replaced with LF (Line Feed) bytes. As a result, additional data needs to be incorporated into the page to restore this data loss. This task is accomplished by the `<sfz-extra-data>` tag, which contains both the necessary data and the offset specifying the start of the ZIP payload encoded in base64. Finally, because the zip specification tolerates no more than 64KB of random data after the payload, this tag is positioned at the end or beginning of the page (i.e. when it weighs more than 64KB).
+The purpose of the embedded script is to interpret the ZIP payload as binary data, extract it, and then display the extracted page with its resources. Initially, the script can use the `window.fetch()` method to read the HTML page in binary form and retrieve the ZIP payload. However, this API doesn't work in Chromium-based and WebKit-based browsers when the page is accessed from the local file system due to security restrictions. To circumvent this and when using the universal self-extracting ZIP format, the page is encoded in windows-1251, and binary data is directly retrieved from the Document Object Model (DOM). The choice to use windows-1251 encoding, rather than UTF-8, was made because, in UTF-8, any invalid characters are converted into the "U+FFFD REPLACEMENT CHARACTER," making it impractical for this specific purpose due to a resulting significant data loss. With windows-1251 encoding, all bytes can be successfully recovered. In any case though, all instances of CR (Carriage Return) and CR+LF (Carriage Return Line Feed) bytes are replaced with LF (Line Feed) bytes. As a result, additional data needs to be incorporated into the page to restore this data loss. This task is accomplished by the `<sfz-extra-data>` tag, which contains both the necessary data and the offset specifying the start of the ZIP payload encoded in base64. Finally, because the zip specification tolerates no more than 64KB of random data after the ZIP payload, this tag is positioned at the end or beginning of the page (i.e. when it weighs more than 64KB).
 
 ## What are the permissions requested by SingleFile for?
 The permissions requested by SingleFile are defined in the [manifest.json](https://github.com/gildas-lormeau/SingleFile/blob/master/manifest.json) file. Below are the reasons why they are necessary.
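
The FAQ paragraph touched by this commit says a ZIP archive tolerates extra data before and after the payload. A minimal sketch of that property follows, using Python's standard `zipfile` module rather than SingleFile's embedded script. The file name `index.html` and the HTML wrapper bytes are hypothetical placeholders, not what SingleFile actually emits.

```python
import io
import zipfile

# Build a small ZIP payload in memory (hypothetical content, not SingleFile output).
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("index.html", "<p>archived page</p>")

# Hypothetical HTML wrapper: extra bytes before and after the ZIP payload.
prefix = b"<!DOCTYPE html><html><body><!--"
suffix = b"--></body></html>"
wrapped = prefix + payload.getvalue() + suffix

# zipfile searches the tail of the file for the end-of-central-directory
# record, so a short suffix and an arbitrary prefix do not stop it from
# reading the archive.
with zipfile.ZipFile(io.BytesIO(wrapped)) as archive:
    names = archive.namelist()
    content = archive.read("index.html").decode("utf-8")

print(names, content)
```

The suffix here is only a few bytes long. The paragraph's remark about the 64KB limit after the ZIP payload is why a larger trailer could not simply be appended in the same way.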