Gildas 1 year ago
Parent
Commit
c16dea5973
1 changed file with 1 addition and 7 deletions
+ 1 - 7
faq.md

@@ -19,13 +19,7 @@ These elements need JavaScript to work properly. By default, SingleFile removes
 By default, Chrome extensions are not allowed to access to pages stored on the filesystem. Therefore, you must enable the option "Allow access to file URLs" in the extension page to display the infobar when viewing a saved page, or to save a page stored on the filesystem.
 
 ## How does the self-extracting ZIP format work?
-The self-extracting ZIP files created by SingleFile are essentially regular ZIP files. They take advantage of the flexibility in the ZIP specification, which allows for additional data to be included before and after the ZIP payload. In the case of SingleFile, this feature is used to make the ZIP file appear as an HTML file. In this way, the data before the ZIP payload represents the start of an HTML file, and the data after the payload represents the end of this HTML file. The resulting HTML page is technically invalid because it contains binary data (i.e. the ZIP payload), but it's within the bounds of the HTML specification to allow for such cases. Within this HTML page, there is also an embedded script weighting approximately 50KB designed to extract and display the ZIP payload when the file is opened in a web browser and interpreted as a web page.
-
-By default, the ZIP payload is wrapped in `<!--` and `-->` tags. However, if the payload contains the closing tag `-->`, then it is wrapped in another pairs of tags (i.e. tags of `noscript` or `script` or `xmp` or `plaintext` elements) whose closing tag does not conflict with the payload.
-
-The purpose of the embedded script is to read the ZIP payload as binary data, extract it, and then display the extracted page with its resources. Initially, the script can use the `window.fetch()` method to read the HTML page in binary form and retrieve the ZIP payload. However, this API doesn't work in Chromium-based and WebKit-based browsers when the page is accessed from the local file system due to security restrictions. To circumvent this, the page is encoded in `windows-1251`, and binary data is directly retrieved from the Document Object Model (DOM) when using the "universal" self-extracting ZIP format. The choice of `windows-1251` encoding is preferred over `UTF-8` because it preserves all bytes without significant data loss.
-
-Regardless of page encoding, all instances of `CR` (Carriage Return) and `CR+LF` (Carriage Return and Line Feed) bytes are replaced with `LF` (Line Feed) bytes when read from the DOM. As a consequence, additional data needs also to be incorporated into the page to restore this data loss when using the "universal" self-extracting ZIP format. This task is accomplished by the `sfz-extra-data` element which contains this data encoded in base64. The data in this element is read by the embedded script before extracting the ZIP payload in order to restore `CR` (Carriage Return) and `CR+LF` (Carriage Return and Line Feed) bytes. Finally, because the zip specification tolerates no more than 64KB of random data after the ZIP payload, this element is positioned at the end or beginning of the HTML page (i.e. when it weighs more than 64KB).
+The self-extracting ZIP files created by SingleFile are essentially regular ZIP files. They take advantage of the flexibility in the ZIP specification, which allows for additional data to be included before and after the ZIP payload. See this [presentation](https://github.com/gildas-lormeau/Polyglot-HTML-ZIP-PNG) for more info.
 
 ## What are the permissions requested by SingleFile for?
 The permissions requested by SingleFile are defined in the [manifest.json](https://github.com/gildas-lormeau/SingleFile/blob/master/manifest.json) file. Below are the reasons why they are necessary.
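
The sentence kept by this commit (ZIP readers tolerate extra data before and after the ZIP payload) can be illustrated with a minimal sketch. This is not SingleFile's code: the helper names `build_polyglot` and `read_polyglot`, the sample page content, and the HTML wrapper strings are assumptions made for illustration, and the sketch relies only on Python's standard `zipfile` module locating the archive inside a larger byte string.

```python
import io
import zipfile


def build_polyglot(files: dict[str, bytes], prefix: bytes, suffix: bytes) -> bytes:
    """Build an ordinary ZIP in memory, then surround it with extra bytes.

    Hypothetical helper: the prefix and suffix stand in for the start and
    end of an HTML document wrapped around the ZIP payload.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return prefix + buffer.getvalue() + suffix


def read_polyglot(blob: bytes) -> dict[str, bytes]:
    """Read the entries back with a stock ZIP reader.

    zipfile searches the tail of the file for the end-of-central-directory
    record and compensates for the bytes placed in front of the archive.
    """
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Illustrative wrapper: the payload sits inside an HTML comment.
polyglot = build_polyglot(
    {"index.html": b"<h1>Saved page</h1>"},
    prefix=b"<!doctype html><html><body><!--",
    suffix=b"--></body></html>",
)
entries = read_polyglot(polyglot)
```

The trailing data must stay short for this to work, because a reader only looks for the end-of-central-directory record within a bounded window at the end of the file.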