DomCloneResource represents an HTML document, but works on a clone of the DOM it was given.

It allows a web page to be snapshotted in its current state, so that later modifications to either the original DOM or the clone do not affect the other.

The original document is cloned at construction time. Its frames can be (recursively) cloned with cloneFramedDocs.

See its parent class DomResource for further info.

Example

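// Snapshot the current page by cloning its DOM.
const domResource = new DomCloneResource(window.document)
// Also clone the documents inside its frames, recursively (deep = true).
domResource.cloneFramedDocs(true)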

Constructors

  • Parameters

    • originalDoc: Document

      The Document to clone.

    • Optional url: `${string}:${string}`

      Since the passed Document already has a URL property (doc.URL), the url parameter is optional. If passed, it overrides the value of doc.URL when resolving relative URLs.

    • config: GlobalConfig = {}

      Optional environment configuration.

    Returns DomCloneResource
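
To illustrate what overriding the document URL means for relative links, here is a minimal sketch using the standard URL constructor. The helper resolveTarget and the example URLs are hypothetical and not part of the library; the sketch only shows how the choice of base URL changes the result.

```javascript
// Hypothetical helper (not a library function): resolve a link target
// against an explicit override URL if given, else against the document's URL.
function resolveTarget(target, docUrl, overrideUrl) {
  const baseUrl = overrideUrl ?? docUrl
  return new URL(target, baseUrl).href
}

const docUrl = 'https://example.org/blog/post.html'
const overrideUrl = 'https://mirror.example/archive/post.html'

const withoutOverride = resolveTarget('img/photo.png', docUrl)
const withOverride = resolveTarget('img/photo.png', docUrl, overrideUrl)

console.log(withoutOverride) // https://example.org/blog/img/photo.png
console.log(withOverride) // https://mirror.example/archive/img/photo.png
```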

Accessors

  • get doc(): Document
  • The clone of the Document.

    Returns Document

  • get originalDoc(): Document
  • The Document that was cloned.

    Returns Document

  • get url(): `${string}:${string}`
  • URL of the resource.

    Returns `${string}:${string}`

  • get blob(): Blob
  • A Blob with the current resource content.

    Returns Blob

  • get string(): string
  • The DOM as a string (the document's outerHTML).

    Returns string

  • Get the Links found in the document: both those defined by attributes (e.g. an <a>’s href or <img>’s src) and those defined in inline CSS (e.g. a background: url(…) in a <style> element).

    It includes links directly contained in the document itself, as well as in its iframes with a srcdoc attribute (because such iframes are not treated as subresources).

    The target of a Link can be modified, which updates the resource content accordingly.

    However, even though the content of a link is ‘live’ (i.e. its target is read and written directly from/to the DOM), the list of links is created only at the construction of the DomResource. Thus, if the DOM is modified afterwards, any newly created links will be missing from this list.

    Returns HtmlLink[]
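
The difference between a ‘live’ link target and a list that is built only once can be sketched with a toy model. Everything below (the elements array, makeLink) is a hypothetical stand-in for the DOM and for the library's link objects, meant only to show the behaviour described above.

```javascript
// Toy stand-in for a DOM: elements are plain objects with attributes.
const elements = [
  { tag: 'a', attrs: { href: 'https://example.org/a' } },
  { tag: 'img', attrs: { src: 'https://example.org/b.png' } },
]

// Hypothetical link factory: the target is read from and written to the element.
function makeLink(element, attribute) {
  return {
    get target() { return element.attrs[attribute] },
    set target(value) { element.attrs[attribute] = value },
  }
}

// The list is built once, as it would be at construction time.
const links = elements.map(el => makeLink(el, el.tag === 'a' ? 'href' : 'src'))

// Writing a link's target updates the underlying element.
links[0].target = 'https://example.org/changed'

// Reading is live too: a direct change to the element shows up in the link.
elements[1].attrs.src = 'https://example.org/c.png'

// An element added afterwards does not appear in the existing list.
elements.push({ tag: 'a', attrs: { href: 'https://example.org/new' } })

console.log(links.length, elements.length)
```

In this model, picking up the new element would require building the list again.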

  • get linksInDom(): HtmlLink[]
  • The links contained directly in the DOM itself.

    Note that this excludes links contained within iframes. It does include links contained in inline stylesheets (<style> elements and style attributes), but not those in <link>ed stylesheets.

    Returns HtmlLink[]

  • get iframeSrcDocs(): DomResource[]
  • A list of DomResources corresponding to documents in iframes with the srcdoc attribute (note that these documents are not considered subresources).

    Returns DomResource[]

  • An array of Links (a subset of links), containing only subresource links, and for whose subresourceType a Resource subclass exists. That is, those links that fromLink accepts.

    Returns SubresourceLink[]

Methods

  • Get the original node corresponding to a given node in the document clone.

    Type Parameters

    • T extends Node = Node

    Parameters

    • nodeInClone: T

    Returns T

  • Get the cloned node corresponding to a given node in the original document.

    Type Parameters

    • T extends Node = Node

    Parameters

    • nodeInOriginal: T

    Returns T
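
The two lookups above are each other's inverse. One way to picture this is a pair of maps filled while cloning a tree. This is only a sketch on toy node objects: how the library actually relates nodes is not stated here, and cloneTree and both maps are hypothetical names.

```javascript
// Clone a toy tree, recording the correspondence in both directions.
function cloneTree(node, originalToClone, cloneToOriginal) {
  const copy = { name: node.name, children: [] }
  originalToClone.set(node, copy)
  cloneToOriginal.set(copy, node)
  for (const child of node.children) {
    copy.children.push(cloneTree(child, originalToClone, cloneToOriginal))
  }
  return copy
}

const original = {
  name: 'html',
  children: [{ name: 'body', children: [{ name: 'p', children: [] }] }],
}
const originalToClone = new WeakMap()
const cloneToOriginal = new WeakMap()
const clone = cloneTree(original, originalToClone, cloneToOriginal)

// Look up the counterpart of a node in either direction.
const pInOriginal = original.children[0].children[0]
const pInClone = originalToClone.get(pInOriginal)
const backAgain = cloneToOriginal.get(pInClone)

// Modifying the clone leaves the original untouched.
pInClone.name = 'div'
console.log(pInOriginal.name, pInClone.name)
```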

  • Create a DomCloneResource for each document in an (i)frame in the original document.

    The created clones are associated with the (i)frame elements. To access a clone, use getContentDocOfFrame.

    Parameters

    • deep: boolean = false

      If true, also clone any frames inside the frames, recursively.

    Returns void
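
The effect of the deep flag can be sketched on toy documents, where each frame holds a contentDocument that may itself contain frames. The functions cloneToyDoc and cloneToyFrames are hypothetical and only model the recursion, not the library's implementation.

```javascript
// Copy a toy document. Frames in the copy start without cloned content.
function cloneToyDoc(doc) {
  return {
    title: doc.title,
    original: doc,
    frames: doc.frames.map(() => ({ clonedContent: null })),
  }
}

// Clone the documents inside the frames; recurse only if deep is true.
function cloneToyFrames(clone, deep = false) {
  clone.original.frames.forEach((frame, i) => {
    const inner = cloneToyDoc(frame.contentDocument)
    clone.frames[i].clonedContent = inner
    if (deep) cloneToyFrames(inner, true)
  })
}

// A page with frame A, which in turn contains frame B.
const docB = { title: 'B', frames: [] }
const docA = { title: 'A', frames: [{ contentDocument: docB }] }
const page = { title: 'page', frames: [{ contentDocument: docA }] }

const shallow = cloneToyDoc(page)
cloneToyFrames(shallow)

const deep = cloneToyDoc(page)
cloneToyFrames(deep, true)

console.log(shallow.frames[0].clonedContent.frames[0].clonedContent) // null
console.log(deep.frames[0].clonedContent.frames[0].clonedContent.title) // B
```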

  • Get the clone of the framed Document for a given (i)frame element.

    As cloning a DOM does not clone the documents inside its frames (the contentDocument of a frame in the cloned resource is null), this method lets you obtain a clone of the framed document.

    On the first invocation, the frame content is cloned from the original document. Subsequent invocations will return this same object.

    Returns

    A clone of the document in the frame.

    Parameters

    • frameElement: FrameElement

      The frame element for which to get the inner document. Either the frame of the original or of the cloned document can be passed.

    Returns null | DomCloneResource
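
The clone-on-first-call behaviour described above resembles a memoised lookup. Below is a minimal sketch on toy frame objects; getToyFrameClone and frameClones are hypothetical names, and returning null for a frame without a content document is an assumption of this sketch.

```javascript
// Cache of clones, keyed by frame object.
const frameClones = new WeakMap()
let cloneCount = 0

function getToyFrameClone(frame) {
  // Assumption: a frame without an inner document yields null.
  if (frame.contentDocument === null) return null
  if (!frameClones.has(frame)) {
    cloneCount += 1
    // A shallow copy stands in for cloning the framed document.
    frameClones.set(frame, { ...frame.contentDocument })
  }
  return frameClones.get(frame)
}

const frame = { contentDocument: { title: 'inner page' } }
const first = getToyFrameClone(frame)
const second = getToyFrameClone(frame)
const emptyResult = getToyFrameClone({ contentDocument: null })

console.log(first === second, cloneCount) // true 1
```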

  • Make the DOM ‘dry’: try to make its HTML represent its current state as accurately as possible.

    Drying performs several transformations:

    Returns void

  • Update the srcdoc value of <iframe>s that have it, to have it reflect the current state of the DOM inside the frame (thus including any changes made after the frame contents were loaded, by either freeze-dry or other scripts).

    Returns void

  • Create a DomResource from a Blob of HTML and a URL.

    Example

    const response = await fetch('https://example.org/page.html')
    const domResource = DomResource.fromBlob({ blob: await response.blob(), url: response.url })

    Returns

    A new DomResource, created by parsing the given HTML.

    Parameters

    • __namedParameters: { blob: Blob; url: `${string}:${string}`; config?: GlobalConfig }
      • blob: Blob
      • url: `${string}:${string}`
      • Optional config?: GlobalConfig

    Returns Promise<DomResource>

  • Make ‘outward’ links absolute, and ‘within-document’ links relative (e.g. href="#top").

    Returns void
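
As a rough sketch of this normalisation, the standard URL constructor can resolve each target against the document URL. The helper normalizeLink is hypothetical, and the rule used here for ‘within-document’ (same URL apart from the fragment, with a non-empty fragment) is an assumption rather than the library's definition.

```javascript
// Hypothetical helper: absolute URL for outward links, bare fragment
// for links that point into the document itself.
function normalizeLink(target, docUrl) {
  const absolute = new URL(target, docUrl)
  const base = new URL(docUrl)
  // Compare the two URLs with their fragments stripped.
  const sameDocument = absolute.href.split('#')[0] === base.href.split('#')[0]
  if (sameDocument && absolute.hash) return absolute.hash
  return absolute.href
}

const docUrl = 'https://example.org/page.html'

const results = [
  normalizeLink('other.html', docUrl),
  normalizeLink('/img/a.png', docUrl),
  normalizeLink('#top', docUrl),
  normalizeLink('https://example.org/page.html#top', docUrl),
]

console.log(results)
```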

  • Fetch the resource a given link points to, and return it as a Resource.

    This method does not modify the given link; the caller can store the created Resource in link.resource, to grow a tree of links and resources.

    Example

    link.resource = await Resource.fromLink(link)

    Returns

    The newly created Resource.

    Parameters

    • link: SubresourceLink

      The link pointing to the resource.

    • config: GlobalConfig & { fetchResource?: Fetchy; signal?: AbortSignal } = {}

      Optional environment configuration.

    Returns Promise<Resource>

  • Determine the Resource subclass to use for the given subresource type.

    Returns

    The appropriate Resource subclass, or undefined if the type is not supported.

    Parameters

    • subresourceType: undefined | SubresourceType

      The type of subresource expected by the parent resource, e.g. 'image' or 'style'. Note this is not the same as its MIME type.

    Returns undefined | ResourceFactory
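
A lookup of this kind can be pictured as a table from subresource type to class. The classes and the table below are hypothetical toys (only the type names 'image' and 'style' come from the text above); the sketch also shows how such a lookup could be used to keep only the links whose type is supported.

```javascript
// Toy classes standing in for Resource subclasses.
class ToyResource {}
class ToyImageResource extends ToyResource {}
class ToyStylesheetResource extends ToyResource {}

// Hypothetical table from subresource type to class.
const classByType = new Map([
  ['image', ToyImageResource],
  ['style', ToyStylesheetResource],
])

// Returns the matching class, or undefined for unknown or missing types.
function getToyResourceClass(subresourceType) {
  if (subresourceType === undefined) return undefined
  return classByType.get(subresourceType)
}

// Keep only the links for which a class exists.
const candidateLinks = [
  { subresourceType: 'image' },
  { subresourceType: 'video' },
  { subresourceType: undefined },
]
const supported = candidateLinks.filter(
  link => getToyResourceClass(link.subresourceType) !== undefined
)

console.log(supported.length) // 1
```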