This example contains several levels of the building block view, ranging from the level-1 overview down to the very specific, low-level details of level 3.

5.1 Whitebox HtmlSanityChecker

Whitebox (HtmlSC)

Rationale: We used functional decomposition to separate responsibilities:

  • HSC Core shall encapsulate checking logic and HTML parsing/processing.
  • Plugins and the Graphical UI encapsulate all usage aspects.

Contained Blackboxes:

Building block | Description
HSC Core | HTML parsing and sanity checking.
HSC Gradle Plugin | Exposes HtmlSC via a standard Gradle plugin, as described in the Gradle user guide. Source: package org.aim42.htmlsanitycheck, classes HtmlSanityCheckPlugin and HtmlSanityCheckTask.
NetUtil | Package org.aim42.inet. Checks for internet connectivity; configuration of HTTP status codes.
FileUtil | Package org.aim42.filesystem. File extensions etc.
HSC Graphical UI | Planned, not implemented.

5.2 Building Blocks - Level 2

5.2.1 HSC Core (Whitebox)

HSC-Core (Whitebox)

Rationale: The internal structure of HSC Core follows a functional decomposition:

  • configuration,
  • parsing and handling HTML input,
  • checking,
  • creating suggestions and
  • collecting checking results

Contained Blackboxes:

Building block | Description
Checker | Contains the pure checking functionality. See its blackbox description below.
AllChecksRunner | Facade to the Checkers. Provides a (configurable) interface. Source: org.aim42.htmlsanitycheck.AllChecksRunner. Called by the HSC Gradle Plugin.
Configuration | Handles configuration of input and output locations, timeouts, status-code behavior and the types of checks to be performed.
Reporter | Reports checking results to either console or file.
Suggester | In case of checking issues, suggests alternatives (“did you mean xyz?”). Suggestions are included in the results.

Checker (Blackbox)

The abstract class Checker provides the uniform interface (public void check()) to different checking algorithms.
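
To illustrate how such a uniform interface could look, here is a minimal sketch. Only the names Checker, check() and MissingImagesChecker come from this document; the runCheck() hook, the getFindings() accessor, the constructor parameters and the string-based findings are assumptions made for illustration and do not claim to match the actual HtmlSC source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: only Checker, check() and MissingImagesChecker
// are named in the documentation; everything else is assumed.
abstract class Checker {
    private final List<String> findings = new ArrayList<>();

    // Uniform entry point: the sequence is fixed here,
    // the concrete algorithm is supplied by subclasses.
    public final void check() {
        findings.clear();
        findings.addAll(runCheck());
    }

    // Hook (assumed name) that each concrete checking algorithm implements.
    protected abstract List<String> runCheck();

    // Returns a copy so callers cannot modify the internal list.
    public List<String> getFindings() {
        return new ArrayList<>(findings);
    }
}

// One concrete checking algorithm: reports referenced images that do not exist.
class MissingImagesChecker extends Checker {
    private final List<String> referencedImages;
    private final List<String> availableImages;

    MissingImagesChecker(List<String> referencedImages, List<String> availableImages) {
        this.referencedImages = referencedImages;
        this.availableImages = availableImages;
    }

    @Override
    protected List<String> runCheck() {
        List<String> missing = new ArrayList<>();
        for (String image : referencedImages) {
            if (!availableImages.contains(image)) {
                missing.add("image '" + image + "' missing");
            }
        }
        return missing;
    }
}
```

A caller only needs the abstract type: it invokes check() and reads the findings, regardless of which concrete subclass is behind it.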

Based upon polymorphism, the actual checking is handled by subclasses of the abstract Checker class, following the template-method pattern. This keeps the set of checking algorithms extensible.

Suggester (Blackbox)

For a given input (target), Suggester searches within a set of possible values (options) to find the n most similar values. For example:

  • Target = “McDown”
  • Options = {“McUp”, “McDon”, “Mickey”}
  • The resulting suggestion would be “McDon”, because it has the greatest similarity to the target “McDown”.

The implementation is based upon the Jaro-Winkler distance, one of the algorithms to calculate similarity between strings.
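
The following sketch shows how the n most similar options could be selected with a hand-written Jaro-Winkler similarity (higher values mean more similar). It is an illustration only: the class name SuggesterSketch, the method names, and the commonly used parameters (prefix scaling factor 0.1, prefix length capped at 4, case-sensitive comparison) are assumptions, and the actual Suggester may well rely on a library instead.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only; not the actual org.aim42 Suggester source.
class SuggesterSketch {

    // Jaro similarity in [0, 1]; 1.0 means the strings are identical.
    static double jaro(String s1, String s2) {
        if (s1.equals(s2)) return 1.0;
        int len1 = s1.length();
        int len2 = s2.length();
        if (len1 == 0 || len2 == 0) return 0.0;

        // Characters count as matching only within this window.
        int window = Math.max(0, Math.max(len1, len2) / 2 - 1);
        boolean[] matched1 = new boolean[len1];
        boolean[] matched2 = new boolean[len2];
        int matches = 0;
        for (int i = 0; i < len1; i++) {
            int start = Math.max(0, i - window);
            int end = Math.min(i + window + 1, len2);
            for (int j = start; j < end; j++) {
                if (matched2[j] || s1.charAt(i) != s2.charAt(j)) continue;
                matched1[i] = true;
                matched2[j] = true;
                matches++;
                break;
            }
        }
        if (matches == 0) return 0.0;

        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < len1; i++) {
            if (!matched1[i]) continue;
            while (!matched2[k]) k++;
            if (s1.charAt(i) != s2.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        return (m / len1 + m / len2 + (m - outOfOrder / 2.0) / m) / 3.0;
    }

    // Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
    static double jaroWinkler(String s1, String s2) {
        double jaro = jaro(s1, s2);
        int maxPrefix = Math.min(4, Math.min(s1.length(), s2.length()));
        int prefix = 0;
        while (prefix < maxPrefix && s1.charAt(prefix) == s2.charAt(prefix)) {
            prefix++;
        }
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }

    // Returns up to n options, most similar to the target first.
    static List<String> findSuggestions(String target, Collection<String> options, int n) {
        return options.stream()
                .sorted(Comparator.comparingDouble(
                        (String option) -> jaroWinkler(target, option)).reversed())
                .limit(Math.max(0, n))
                .collect(Collectors.toList());
    }
}
```

With the example above, SuggesterSketch.findSuggestions("McDown", Arrays.asList("McUp", "McDon", "Mickey"), 1) yields a list containing only "McDon", since its score of roughly 0.97 is well above those of "McUp" (about 0.69) and "Mickey" (0.6).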

Suggester is used in the following cases:

  • Broken image links: Compares the name of the missing image with all available image file names to find the closest match.
  • Missing cross references (broken internal links): Compares the broken link with all available link targets (anchors).

Source: package org.aim42.htmlsanitycheck.suggest.Suggester

5.3 Building Blocks - Level 3

5.3.1 ResultsCollector (Whitebox)

Results Collector (Whitebox)

Rationale: This structure follows the hierarchy of checks, managing results for:

  1. a number of pages/documents (PerRunResults),
  2. a single HTML page (SinglePageResults) and finally
  3. the results of a single check, e.g. the MissingImagesChecker (SingleCheckResults).

Contained Blackboxes:

Building block | Description
PerRunResults | Aggregated results for potentially many HTML pages/documents.
SinglePageResults | Aggregated results for a single HTML page.
SingleCheckResults | Results for a single type of check (e.g. missing-images check or broken-internal-link check).
Finding | A single finding (e.g. “image ‘logo.png’ missing”). Can contain suggestions.
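
The nesting of these building blocks can be sketched as plain container classes. The class names are taken from the table above; all fields, constructors and methods (such as addFinding or nrOfFindings) are hypothetical and chosen only to show how findings could be aggregated from a single check up to a complete run.

```java
import java.util.ArrayList;
import java.util.List;

// A single finding, optionally carrying suggestions.
class Finding {
    final String message;
    final List<String> suggestions;

    Finding(String message, List<String> suggestions) {
        this.message = message;
        this.suggestions = new ArrayList<>(suggestions);
    }
}

// Results of one type of check on one page (field names are assumed).
class SingleCheckResults {
    final String checkName;
    private final List<Finding> findings = new ArrayList<>();

    SingleCheckResults(String checkName) {
        this.checkName = checkName;
    }

    void addFinding(Finding finding) {
        findings.add(finding);
    }

    int nrOfFindings() {
        return findings.size();
    }
}

// Aggregates the results of all checks performed on a single HTML page.
class SinglePageResults {
    final String pageName;
    private final List<SingleCheckResults> checkResults = new ArrayList<>();

    SinglePageResults(String pageName) {
        this.pageName = pageName;
    }

    void addCheckResults(SingleCheckResults results) {
        checkResults.add(results);
    }

    int nrOfFindings() {
        int total = 0;
        for (SingleCheckResults results : checkResults) {
            total += results.nrOfFindings();
        }
        return total;
    }
}

// Aggregates the results of all pages checked in one run.
class PerRunResults {
    private final List<SinglePageResults> pageResults = new ArrayList<>();

    void addPageResults(SinglePageResults results) {
        pageResults.add(results);
    }

    int nrOfPages() {
        return pageResults.size();
    }

    int nrOfFindings() {
        int total = 0;
        for (SinglePageResults results : pageResults) {
            total += results.nrOfFindings();
        }
        return total;
    }
}
```

Keeping the counts derived rather than stored means each level stays consistent with the levels below it, at the cost of recomputing the sums on every call.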