HTML5 Accessibility Chops: the figure and figcaption elements
The figure and figcaption elements are two of the new elements in HTML5. Together they promise a way to mark up, with meaning, the structure and relationship between a piece of content and associated content that acts as a descriptive label. As currently implemented in browsers, however, the semantics of figure and figcaption are practically non-existent.
What the HTML5 specification says
The figure element represents some flow content, optionally with a caption, that is self-contained and is typically referenced as a single unit from the main flow of the document.
The element can thus be used to annotate illustrations, diagrams, photos, code listings, etc, that are referred to from the main content of the document, but that could, without affecting the flow of the document, be moved away from that primary content, e.g. to the side of the page, to dedicated pages, or to an appendix.
The figcaption element represents a caption or legend for the rest of the contents of the figure element, if any.
Current practical meaning conveyed by elements in the example:
All very interesting but what can I as a developer do now?
For the general use cases, until the semantics of figure and figcaption have been implemented in browsers and AT, the following is suggested:
- Use a descriptive word at the start of the figcaption content to give users an idea that the content is labelling something, for example “Figure X:” or “Chart Y:”.
- Be consistent in your figcaption labelling within and across pages.
- Place the figcaption (in the code) before the content to be labelled so it is announced prior to the content it is labelling.
- For examples of use with images, refer to HTML5: Techniques for providing useful text alternatives.
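The recommendations above can be sketched in markup as follows. This is a minimal illustration rather than a definitive pattern: the image file name, alt text and caption wording are made up for the example.

```html
<!-- Sketch only: the image file, alt text and caption wording are hypothetical. -->
<figure>
  <!-- The figcaption comes first in the source so it is read before the content it labels,
       and it starts with a descriptive word ("Figure 1:"). -->
  <figcaption>Figure 1: Monthly site visits, January to June</figcaption>
  <img src="visits-chart.png"
       alt="Bar chart showing visits rising steadily from January to June.">
</figure>
```

Using the same "Figure N:" prefix for every caption on a site keeps the labelling consistent within and across pages.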
The background for these recommendations
How to convey the semantics?
The semantics of figcaption can be conveyed visually by the placement of the figcaption, above or below the content it labels and through its proximity to such content. From observations of how figures etc are currently marked up, in some cases the figure element semantics will not be indicated visually, though it may be indicated as part of the figcaption text and/or by the addition of a border or background color. Such visual indications do not provide much value for users who cannot make use of them. While proximity provides some indication of a semantic relationship it alone does not suffice.
The standard method of conveying semantics to assistive technology (AT) is through the defined roles and relationships provided by accessibility APIs. The browser typically maps these roles and relationships to HTML elements, and AT accesses the information through the API the browser exposes. A problem arises with figure and figcaption: figure does not have a specified role, and while figcaption can be mapped to a caption role in some accessibility APIs, others do not provide this role. Element names can be passed through accessibility API properties, but this does not confer a defined accessibility semantic on a given element, so there is no common definition of what a particular element is and does. This can and does lead to interoperability issues across browsers and AT, making it much harder for both users and developers to realize a common user experience across software, devices and platforms.
ARIA to the rescue?
ARIA can help, but does not offer a complete solution:
- It does not include a caption role.
- It does not include a figure role.
- aria-labelledby and aria-describedby may be used to associate figcaption content with figure content, but their use does not provide the role semantics needed to differentiate figcaption semantics from the standard labelling methods of the title attribute and, in the case of images, the alt attribute.
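As a sketch of the association described above, aria-describedby on the labelled content can point at the figcaption by id (aria-labelledby takes an id reference in the same way). The id value, file name and text are hypothetical. As noted, this conveys a relationship but no figure or caption role.

```html
<!-- Sketch only: id, file name and text are hypothetical. -->
<figure>
  <figcaption id="fig2-caption">Figure 2: Quarterly sales by region</figcaption>
  <!-- aria-describedby references the figcaption's id; it adds an association, not a role. -->
  <img src="sales-chart.png"
       alt="Stacked bar chart of quarterly sales for four regions."
       aria-describedby="fig2-caption">
</figure>
```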
For ARIA to really help, two new roles may need to be added:
- caption role: The object contains descriptive information, usually textual, about another user interface element such as a table, chart, or image.
- figure role: The object is a container for a user interface element such as a table, chart, or image and a caption which labels the element.
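If the two roles described above were available, markup using them might look like the sketch below. The role names caption and figure are assumed here for illustration, as are the id, file name and text. Support in browsers and AT is not assumed.

```html
<!-- Illustration of the two proposed roles; role names, id and content are assumptions. -->
<div role="figure" aria-labelledby="fig3-caption">
  <!-- The caption holds the descriptive information that labels the container. -->
  <div role="caption" id="fig3-caption">Figure 3: Page load time by browser</div>
  <img src="load-times.png"
       alt="Line chart comparing page load times across four browsers.">
</div>
```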
Whether the additional roles are needed depends on what will provide the best user experience. Do users want to be made aware of both structures? Should the figcaption content be associated with the figure or the content it contains? Should none, one or both of the structures be voiced by AT? Should the caption always be announced before the figure content, after it, or according to the caption placement (before/after)?
The following scenarios are also available on a test page which has the role information included inline to simulate what would be available to the AT user for each scenario.
Scenario 1: The presence of both figure and caption is announced, and the figure start and end are voiced. The caption is announced before the content. (Simulates the figure being labelled by the figcaption.)
Scenario 2: The presence of figure but not caption is announced, and the figure start and end are voiced. The caption content is announced before the figure content. (Simulates the figure being labelled by the figcaption.)
Scenario 3: The presence of caption but not figure is announced. The caption content is announced before the figure content. (Simulates the figure content being labelled by the figcaption.)
Scenario 4: The presence of caption but not figure is announced. The caption content is announced after the figure content. (Simulates the figure being labelled by the figcaption.)
Scenario 5: The presence of neither caption nor figure is announced. The caption content is announced according to its placement in the code (before/after).
Note: Scenario 5 is what users currently experience.
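One way to simulate the first scenario (figure and caption both announced, figure start and end voiced, caption first) is to write the role information inline as ordinary text, so that any AT reads it out. The sketch below is in the spirit of the test page but is not taken from it. The cue wording ("Figure start", "Caption:", "Figure end") and the image details are assumptions for illustration.

```html
<!-- Simulation sketch: role cues are plain text, so no figure/figcaption support is needed. -->
<div>
  <p>Figure start</p>
  <!-- The caption is placed before the content so it is announced first. -->
  <p>Caption: Figure 1: Monthly site visits, January to June</p>
  <img src="visits-chart.png"
       alt="Bar chart showing visits rising steadily from January to June.">
  <p>Figure end</p>
</div>
```

The other scenarios could be simulated by dropping one or both cues, or by moving the caption paragraph after the image.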
Code example for all scenarios
AT output example Scenario 1
AT output example Scenario 2
AT output example Scenario 3
AT output example Scenario 4
AT output example Scenario 5
What do users want?
I have coded a test page with examples from the scenarios above, simulating what information could be announced and in what order. Please give it a try in your favourite AT and provide comments.