Fix formatting
This commit is contained in:
parent
92d3b9bd5b
commit
071cf90549
@ -1,98 +1,98 @@
|
||||
\newcommand\Luamml{\pkg{Luamml}}
|
||||
\newcommand\luamml{\pkg{luamml}}
|
||||
\newcommand\xmltag[1]{\texttt{<#1>}}
|
||||
\section{\Luamml's representation of XML and MathML}
|
||||
In the following I assume basic familiarity with both Lua\TeX's representation of math noads and MathML.
|
||||
|
||||
\subsection{Representation of XML elements}
|
||||
In many places, \luamml\ passes around XML elements. Every element is represented by a Lua table.
|
||||
Element \texttt 0 must always be present and is a string representing the tag name.
|
||||
The positive integer elements of the table represent child elements (either strings for direct text content or nested tables for nested elements).
|
||||
All string members which do not start with a colon are attributes, whose value is the result of applying \texttt{tostring} to the field value.
|
||||
This implies that these values should almost always be strings, except that the value \texttt 0 (since it never needs a unit) can often be set as a number.
|
||||
For example the XML document
|
||||
\begin{verbatim}
|
||||
<math block="display">
|
||||
<mn>0</mn>
|
||||
<mo> < </mo>
|
||||
<mi mathvariant="normal">x</mi>
|
||||
</math>
|
||||
\end{verbatim}
|
||||
would be represented by the Lua table
|
||||
\begin{verbatim}
|
||||
{[0] = "math", block="display",
|
||||
{[0] = "mn", "0"},
|
||||
{[0] = "mo", "<"},
|
||||
{[0] = "mi", mathvariant="normal", "x"}
|
||||
}
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Expression cores}
|
||||
MathML knows the concept of \enquote{embellished operators}:
|
||||
\begin{blockquote}
|
||||
The precise definition of an \enquote{embellished operator} is:
|
||||
\begin{itemize}
|
||||
\item an \xmltag{mo} element;
|
||||
\item or one of the elements \xmltag{msub}, \xmltag{msup}, \xmltag{msubsup}, \xmltag{munder}, \xmltag{mover}, \xmltag{munderover}, \xmltag{mmultiscripts}, \xmltag{mfrac}, or \xmltag{semantics} (§ 5.1 Annotation Framework), whose first argument exists and is an embellished operator;
|
||||
\item or one of the elements \xmltag{mstyle}, \xmltag{mphantom}, or \xmltag{mpadded}, such that an mrow containing the same arguments would be an embellished operator;
|
||||
\item or an \xmltag{maction} element whose selected sub-expression exists and is an embellished operator;
|
||||
\item or an \xmltag{mrow} whose arguments consist (in any order) of one embellished operator and zero or more space-like elements.
|
||||
\end{itemize}
|
||||
\end{blockquote}
|
||||
For every embellished operator, MathML calls the \xmltag{mo} element defining the embellished operator the \enquote{core} of the embellished operator.
|
||||
|
||||
\Luamml\ makes this slightly more general: Every expression is represented by a pair of two elements: The expression and it's core.
|
||||
The core is always a \xmltag{mo}, \xmltag{mi}, or \xmltag{mn}, \texttt{nil} or s special marker for space like elements.
|
||||
|
||||
If and only if the element is a embellished operator the core is a \xmltag{mo} element representing the core of the embellished operator.
|
||||
The core is a \xmltag{mi} or a \xmltag{mn} element if and only if the element would be an embellished operator with this core if this element where a \xmltag{mo} element.
|
||||
The core is the special space like marker for space like elements. Otherwise the core is \texttt{nil}.
|
||||
|
||||
\subsection{Translation of math noads}
|
||||
A math lists can contain the following node types: noad, fence, fraction, radical, accent, style, choice, ins, mark, adjust, boundary, whatsit, penalty, disc, glue, and kern. The \enquote{noads}
|
||||
|
||||
\subsubsection{Translation of kernel noads}
|
||||
The math noads of this list contain nested kernel noads. So in the first step, we look into how kernel nodes are translated to math nodes.
|
||||
|
||||
\paragraph{\texttt{math_char} kernel noads}
|
||||
First the family and character value in the \texttt{math_char} are used to lookup the Unicode character value of this \texttt{math_char}.
|
||||
(For \texttt{unicode-math}, this is usually just the character value. Legacy maths has to be remapped based on the family.)
|
||||
Then there are two cases: The digits \texttt{0} to \texttt{9} are mapped to \xmltag{mn} elements, everything else becomes a \xmltag{mi} element with \texttt{mathvariant} set to \texttt{normal}.
|
||||
(The \texttt{mathvariant} value might get suppressed if the character defaults to mathvariant \texttt{normal}.)
|
||||
In either case, the \texttt{tex:family} attribute is set to the family number if it's not \texttt{0}.
|
||||
|
||||
The core is always set to the expression itself. E.g.\ the \texttt{math_char} kernel noad \verb+\fam3 a+ would become (assuming no remapping for this family)
|
||||
\begin{verbatim}
|
||||
{[0] = 'mi',
|
||||
mathvariant = 'normal',
|
||||
["tex:family"] = 3,
|
||||
"a"
|
||||
}
|
||||
\end{verbatim}
|
||||
|
||||
\subsubsection{\texttt{sub_box} kernel noads}
|
||||
I am open to suggestions how to convert them properly.
|
||||
|
||||
\subsubsection{\texttt{sub_mlist} kernel noads}
|
||||
The inner list is converted as a \xmltag{mrow} element, with the core being the core of the \xmltag{mrow} element. See the rules for this later.
|
||||
|
||||
\subsubsection{\texttt{delim} kernel noads}
|
||||
If the \texttt{small_char} is zero, these get converted as space like elements of the form
|
||||
\begin{verbatim}
|
||||
{[0] = 'mspace',
|
||||
width = '1.196pt',
|
||||
}
|
||||
\end{verbatim}
|
||||
where 1.196 is replaced by the current value of \verb+\nulldelimiterspace+ converted to \texttt{bp}.
|
||||
|
||||
Otherwise the same rules as for \texttt{math_char} apply,
|
||||
except that instead of \texttt{mi} or \xmltag{mn} elements,
|
||||
\texttt{mo} elements are generated,
|
||||
\texttt{mathvariant} is never set,
|
||||
\texttt{stretchy} is set to \texttt{true} if the operator is not on the list of default stretchy operators in the MathML specification
|
||||
nd \texttt{lspace} and \texttt{rspace} attributes are set to zero.
|
||||
|
||||
\subsubsection{\texttt{acc} kernel noads}
|
||||
Depending on the surrounding element containing the \texttt{acc} kernel noad, it is either stretchy or not.
|
||||
If it's stretchy, the same rules as for \texttt{delim} apply, except that \texttt{lspace} and \texttt{rspace} are not set.
|
||||
Otherwise the \texttt{stretchy} attribute is set to false if the operator is on the list of default stretchy operators.
|
||||
% \newcommand\Luamml{\pkg{Luamml}}
|
||||
% \newcommand\luamml{\pkg{luamml}}
|
||||
% \newcommand\xmltag[1]{\texttt{<#1>}}
|
||||
% \section{\Luamml's representation of XML and MathML}
|
||||
% In the following I assume basic familiarity with both Lua\TeX's representation of math noads and MathML.
|
||||
%
|
||||
% \subsection{Representation of XML elements}
|
||||
% In many places, \luamml\ passes around XML elements. Every element is represented by a Lua table.
|
||||
% Element \texttt 0 must always be present and is a string representing the tag name.
|
||||
% The positive integer elements of the table represent child elements (either strings for direct text content or nested tables for nested elements).
|
||||
% All string members which do not start with a colon are attributes, whose value is the result of applying \texttt{tostring} to the field value.
|
||||
% This implies that these values should almost always be strings, except that the value \texttt 0 (since it never needs a unit) can often be set as a number.
|
||||
% For example the XML document
|
||||
% \begin{verbatim}
|
||||
% <math block="display">
|
||||
% <mn>0</mn>
|
||||
% <mo> < </mo>
|
||||
% <mi mathvariant="normal">x</mi>
|
||||
% </math>
|
||||
% \end{verbatim}
|
||||
% would be represented by the Lua table
|
||||
% \begin{verbatim}
|
||||
% {[0] = "math", block="display",
|
||||
% {[0] = "mn", "0"},
|
||||
% {[0] = "mo", "<"},
|
||||
% {[0] = "mi", mathvariant="normal", "x"}
|
||||
% }
|
||||
% \end{verbatim}
|
||||
%
|
||||
% \subsection{Expression cores}
|
||||
% MathML knows the concept of \enquote{embellished operators}:
|
||||
% \begin{blockquote}
|
||||
% The precise definition of an \enquote{embellished operator} is:
|
||||
% \begin{itemize}
|
||||
% \item an \xmltag{mo} element;
|
||||
% \item or one of the elements \xmltag{msub}, \xmltag{msup}, \xmltag{msubsup}, \xmltag{munder}, \xmltag{mover}, \xmltag{munderover}, \xmltag{mmultiscripts}, \xmltag{mfrac}, or \xmltag{semantics} (§ 5.1 Annotation Framework), whose first argument exists and is an embellished operator;
|
||||
% \item or one of the elements \xmltag{mstyle}, \xmltag{mphantom}, or \xmltag{mpadded}, such that an mrow containing the same arguments would be an embellished operator;
|
||||
% \item or an \xmltag{maction} element whose selected sub-expression exists and is an embellished operator;
|
||||
% \item or an \xmltag{mrow} whose arguments consist (in any order) of one embellished operator and zero or more space-like elements.
|
||||
% \end{itemize}
|
||||
% \end{blockquote}
|
||||
% For every embellished operator, MathML calls the \xmltag{mo} element defining the embellished operator the \enquote{core} of the embellished operator.
|
||||
%
|
||||
% \Luamml\ makes this slightly more general: Every expression is represented by a pair of two elements: The expression and it's core.
|
||||
% The core is always a \xmltag{mo}, \xmltag{mi}, or \xmltag{mn}, \texttt{nil} or s special marker for space like elements.
|
||||
%
|
||||
% If and only if the element is a embellished operator the core is a \xmltag{mo} element representing the core of the embellished operator.
|
||||
% The core is a \xmltag{mi} or a \xmltag{mn} element if and only if the element would be an embellished operator with this core if this element where a \xmltag{mo} element.
|
||||
% The core is the special space like marker for space like elements. Otherwise the core is \texttt{nil}.
|
||||
%
|
||||
% \subsection{Translation of math noads}
|
||||
% A math lists can contain the following node types: noad, fence, fraction, radical, accent, style, choice, ins, mark, adjust, boundary, whatsit, penalty, disc, glue, and kern. The \enquote{noads}
|
||||
%
|
||||
% \subsubsection{Translation of kernel noads}
|
||||
% The math noads of this list contain nested kernel noads. So in the first step, we look into how kernel nodes are translated to math nodes.
|
||||
%
|
||||
% \paragraph{\texttt{math_char} kernel noads}
|
||||
% First the family and character value in the \texttt{math_char} are used to lookup the Unicode character value of this \texttt{math_char}.
|
||||
% (For \texttt{unicode-math}, this is usually just the character value. Legacy maths has to be remapped based on the family.)
|
||||
% Then there are two cases: The digits \texttt{0} to \texttt{9} are mapped to \xmltag{mn} elements, everything else becomes a \xmltag{mi} element with \texttt{mathvariant} set to \texttt{normal}.
|
||||
% (The \texttt{mathvariant} value might get suppressed if the character defaults to mathvariant \texttt{normal}.)
|
||||
% In either case, the \texttt{tex:family} attribute is set to the family number if it's not \texttt{0}.
|
||||
%
|
||||
% The core is always set to the expression itself. E.g.\ the \texttt{math_char} kernel noad \verb+\fam3 a+ would become (assuming no remapping for this family)
|
||||
% \begin{verbatim}
|
||||
% {[0] = 'mi',
|
||||
% mathvariant = 'normal',
|
||||
% ["tex:family"] = 3,
|
||||
% "a"
|
||||
% }
|
||||
% \end{verbatim}
|
||||
%
|
||||
% \subsubsection{\texttt{sub_box} kernel noads}
|
||||
% I am open to suggestions how to convert them properly.
|
||||
%
|
||||
% \subsubsection{\texttt{sub_mlist} kernel noads}
|
||||
% The inner list is converted as a \xmltag{mrow} element, with the core being the core of the \xmltag{mrow} element. See the rules for this later.
|
||||
%
|
||||
% \subsubsection{\texttt{delim} kernel noads}
|
||||
% If the \texttt{small_char} is zero, these get converted as space like elements of the form
|
||||
% \begin{verbatim}
|
||||
% {[0] = 'mspace',
|
||||
% width = '1.196pt',
|
||||
% }
|
||||
% \end{verbatim}
|
||||
% where 1.196 is replaced by the current value of \verb+\nulldelimiterspace+ converted to \texttt{bp}.
|
||||
%
|
||||
% Otherwise the same rules as for \texttt{math_char} apply,
|
||||
% except that instead of \texttt{mi} or \xmltag{mn} elements,
|
||||
% \texttt{mo} elements are generated,
|
||||
% \texttt{mathvariant} is never set,
|
||||
% \texttt{stretchy} is set to \texttt{true} if the operator is not on the list of default stretchy operators in the MathML specification
|
||||
% nd \texttt{lspace} and \texttt{rspace} attributes are set to zero.
|
||||
%
|
||||
% \subsubsection{\texttt{acc} kernel noads}
|
||||
% Depending on the surrounding element containing the \texttt{acc} kernel noad, it is either stretchy or not.
|
||||
% If it's stretchy, the same rules as for \texttt{delim} apply, except that \texttt{lspace} and \texttt{rspace} are not set.
|
||||
% Otherwise the \texttt{stretchy} attribute is set to false if the operator is on the list of default stretchy operators.
|
||||
|
Loading…
Reference in New Issue
Block a user