Python
Release 0.8.5
Steve Canny
Contents
1 What it can do
2 User Guide
2.1 Installing
2.2 Quickstart
2.3 Working with Documents
2.4 Working with Text
2.5 Working with Sections
2.6 API basics
2.7 Understanding Styles
2.8 Working with Styles
2.9 Understanding pictures and other shapes
3 API Documentation
3.1 Document objects
3.2 Style-related objects
3.3 Text-related objects
3.4 Table objects
3.5 Section objects
3.6 Shape-related objects
3.7 DrawingML objects
3.8 Shared classes
3.9 Enumerations
4 Contributor Guide
4.1 Analysis
CHAPTER 1
What it can do
document.add_paragraph(
    'first item in unordered list', style='ListBullet'
)
document.add_paragraph(
    'first item in ordered list', style='ListNumber'
)
document.add_picture('monty-truth.png', width=Inche
CHAPTER 2
User Guide
2.1 Installing
Note: python-docx versions 0.3.0 and later are not API-compatible with prior versions.
python-docx is hosted on PyPI, so installation is relatively simple, and just depends on what installation utilities
you have installed.
python-docx may be installed with pip if you have it available:
pip install python-docx
If neither pip nor easy_install is available, it can be installed manually by downloading the distribution from
PyPI, unpacking the tarball, and running setup.py:
tar xvzf python-docx-{version}.tar.gz
cd python-docx-{version}
python setup.py install
python-docx depends on the lxml package. Both pip and easy_install will take care of satisfying those
dependencies for you, but if you use this last method you will need to install those yourself.
2.1.1 Dependencies
Python 2.6, 2.7, 3.3, or 3.4
lxml >= 2.3.2
2.2 Quickstart
Getting started with python-docx is easy. Let's walk through the basics.
This opens up a blank document based on the default template, pretty much what you get when you start a new document in Word using the built-in defaults. You can open and work on an existing Word document using python-docx,
but we'll keep things simple for the moment.
This method returns a reference to the newly added paragraph at the end of the document. The new paragraph
reference is assigned to paragraph in this case, but I'll be leaving that out in the following examples unless I have
a need for it. In your code, you often won't be doing anything with the item after you've added it, so there's not
a lot of sense in keeping a reference to it hanging around.
It's also possible to use one paragraph as a cursor and insert a new paragraph directly above it:
prior_paragraph = paragraph.insert_paragraph_before('Lorem ipsum')
This allows a paragraph to be inserted in the middle of a document, something that's often important when modifying
an existing document rather than generating one from scratch.
By default, this adds a top-level heading, what appears in Word as Heading 1. When you want a heading for a
sub-section, just specify the level you want as an integer between 1 and 9:
document.add_heading('The role of dolphins', level=2)
If you specify a level of 0, a Title paragraph is added. This can be handy to start a relatively short document that
doesn't have a separate title page.
If you find yourself using this very often, it's probably a sign you could benefit by better understanding paragraph
styles. One paragraph style property you can set is to break a page immediately before each paragraph having that
style. So you might set your headings of a certain level to always start a new page. More on styles later. They turn out
to be critically important for really getting the most out of Word.
Tables have several properties and methods you'll need in order to populate them. Accessing individual cells is
probably a good place to start. As a baseline, you can always access a cell by its row and column indices:
cell = table.cell(0, 1)
This gives you the right-hand cell in the top row of the table we just created. Note that row and column indices are
zero-based, just like in list access.
Once you have a cell, you can put something in it:
cell.text = 'parrot, possibly dead'
Frequently it's easier to access a row of cells at a time, for example when populating a table of variable length from a
data source. The .rows property of a table provides access to individual rows, each of which has a .cells property.
The .cells property on both Row and Column supports indexed access, like a list:
row = table.rows[1]
row.cells[0].text = 'Foo bar to you.'
row.cells[1].text = 'And a hearty foo bar to you too sir!'
The .rows and .columns collections on a table are iterable, so you can use them directly in a for loop. Same with
the .cells sequences on a row or column:
for row in table.rows:
    for cell in row.cells:
        print(cell.text)
If you want a count of the rows or columns in the table, just use len() on the sequence:
row_count = len(table.rows)
col_count = len(table.columns)
This can be very handy for the variable length table scenario we mentioned above:
# get table data -------------
items = get_things_from_database_or_something()

# add table ------------------
table = document.add_table(1, 3)

# populate header row --------
heading_cells = table.rows[0].cells
heading_cells[0].text = 'Qty'
heading_cells[1].text = 'SKU'
heading_cells[2].text = 'Description'

# add a data row for each item
for item in items:
    cells = table.add_row().cells
    cells[0].text = str(item.qty)
    cells[1].text = item.sku
    cells[2].text = item.desc
The same works for columns, although I've yet to see a use case for it.
Word has a set of pre-formatted table styles you can pick from its table style gallery. You can apply one of those to the
table like this:
table.style = 'LightShading-Accent1'
The style name is formed by removing all the spaces from the table style name. You can find the table style name by
hovering your mouse over its thumbnail in Word's table style gallery.
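The space-removal rule can be sketched in plain Python. The helper name is hypothetical, and the UI name 'Light Shading - Accent 1' is an assumption about how the style used above is labelled in the gallery:

```python
def style_id_from_name(ui_name):
    """Return the identifier formed by removing all spaces from a UI style name."""
    return ui_name.replace(' ', '')


print(style_id_from_name('Light Shading - Accent 1'))  # LightShading-Accent1
```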
This example uses a path, which loads the image file from the local filesystem. You can also use a file-like object,
essentially any object that acts like an open file. This might be handy if you're retrieving your image from a database
or over a network and don't want to get the filesystem involved.
Image size
By default, the added image appears at native size. This is often bigger than you want. Native size is calculated as
pixels / dpi. So a 300x300 pixel image having 300 dpi resolution appears in a one inch square. The problem
is most images don't contain a dpi property and it defaults to 72 dpi. This would make the same image appear 4.167
inches on a side, somewhere around half the page.
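The native-size arithmetic above can be checked with a few lines of plain Python. The function name is hypothetical; the 72 dpi fallback is the default described in the text:

```python
def native_size_inches(pixels, dpi=72):
    """Return the native size in inches of one image dimension (pixels / dpi)."""
    return pixels / dpi


print(native_size_inches(300, 300))       # 1.0
print(round(native_size_inches(300), 3))  # 4.167
```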
To get the image the size you want, you can specify either its width or height in convenient units, like inches or
centimeters:
from docx.shared import Inches
document.add_picture('image-filename.png', width=Inches(1.0))
You're free to specify both width and height, but usually you wouldn't want to. If you specify only one,
python-docx uses it to calculate the properly scaled value of the other. This way the aspect ratio is preserved
and your picture doesn't look stretched.
The Inches and Cm classes are provided to let you specify measurements in handy units. Internally, python-docx
uses English Metric Units, 914400 to the inch. So if you forget and just put something like width=2 you'll get an
extremely small image :). You'll need to import them from the docx.shared sub-package. You can use them in
arithmetic just as if they were integers, which in fact they are. So an expression like width = Inches(3) /
thing_count works just fine.
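The unit arithmetic can be sketched without python-docx, using only the 914400 EMU-per-inch factor given above. The helper names and the value of `thing_count` are hypothetical. Under Python 3 the `/` operator on integers yields a float, so this sketch uses `//` to keep a whole number of EMU:

```python
EMU_PER_INCH = 914400  # English Metric Units per inch


def inches_to_emu(inches):
    """Convert inches to a whole number of EMU."""
    return int(inches * EMU_PER_INCH)


def emu_to_inches(emu):
    """Convert EMU back to inches."""
    return emu / EMU_PER_INCH


thing_count = 4  # hypothetical number of equal-width items
width = inches_to_emu(3) // thing_count

print(width)                 # 685800
print(emu_to_inches(width))  # 0.75
print(emu_to_inches(2) < 0.00001)  # True: width=2 means 2 EMU, a tiny fraction of an inch
```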
This particular style causes the paragraph to appear as a bullet, a very handy thing. You can also apply a style afterward.
These two lines are equivalent to the one above:
paragraph = document.add_paragraph('Lorem ipsum dolor sit amet.')
paragraph.style = 'ListBullet'
The style is specified using its style ID, ListBullet in this example. Generally, the style ID is formed by removing the
spaces in the style name as it appears in the Word user interface (UI). So the style List Number 3 would be specified
as ListNumber3. However, note that if you are using a localized version of Word, the style ID may be derived
from the English style name and may not correspond so neatly to its style name in the Word UI.
This produces a paragraph that looks just like one created from a single string. It's not apparent where paragraph text
is broken into runs unless you look at the XML. Note the trailing space at the end of the first string. You need to be
explicit about where spaces appear at the beginning and end of a run. They're not automatically inserted between runs.
Expect to be caught by that one a few times :).
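The spacing pitfall can be illustrated with plain strings standing in for runs. This is a sketch of the concatenation behavior described above, not a call into python-docx:

```python
# Run texts are joined end to end; nothing is inserted between them.
with_spaces = ''.join(['Lorem ipsum ', 'dolor', ' sit amet.'])
without_spaces = ''.join(['Lorem ipsum', 'dolor', 'sit amet.'])

print(with_spaces)     # Lorem ipsum dolor sit amet.
print(without_spaces)  # Lorem ipsumdolorsit amet.
```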
Run objects have both a .bold and .italic property that allows you to set their value for a run:
paragraph = document.add_paragraph('Lorem ipsum ')
run = paragraph.add_run('dolor')
run.bold = True
paragraph.add_run(' sit amet.')
which produces text that looks like this, with 'dolor' in bold: Lorem ipsum dolor sit amet.
Note that you can set bold or italic right on the result of .add_run() if you don't need it for anything else:
paragraph.add_run('dolor').bold = True
# is equivalent to:
run = paragraph.add_run('dolor')
run.bold = True
It's not necessary to provide text to the .add_paragraph() method. This can make your code simpler if you're
building the paragraph up from runs anyway:
paragraph = document.add_paragraph()
paragraph.add_run('Lorem ipsum ')
paragraph.add_run('dolor').bold = True
paragraph.add_run(' sit amet.')
You can also apply a style to a run after it is created. This code produces the same result as the lines above:
paragraph = document.add_paragraph('Normal text, ')
run = paragraph.add_run('text with emphasis.')
run.style = 'Emphasis'
As with a paragraph style, the style ID is formed by removing the spaces in the name as it appears in the Word UI.
So the style Subtle Emphasis would be specified as SubtleEmphasis. Note that if you are using a localized
version of Word, the style ID may be derived from the English style name and may not correspond to its style name in
the Word UI.
This creates a new document from the built-in default template and saves it unchanged to a file named test.docx. The
so-called default template is actually just a Word file having no content, stored with the installed python-docx
package. It's roughly the same as you get by picking the Word Document template after selecting Word's File > New
from Template... menu item.
Things to note:
You can open any Word 2007 or later file this way (.doc files from Word 2003 and earlier won't work). While
you might not be able to manipulate all the contents yet, whatever is already in there will load and save just
fine. The feature set is still being built out, so you can't add or change things like headers or footnotes yet, but
if the document has them python-docx is polite enough to leave them alone and smart enough to save them
without actually understanding what they are.
If you use the same filename to open and save the file, python-docx will obediently overwrite the original
file without a peep. You'll want to make sure that's what you intend.
Okay, so you've got a document open and are pretty sure you can save it somewhere later. Next step is to get some
content in there ...
>>> paragraph_format.alignment
None # indicating alignment is inherited from the style hierarchy
>>> paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER
>>> paragraph_format.alignment
CENTER (1)
Indentation
Indentation is the horizontal space between a paragraph and the edge of its container, typically the page margin. A
paragraph can be indented separately on the left and right side. The first line can also have a different indentation than
the rest of the paragraph. A first line indented further than the rest of the paragraph has a first-line indent. A first line
indented less has a hanging indent.
Indentation is specified using a Length value, such as Inches, Pt, or Cm. Negative values are valid and cause the
paragraph to overlap the margin by the specified amount. A value of None indicates the indentation value is inherited
from the style hierarchy. Assigning None to an indentation property removes any directly-applied indentation setting
and restores inheritance from the style hierarchy:
>>> from docx.shared import Inches
>>> paragraph = document.add_paragraph()
>>> paragraph_format = paragraph.paragraph_format
>>> paragraph_format.left_indent
None  # indicating indentation is inherited from the style hierarchy
>>> paragraph_format.left_indent = Inches(0.5)
>>> paragraph_format.left_indent
457200
>>> paragraph_format.left_indent.inches
0.5
First-line indent is specified using the first_line_indent property and is interpreted relative to the left indent.
A negative value indicates a hanging indent:
>>> paragraph_format.first_line_indent
None
>>> paragraph_format.first_line_indent = Inches(-0.25)
>>> paragraph_format.first_line_indent
-228600
>>> paragraph_format.first_line_indent.inches
-0.25
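Because the first-line indent is interpreted relative to the left indent, the first line's distance from the margin is the sum of the two values. A small sketch using the EMU values shown above; the helper name is hypothetical, and treating None as zero is a simplification, since None really means the value is inherited:

```python
EMU_PER_INCH = 914400


def first_line_offset(left_indent, first_line_indent):
    """Return the first line's distance from the margin, in EMU.

    None is treated as 0 in this sketch (an assumption; it actually
    means the value comes from the style hierarchy).
    """
    return (left_indent or 0) + (first_line_indent or 0)


offset = first_line_offset(457200, -228600)
print(offset / EMU_PER_INCH)  # 0.25
```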
Paragraph spacing
The space_before and space_after properties control the spacing before and after a paragraph, respectively.
Inter-paragraph spacing is collapsed during page layout, meaning the spacing between two paragraphs is the maximum of the space_after of the first paragraph and the space_before
of the second paragraph. Paragraph spacing is specified as a Length value, often using Pt:
>>> from docx.shared import Pt
>>> paragraph_format.space_before, paragraph_format.space_after
(None, None)  # inherited by default
>>> paragraph_format.space_before = Pt(18)
>>> paragraph_format.space_before.pt
18.0
>>> paragraph_format.space_after = Pt(12)
>>> paragraph_format.space_after.pt
12.0
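The collapsing rule described above reduces to taking a maximum. A sketch in plain Python, with a hypothetical helper name and values in points:

```python
def gap_between(space_after_first, space_before_second):
    """Return the collapsed spacing between two consecutive paragraphs."""
    return max(space_after_first, space_before_second)


# Two consecutive paragraphs that both use the settings shown above:
print(gap_between(12.0, 18.0))  # 18.0, not 30.0
```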
Line spacing
Line spacing is the distance between subsequent baselines in the lines of a paragraph. Line spacing can be specified
either as an absolute distance or relative to the line height (essentially the point size of the font used). A typical
absolute measure would be 18 points. A typical relative measure would be double-spaced (2.0 line heights). The
default line spacing is single-spaced (1.0 line heights).
Line spacing is controlled by the interaction of the line_spacing and line_spacing_rule properties.
line_spacing is either a Length value, a (small-ish) float, or None. A Length value indicates an
absolute distance. A float indicates a number of line heights. None indicates line spacing is inherited.
line_spacing_rule is a member of the WD_LINE_SPACING enumeration or None:
>>> from docx.shared import Length
>>> paragraph_format.line_spacing
None
>>> paragraph_format.line_spacing_rule
None
>>> paragraph_format.line_spacing = Pt(18)
>>> isinstance(paragraph_format.line_spacing, Length)
True
>>> paragraph_format.line_spacing.pt
18.0
>>> paragraph_format.line_spacing_rule
EXACTLY (4)
>>> paragraph_format.line_spacing = 1.75
>>> paragraph_format.line_spacing
1.75
>>> paragraph_format.line_spacing_rule
MULTIPLE (5)
Pagination properties
Four paragraph properties, keep_together, keep_with_next, page_break_before, and
widow_control, control aspects of how the paragraph behaves near page boundaries.
keep_together causes the entire paragraph to appear on the same page, issuing a page break before the paragraph
if it would otherwise be broken across two pages.
keep_with_next keeps a paragraph on the same page as the subsequent paragraph. This can be used, for example,
to keep a section heading on the same page as the first paragraph of the section.
page_break_before causes a paragraph to be placed at the top of a new page. This could be used on a chapter
heading to ensure chapters start on a new page.
widow_control breaks a page to avoid placing the first or last line of the paragraph on a separate page from the
rest of the paragraph.
All four of these properties are tri-state, meaning they can take the value True, False, or None. None indicates
the property value is inherited from the style hierarchy. True means on and False means off:
>>> paragraph_format.keep_together
None # all four inherit by default
>>> paragraph_format.keep_with_next = True
>>> paragraph_format.keep_with_next
True
>>> paragraph_format.page_break_before = False
>>> paragraph_format.page_break_before
False
Many font properties are tri-state, meaning they can take the values True, False, and None. True means the
property is on, False means it is off. Conceptually, the None value means inherit. A run exists in the style
inheritance hierarchy and by default inherits its character formatting from that hierarchy. Any character formatting
directly applied using the Font object overrides the inherited values.
Bold and italic are tri-state properties, as are all-caps, strikethrough, superscript, and many others. See the Font API
documentation for a full list:
>>> font.bold, font.italic
(None, None)
>>> font.italic = True
>>> font.italic
True
>>> font.italic = False
>>> font.italic
False
>>> font.italic = None
>>> font.italic
None
Underline is a bit of a special case. It is a hybrid of a tri-state property and an enumerated value property. True means
single underline, by far the most common. False means no underline, but more often None is the right choice if no
underlining is wanted. The other forms of underlining, such as double or dashed, are specified with a member of the
WD_UNDERLINE enumeration:
>>> font.underline
None
>>> font.underline = True
>>> # or perhaps
>>> font.underline = WD_UNDERLINE.DOT_DASH
Font color
Each Font object has a ColorFormat object that provides access to its color, accessed via its read-only color
property.
Apply a specific RGB color to a font:
>>> from docx.shared import RGBColor
>>> font.color.rgb = RGBColor(0x42, 0x24, 0xE9)
A font can also be set to a theme color by assigning a member of the MSO_THEME_COLOR_INDEX enumeration:
>>> from docx.enum.dml import MSO_THEME_COLOR
>>> font.color.theme_color = MSO_THEME_COLOR.ACCENT_1
A font's color can be restored to its default (inherited) value by assigning None to either the rgb or theme_color
attribute of ColorFormat:
>>> font.color.rgb = None
Determining the color of a font begins with determining its color type:
>>> font.color.type
RGB (1)
The value of the type property can be a member of the MSO_COLOR_TYPE enumeration or None.
MSO_COLOR_TYPE.RGB indicates it is an RGB color. MSO_COLOR_TYPE.THEME indicates a theme color.
MSO_COLOR_TYPE.AUTO indicates its value is determined automatically by the application, usually set to black.
(This value is relatively rare.) None indicates no color is applied and the color is inherited from the style hierarchy;
this is the most common case.
When the color type is MSO_COLOR_TYPE.RGB, the rgb property will be an RGBColor value indicating the RGB
color:
>>> font.color.rgb
RGBColor(0x42, 0x24, 0xe9)
When the color type is MSO_COLOR_TYPE.THEME, the theme_color property will be a member of
MSO_THEME_COLOR_INDEX indicating the theme color:
>>> font.color.theme_color
ACCENT_1 (5)
It's theoretically possible for a document not to have any explicit sections, although I've yet to see this occur in the
wild. If you're accessing an unpredictable population of .docx files you may want to provide for that possibility using
a len() check or try block to avoid an uncaught IndexError exception stopping your program.
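Both guards can be sketched as small helpers. The function names are hypothetical, and `sections` stands for any indexable sequence such as the value of `document.sections`; plain lists are used here so the sketch runs on its own:

```python
def first_section_or_none(sections):
    """Return the first section, or None when there are no sections (len() check)."""
    if len(sections) == 0:
        return None
    return sections[0]


def last_section_or_none(sections):
    """Return the last section, or None when there are no sections (try block)."""
    try:
        return sections[-1]
    except IndexError:
        return None


print(first_section_or_none([]))          # None
print(last_section_or_none(['a', 'b']))   # b
```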
Page margins
Seven properties on Section together specify the various edge spacings that determine where text appears on the
page:
>>> from docx.shared import Inches
>>> section.left_margin, section.right_margin
(1143000, 1143000) # (Inches(1.25), Inches(1.25))
>>> section.top_margin, section.bottom_margin
(914400, 914400) # (Inches(1), Inches(1))
>>> section.gutter
0
>>> section.header_distance, section.footer_distance
(457200, 457200) # (Inches(0.5), Inches(0.5))
>>> section.left_margin = Inches(1.5)
>>> section.right_margin = Inches(1)
>>> section.left_margin, section.right_margin
(1371600, 914400)
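The margins determine how much horizontal room is left for text. A sketch of that arithmetic using the values just assigned; the helper name is hypothetical, the 8.5 inch page width (US Letter) is an assumption not taken from the example above, and the gutter is taken to be 0 as shown:

```python
EMU_PER_INCH = 914400


def text_width(page_width, left_margin, right_margin):
    """Return the horizontal space between the left and right margins, in EMU."""
    return page_width - left_margin - right_margin


page_width = int(8.5 * EMU_PER_INCH)  # assumed US Letter width
width = text_width(page_width, 1371600, 914400)
print(width / EMU_PER_INCH)  # 6.0
```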
2.7.3 Glossary
style definition A <w:style> element in the styles part of a document that explicitly defines the attributes of a
style.
defined style A style that is explicitly defined in a document. Contrast with latent style.
built-in style One of the set of 276 pre-set styles built into Word, such as Heading 1. A built-in style can be either
defined or latent. A built-in style that is not yet defined is known as a latent style. Both defined and latent built-in
styles may appear as options in Word's style panel and style gallery.
custom style Also known as a user defined style, any style defined in a Word document that is not a built-in style.
Note that a custom style cannot be a latent style.
latent style A built-in style having no definition in a particular document is known as a latent style in that document.
A latent style can appear as an option in the Word UI depending on the settings in the LatentStyles object
for the document.
recommended style list A list of styles that appears in the styles toolbox or panel when Recommended is selected
from the List: dropdown box.
Style Gallery The selection of example styles that appear in the ribbon of the Word UI and which may be applied by
clicking on one of them.
In brief, a style appears in the recommended list if its hidden property is False (the default). If a style is
not hidden and its quick_style property is True, it also appears in the style gallery. If a hidden style's
unhide_when_used property is True, its hidden property is set False the first time it is used. Styles in the
style lists and style gallery are sorted in priority order, then alphabetically for styles of the same priority. If a
style's locked property is True and formatting restrictions are turned on for the document, the style will not appear
in any list or the style gallery and cannot be applied to content.
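These rules can be restated as a few predicates. This is a sketch of the logic as described above, not code from python-docx; the function names, the `restrictions_on` flag, and the sample priorities are hypothetical:

```python
def in_recommended_list(hidden, locked=False, restrictions_on=False):
    """A style is listed unless it is hidden, or locked while restrictions are on."""
    if locked and restrictions_on:
        return False
    return not hidden


def in_style_gallery(hidden, quick_style, locked=False, restrictions_on=False):
    """The gallery additionally requires quick_style to be True."""
    return in_recommended_list(hidden, locked, restrictions_on) and quick_style


def display_order(styles):
    """Sort (name, priority) pairs by priority, then alphabetically by name."""
    return sorted(styles, key=lambda pair: (pair[1], pair[0]))


print(in_style_gallery(hidden=False, quick_style=True))  # True
print(display_order([('Title', 10), ('Normal', 0), ('Body Text', 10)]))
# [('Normal', 0), ('Body Text', 10), ('Title', 10)]
```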
List 3
List Bullet
List Bullet 2
List Bullet 3
List Continue
List Continue 2
List Continue 3
List Number
List Number 2
List Number 3
List Paragraph
Macro Text
No Spacing
Quote
Subtitle
TOCHeading
Title
Intense Reference
Macro Text Char
Quote Char
Strong
Subtitle Char
Subtle Emphasis
Subtle Reference
Title Char
Note: Built-in styles are stored in a WordprocessingML file using their English name, e.g. Heading 1, even though
users working on a localized version of Word will see native language names in the UI, e.g. Kop 1. Because
python-docx operates on the WordprocessingML file, style lookups must use the English name. A document
available on this external site allows you to create a mapping between local language names and English style names:
http://www.thedoctools.com/index.php?show=mt_create_style_name_list
User-defined styles, also known as custom styles, are not localized and are accessed with the name exactly as it appears
in the Word UI.
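One way to handle localized names is a small lookup table applied before the style lookup. A sketch in plain Python; the helper name is hypothetical, and the table holds only the single pair mentioned above, so a real mapping would need to be filled in for the language in use:

```python
# Localized UI name -> English name stored in the file.
LOCAL_TO_ENGLISH = {
    'Kop 1': 'Heading 1',
}


def english_style_name(ui_name, mapping=LOCAL_TO_ENGLISH):
    """Return the English built-in name, or the name unchanged for custom styles."""
    return mapping.get(ui_name, ui_name)


print(english_style_name('Kop 1'))            # Heading 1
print(english_style_name('My Custom Style'))  # My Custom Style
```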
The Styles object is also iterable. By using the identification properties on BaseStyle, various subsets of the
defined styles can be generated. For example, this code will produce a list of the defined paragraph styles:
>>> from docx.enum.style import WD_STYLE_TYPE
>>> styles = document.styles
>>> paragraph_styles = [
...     s for s in styles if s.type == WD_STYLE_TYPE.PARAGRAPH
... ]
>>> for style in paragraph_styles:
...     print(style.name)
...
Normal
Body Text
List Bullet
A style name can also be assigned directly, in which case python-docx will do the lookup for you:
>>> paragraph.style = 'List Bullet'
>>> paragraph.style
<docx.styles.style._ParagraphStyle object at 0x10a7c4f84>
>>> paragraph.style.name
'List Bullet'
A style can also be applied at creation time using either the style object or its name:
>>> paragraph = document.add_paragraph(style='Body Text')
>>> paragraph.style.name
'Body Text'
>>> body_text_style = document.styles['Body Text']
>>> paragraph = document.add_paragraph(style=body_text_style)
>>> paragraph.style.name
'Body Text'
Use the base_style property to specify a style the new style should inherit formatting settings from:
>>> style.base_style
None
>>> style.base_style = styles['Normal']
>>> style.base_style
<docx.styles.style._ParagraphStyle object at 0x10a7a9550>
>>> style.base_style.name
'Normal'
A style can be removed from the document simply by calling its delete() method:
>>> styles = document.styles
>>> len(styles)
10
>>> styles['Citation'].delete()
>>> len(styles)
9
Note: The Style.delete() method removes the style's definition from the document. It does not affect content
in the document to which that style is applied. Content having a style not defined in the document is rendered using
the default style for that content object, e.g. Normal in the case of a paragraph.
Many font properties are tri-state, meaning they can take the values True, False, and None. True means the
property is on, False means it is off. Conceptually, the None value means inherit. Because a style exists
in an inheritance hierarchy, it is important to have the ability to specify a property at the right place in the hierarchy,
generally as far up the hierarchy as possible. For example, if all headings should be in the Arial typeface, it makes
more sense to set that property on the Heading 1 style and have Heading 2 inherit from Heading 1.
Bold and italic are tri-state properties, as are all-caps, strikethrough, superscript, and many others. See the Font API
documentation for a full list:
>>> font.bold, font.italic
(None, None)
>>> font.italic = True
>>> font.italic
True
>>> font.italic = False
>>> font.italic
False
>>> font.italic = None
>>> font.italic
None
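How a None value falls through to the base style can be sketched with a small stand-in class. This illustrates the idea of inheritance only and does not call python-docx; the class, the helper, and the final fallback of False are assumptions made for the example:

```python
class StyleSketch:
    """Hypothetical stand-in for a style with one tri-state property."""

    def __init__(self, name, base_style=None, italic=None):
        self.name = name
        self.base_style = base_style
        self.italic = italic


def effective_italic(style):
    """Walk up the base-style chain until an explicit True or False is found."""
    while style is not None:
        if style.italic is not None:
            return style.italic
        style = style.base_style
    return False  # assumption: nothing in the chain sets it, so treat it as off


heading_1 = StyleSketch('Heading 1', italic=True)
heading_2 = StyleSketch('Heading 2', base_style=heading_1)

print(effective_italic(heading_2))  # True, inherited from Heading 1
heading_2.italic = False
print(effective_italic(heading_2))  # False, set directly on Heading 2
```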
Underline is a bit of a special case. It is a hybrid of a tri-state property and an enumerated value property. True means
single underline, by far the most common. False means no underline, but more often None is the right choice if no
underlining is wanted since it is rare to inherit it from a base style. The other forms of underlining, such as double or
dashed, are specified with a member of the WD_UNDERLINE enumeration:
>>> font.underline
None
>>> font.underline = True
>>> # or perhaps
>>> font.underline = WD_UNDERLINE.DOT_DASH
The default behavior can be restored by assigning None or the style itself:
>>> heading_1_style = styles['Heading 1']
>>> heading_1_style.next_paragraph_style.name
'Body Text'
>>> heading_1_style.next_paragraph_style = heading_1_style
>>> heading_1_style.next_paragraph_style.name
'Heading 1'
>>> heading_1_style.next_paragraph_style = None
>>> heading_1_style.next_paragraph_style.name
'Heading 1'
A LatentStyles object supports len(), iteration, and dictionary-style access by style name:
>>> len(latent_styles)
161
>>> latent_style_names = [ls.name for ls in latent_styles]
>>> latent_style_names
['Normal', 'Heading 1', 'Heading 2', ... 'TOC Heading']
>>> latent_quote = latent_styles['Quote']
>>> latent_quote
<docx.styles.latent.LatentStyle object at 0x10a7c4f50>
>>> latent_quote.priority
29
CHAPTER 3
API Documentation
add_section(start_type=2)
Return a Section object representing a new section added at the end of the document. The optional start_type argument must be a member of the WD_SECTION_START enumeration, and defaults
to WD_SECTION.NEW_PAGE if not provided.
add_table(rows, cols, style=None)
Add a table having row and column counts of rows and cols respectively and table style of style. style may
be a table style object or a table style name. If style is None, the table inherits the default table
style of the document.
core_properties
A CoreProperties object providing read/write access to the core properties of this document.
inline_shapes
An InlineShapes object providing access to the inline shapes in this document. An inline shape is a
graphical object, such as a picture, contained in a run of text and behaving like a character glyph, being
flowed like other text in a paragraph.
paragraphs
A list of Paragraph instances corresponding to the paragraphs in the document, in document order.
Note that paragraphs within revision marks such as <w:ins> or <w:del> do not appear in this list.
part
The DocumentPart object of this document.
save(path_or_stream)
Save this document to path_or_stream, which can be either a path to a filesystem location (a string) or a
file-like object.
sections
A Sections object providing access to each section in this document.
styles
A Styles object providing access to the styles in this document.
tables
A list of Table instances corresponding to the tables in the document, in document order. Note that only
tables appearing at the top level of the document appear in this list; a table nested inside a table cell does
not appear. A table within revision marks such as <w:ins> or <w:del> will also not appear in the list.
author
string An entity primarily responsible for making the content of the resource.
category
string A categorization of the content of this package. Example values might include: Resume, Letter,
Financial Forecast, Proposal, or Technical Presentation.
comments
string An account of the content of the resource.
content_status
string completion status of the document, e.g. draft
created
datetime time of initial creation of the document
identifier
string An unambiguous reference to the resource within a given context, e.g. ISBN.
keywords
string descriptive words or short phrases likely to be used as search terms for this document
language
string language the document is written in
last_modified_by
string name or other identifier (such as email address) of person who last modified the document
last_printed
datetime time the document was last printed
modified
datetime time the document was last modified
revision
int number of this revision, incremented by Word each time the document is saved. Note however
python-docx does not automatically increment the revision number when it saves a document.
subject
string The topic of the content of the resource.
title
string The name given to the resource.
version
string free-form version string
corresponding to the type of this style, e.g.
unhide_when_used
True if an application should make this style visible the next time it is applied to content. False otherwise.
Note that python-docx does not automatically unhide a style having True for this attribute when it is
applied to content.
A paragraph style. A paragraph style provides both character formatting and paragraph formatting such as
indentation and line-spacing.
base_style
Style object this style inherits from or None if this style is not based on another style.
builtin
Read-only. True if this style is a built-in style. False indicates it is a custom (user-defined) style. Note
this value is based on the presence of a customStyle attribute in the XML, not on specific knowledge of
which styles are built into Word.
delete()
Remove this style definition from the document. Note that calling this method does not remove or change
the style applied to any document content. Content items having the deleted style will be rendered using
the default style, as is any content with a style not defined in the document.
font
The Font object providing access to the character formatting properties for this style, such as font name
and size.
hidden
True if display of this style in the style gallery and list of recommended styles is suppressed. False
otherwise. In order to be shown in the style gallery, this value must be False and quick_style must
be True.
locked
Read/write Boolean. True if this style is locked. A locked style does not appear in the styles panel or
the style gallery and cannot be applied to document content. This behavior is only active when formatting
protection is turned on for the document (via the Developer menu).
name
The UI name of this style.
next_paragraph_style
_ParagraphStyle object representing the style to be applied automatically to a new paragraph inserted
after a paragraph of this style. Returns self if no next paragraph style is defined. Assigning None or self
removes the setting such that new paragraphs are created using this same style.
paragraph_format
The ParagraphFormat object providing access to the paragraph formatting properties for this style
such as indentation.
priority
The integer sort key governing display sequence of this style in the Word UI. None indicates no setting is
defined, causing Word to use the default value of 0. Style name is used as a secondary sort key to resolve
ordering of styles having the same priority value.
quick_style
True if this style should be displayed in the style gallery when hidden is False. Read/write Boolean.
unhide_when_used
True if an application should make this style visible the next time it is applied to content. False otherwise.
Note that python-docx does not automatically unhide a style having True for this attribute when it is
applied to content.
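The interaction of hidden, quick_style, and priority described above can be sketched without python-docx. The StyleInfo tuple and gallery_order function below are hypothetical stand-ins, not part of the library. They model only the stated rules: a style is shown in the gallery when hidden is False and quick_style is True, a priority of None is treated as 0, and the style name breaks ties. That lower priority values sort first is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for a style object; not a python-docx class.
StyleInfo = namedtuple("StyleInfo", "name hidden quick_style priority")


def gallery_order(styles):
    """Return the names of styles shown in the gallery, in display order."""
    visible = [s for s in styles if not s.hidden and s.quick_style]
    # A priority of None means "no setting" and is treated as 0.
    # Assumption: lower priority values are displayed first.
    ordered = sorted(visible, key=lambda s: (s.priority or 0, s.name))
    return [s.name for s in ordered]


styles = [
    StyleInfo("Title", hidden=False, quick_style=True, priority=10),
    StyleInfo("Normal", hidden=False, quick_style=True, priority=None),
    StyleInfo("Body Text", hidden=True, quick_style=True, priority=1),
    StyleInfo("Heading 1", hidden=False, quick_style=True, priority=9),
    StyleInfo("Caption", hidden=False, quick_style=False, priority=0),
]
print(gallery_order(styles))
```

Body Text is dropped because it is hidden and Caption because it is not a quick style; the remaining three are ordered by priority.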
A table style. A table style provides character and paragraph formatting for its contents as well as special table
formatting properties.
base_style
Style object this style inherits from or None if this style is not based on another style.
builtin
Read-only. True if this style is a built-in style. False indicates it is a custom (user-defined) style. Note
this value is based on the presence of a customStyle attribute in the XML, not on specific knowledge of
which styles are built into Word.
delete()
Remove this style definition from the document. Note that calling this method does not remove or change
the style applied to any document content. Content items having the deleted style will be rendered using
the default style, as is any content with a style not defined in the document.
font
The Font object providing access to the character formatting properties for this style, such as font name
and size.
hidden
True if display of this style in the style gallery and list of recommended styles is suppressed. False
otherwise. In order to be shown in the style gallery, this value must be False and quick_style must
be True.
locked
Read/write Boolean. True if this style is locked. A locked style does not appear in the styles panel or
the style gallery and cannot be applied to document content. This behavior is only active when formatting
protection is turned on for the document (via the Developer menu).
name
The UI name of this style.
next_paragraph_style
_ParagraphStyle object representing the style to be applied automatically to a new paragraph inserted
after a paragraph of this style. Returns self if no next paragraph style is defined. Assigning None or self
removes the setting such that new paragraphs are created using this same style.
paragraph_format
The ParagraphFormat object providing access to the paragraph formatting properties for this style
such as indentation.
priority
The integer sort key governing display sequence of this style in the Word UI. None indicates no setting is
defined, causing Word to use the default value of 0. Style name is used as a secondary sort key to resolve
ordering of styles having the same priority value.
quick_style
True if this style should be displayed in the style gallery when hidden is False. Read/write Boolean.
unhide_when_used
True if an application should make this style visible the next time it is applied to content. False otherwise.
Note that python-docx does not automatically unhide a style having True for this attribute when it is
applied to content.
locked
Tri-state value specifying whether this latent style is locked. A locked style does not appear in the styles
panel or the style gallery and cannot be applied to document content. This behavior is only active when
formatting protection is turned on for the document (via the Developer menu).
name
The name of the built-in style this exception applies to.
priority
The integer sort key for this latent style in the Word UI.
quick_style
Tri-state value specifying whether this latent style should appear in the Word styles gallery when not
hidden. None indicates the effective value should be inherited from the default values in its parent
LatentStyles object.
unhide_when_used
Tri-state value specifying whether this style should have its hidden attribute set False the next time
the style is applied to content. None indicates the effective value should be inherited from the default
specified by its parent LatentStyles object.
style name can be assigned in lieu of a paragraph style object. Assigning None removes any applied style,
making its effective value the default paragraph style for the document.
text
String formed by concatenating the text of each run in the paragraph. Tabs and line breaks in the XML are
mapped to \t and \n characters respectively.
Assigning text to this property causes all existing paragraph content to be replaced with a single run
containing the assigned text. A \t character in the text is mapped to a <w:tab/> element and each \n
or \r character is mapped to a line break. Paragraph-level formatting, such as style, is preserved. All
run-level formatting, such as bold or italic, is removed.
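The character mapping applied on assignment can be illustrated with a small tokenizer. This is a sketch of the rule as described, not the library's implementation; split_text and its token names are hypothetical.

```python
import re


def split_text(text):
    """Split text into ('text', s), ('tab',) and ('br',) tokens."""
    tokens = []
    # The capture group keeps the separator characters in the result.
    for piece in re.split(r"([\t\n\r])", text):
        if piece == "\t":
            tokens.append(("tab",))  # would become a <w:tab/> element
        elif piece in ("\n", "\r"):
            tokens.append(("br",))  # would become a line break
        elif piece:
            tokens.append(("text", piece))
    return tokens


print(split_text("a\tb\nc"))
```

Each token corresponds to one child element that would be written into the single replacement run.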
right_indent
Length value specifying the space between the right margin and the right side of the paragraph. None
indicates the right indent value is inherited from the style hierarchy. Use a Cm value object as a convenient
way to apply indentation in units of centimeters.
space_after
Length value specifying the spacing to appear between this paragraph and the subsequent paragraph.
None indicates this value is inherited from the style hierarchy. Length objects provide convenience
properties, such as pt and inches, that allow easy conversion to various length units.
space_before
Length value specifying the spacing to appear between this paragraph and the prior paragraph. None
indicates this value is inherited from the style hierarchy. Length objects provide convenience properties,
such as pt and cm, that allow easy conversion to various length units.
widow_control
True if the first and last lines in the paragraph remain on the same page as the rest of the paragraph when
Word repaginates the document. None indicates its effective value is inherited from the style hierarchy.
style
Read/write. A _CharacterStyle object representing the character style applied to this run. The default
character style for the document (often Default Character Font) is returned if the run has no directly-applied
character style. Setting this property to None removes any directly-applied character style.
text
String formed by concatenating the text equivalent of each run content child element into a Python string.
Each <w:t> element adds the text characters it contains. A <w:tab/> element adds a \t character. A
<w:cr/> or <w:br> element each add a \n character. Note that a <w:br> element can indicate a page
break or column break as well as a line break. All <w:br> elements translate to a single \n character
regardless of their type. All other content child elements, such as <w:drawing>, are ignored.
Assigning text to this property has the reverse effect, translating each \t character to a <w:tab/> element
and each \n or \r character to a <w:cr/> element. Any existing run content is replaced. Run formatting
is preserved.
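The read direction of this mapping can be sketched with the standard library alone. The run_text function below is a hypothetical helper that follows the rules as described; it is not the python-docx implementation.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

RUN_XML = (
    '<w:r xmlns:w="%s">'
    "<w:t>foo</w:t><w:tab/><w:t>bar</w:t>"
    '<w:br w:type="page"/><w:drawing/><w:cr/>'
    "</w:r>" % W
)


def run_text(r):
    """Return the text equivalent of a <w:r> element."""
    parts = []
    for child in r:
        if child.tag == "{%s}t" % W:
            parts.append(child.text or "")
        elif child.tag == "{%s}tab" % W:
            parts.append("\t")
        elif child.tag in ("{%s}br" % W, "{%s}cr" % W):
            # Every <w:br> maps to "\n", whatever its w:type.
            parts.append("\n")
        # Any other child, such as <w:drawing>, is ignored.
    return "".join(parts)


text = run_text(ET.fromstring(RUN_XML))
print(repr(text))
```

The page break and the carriage return both collapse to a newline, so the break type cannot be recovered from the text value.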
underline
The underline style for this Run, one of None, True, False, or a value from WD_UNDERLINE. A value
of None indicates the run has no directly-applied underline value and so will inherit the underline value of
its containing paragraph. Assigning None to this property removes any directly-applied underline value.
A value of False indicates a directly-applied setting of no underline, overriding any inherited value. A
value of True indicates single underline. The values from WD_UNDERLINE are used to specify other
underline styles such as double, wavy, and dotted.
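The tri-state behavior described for underline can be modeled as a first-non-None lookup. The function below is a hypothetical illustration of that rule and makes no claim about how python-docx or Word resolves inheritance internally.

```python
def effective_value(*levels):
    """Return the first value that is not None.

    Arguments are ordered from most specific (the run's directly-applied
    value) to least specific (inherited values).
    """
    for value in levels:
        if value is not None:
            return value
    return None


# Run has no direct setting, so the inherited True applies.
inherited = effective_value(None, True)
# A direct False overrides an inherited True.
overridden = effective_value(False, True)
print(inherited, overridden)
```

The key point is that False and None are not interchangeable: False is a setting, while None means no setting at this level.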
hidden
Read/write tri-state value. When True, causes the text in the run to be hidden from display, unless
application settings force hidden text to be shown.
imprint
Read/write tri-state value. When True, causes the text in the run to appear as if pressed into the page.
italic
Read/write tri-state value. When True, causes the text of the run to appear in italics. None indicates the
effective value is inherited from the style hierarchy.
math
Read/write tri-state value. When True, specifies this run contains WML that should be handled as though
it was Office Open XML Math.
name
Get or set the typeface name for this Font instance, causing the text it controls to appear in the named
font, if a matching font is found. None indicates the typeface is inherited from the style hierarchy.
no_proof
Read/write tri-state value. When True, specifies that the contents of this run should not report any errors
when the document is scanned for spelling and grammar.
outline
Read/write tri-state value. When True causes the characters in the run to appear as if they have an outline,
by drawing a one pixel wide border around the inside and outside borders of each character glyph.
rtl
Read/write tri-state value. When True causes the text in the run to have right-to-left characteristics.
shadow
Read/write tri-state value. When True causes the text in the run to appear as if each character has a
shadow.
size
Read/write Length value or None, indicating the font height in English Metric Units (EMU). None
indicates the font size should be inherited from the style hierarchy. Length is a subclass of int having
properties for convenient conversion into points or other length units. The docx.shared.Pt class
allows convenient specification of point values:
>>> font.size = Pt(24)
>>> font.size
304800
>>> font.size.pt
24.0
small_caps
Read/write tri-state value. When True causes the lowercase characters in the run to appear as capital
letters two points smaller than the font size specified for the run.
snap_to_grid
Read/write tri-state value. When True causes the run to use the document grid characters per line settings
defined in the docGrid element when laying out the characters in this run.
spec_vanish
Read/write tri-state value. When True, specifies that the given run shall always behave as if it is hidden,
even when hidden text is being displayed in the current document. The property has a very narrow,
specialized use related to the table of contents. Consult the spec (17.3.2.36) for more details.
strike
Read/write tri-state value. When True causes the text in the run to appear with a single horizontal line
rows
_Rows instance containing the sequence of rows in this table.
style
Read/write. A _TableStyle object representing the style applied to this table. The default table style
for the document (often Normal Table) is returned if the table has no directly-applied style. Assigning
None to this property removes any directly-applied table style causing it to inherit the default table style
of the document. Note that the style name of a table style differs slightly from that displayed in the user
interface; a hyphen, if it appears, must be removed. For example, Light Shading - Accent 1 becomes Light
Shading Accent 1.
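The name adjustment described above amounts to dropping the hyphen and its surrounding spaces. A minimal sketch, using a hypothetical helper name that is not part of python-docx:

```python
def table_style_name(ui_name):
    """Convert a table style name as shown in the Word UI to the form
    used when assigning the style by name."""
    return ui_name.replace(" - ", " ")


print(table_style_name("Light Shading - Accent 1"))
```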
table_direction
A member of WD_TABLE_DIRECTION indicating the direction in which the table cells are ordered, e.g.
WD_TABLE_DIRECTION.LTR. None indicates the value is inherited from the style hierarchy.
cells
Sequence of _Cell instances corresponding to cells in this row.
table
Reference to the Table object this row belongs to.
class docx.shape.InlineShape(inline)
Proxy for an <wp:inline> element, representing the container for an inline graphical object.
height
Read/write. The display height of this inline shape as an Emu instance.
type
The type of this inline shape as a member of docx.enum.shape.WD_INLINE_SHAPE, e.g.
LINKED_PICTURE. Read-only.
width
Read/write. The display width of this inline shape as an Emu instance.
type
Read-only. A member of MSO_COLOR_TYPE, one of RGB, THEME, or AUTO, corresponding to the
way this color is defined. Its value is None if no color is applied at this level, which causes the effective
color to be inherited from the style hierarchy.
Length objects are constructed using a selection of convenience constructors, allowing values to be expressed in the
units most appropriate to the context.
class docx.shared.Length
Base class for length constructor classes Inches, Cm, Mm, Px, and Emu. Behaves as an int count of English
Metric Units, 914,400 to the inch, 36,000 to the mm. Provides convenience unit conversion methods in the form
of read-only properties. Immutable.
cm
The equivalent length expressed in centimeters (float).
emu
The equivalent length expressed in English Metric Units (int).
inches
The equivalent length expressed in inches (float).
mm
The equivalent length expressed in millimeters (float).
pt
Floating point length in points
twips
The equivalent length expressed in twips (int).
class docx.shared.Inches
Convenience constructor for length in inches, e.g. width = Inches(0.5).
class docx.shared.Cm
Convenience constructor for length in centimeters, e.g. height = Cm(12).
class docx.shared.Mm
Convenience constructor for length in millimeters, e.g. width = Mm(240.5).
class docx.shared.Pt
Convenience value class for specifying a length in points
class docx.shared.Twips
Convenience constructor for length in twips, e.g. width = Twips(42). A twip is a twentieth of a point,
635 EMU.
class docx.shared.Emu
Convenience constructor for length in English Metric Units, e.g. width = Emu(457200).
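The unit relationships behind these classes can be checked with a stdlib-only sketch. SketchLength and the from_* functions are hypothetical stand-ins for Length and its constructor classes; only the conversion factors (914,400 EMU to the inch, 36,000 to the mm, 635 to the twip) come from the text above, and 12,700 EMU to the point follows from 72 points to the inch.

```python
EMU_PER_INCH = 914400
EMU_PER_MM = 36000
EMU_PER_PT = 12700   # 914400 / 72
EMU_PER_TWIP = 635   # 12700 / 20


class SketchLength(int):
    """Hypothetical int subclass holding a length in EMU."""

    @property
    def inches(self):
        return self / EMU_PER_INCH

    @property
    def mm(self):
        return self / EMU_PER_MM

    @property
    def pt(self):
        return self / EMU_PER_PT

    @property
    def twips(self):
        return int(round(self / EMU_PER_TWIP))


def from_pt(points):
    return SketchLength(int(points * EMU_PER_PT))


def from_inches(inches):
    return SketchLength(int(inches * EMU_PER_INCH))


def from_twips(twips):
    return SketchLength(int(twips * EMU_PER_TWIP))


size = from_pt(24)
print(size, size.pt, from_inches(0.5).twips)
```

Because the value is an int count of EMU, lengths created in different units compare and add directly.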
classmethod from_string(rgb_hex_str)
Return a new instance from an RGB color hex string like 3C2F80.
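Parsing such a hex string is a matter of reading three two-digit hexadecimal pairs. The function below is a hypothetical stdlib-only sketch that returns a plain tuple rather than a library color object.

```python
def rgb_from_hex(rgb_hex_str):
    """Return an (r, g, b) tuple of ints parsed from a string like '3C2F80'."""
    if len(rgb_hex_str) != 6:
        raise ValueError("expected six hex digits, got %r" % rgb_hex_str)
    return tuple(int(rgb_hex_str[i:i + 2], 16) for i in (0, 2, 4))


print(rgb_from_hex("3C2F80"))
```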
3.9 Enumerations
Documentation for the various enumerations used for python-docx property settings can be found here:
3.9.1 MSO_COLOR_TYPE
Specifies the color specification scheme
Example:
from docx.enum.dml import MSO_COLOR_TYPE
assert font.color.type == MSO_COLOR_TYPE.THEME
3.9.2 MSO_THEME_COLOR_INDEX
Indicates the Office theme color, one of those shown in the color gallery on the formatting ribbon.
alias: MSO_THEME_COLOR
Example:
from docx.enum.dml import MSO_THEME_COLOR
font.color.theme_color = MSO_THEME_COLOR.ACCENT_1
3.9.3 WD_PARAGRAPH_ALIGNMENT
alias: WD_ALIGN_PARAGRAPH
Specifies paragraph justification type.
Example:
from docx.enum.text import WD_ALIGN_PARAGRAPH
paragraph = document.add_paragraph()
paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER
LEFT Left-aligned.
CENTER Center-aligned.
RIGHT Right-aligned.
JUSTIFY Fully justified.
DISTRIBUTE Paragraph characters are distributed to fill the entire width of the paragraph.
JUSTIFY_MED Justified with a medium character compression ratio.
JUSTIFY_HI Justified with a high character compression ratio.
JUSTIFY_LOW Justified with a low character compression ratio.
THAI_JUSTIFY Justified according to Thai formatting layout.
3.9.4 WD_BUILTIN_STYLE
alias: WD_STYLE
Specifies a built-in Microsoft Word style.
Example:
from docx import Document
from docx.enum.style import WD_STYLE
document = Document()
styles = document.styles
style = styles[WD_STYLE.BODY_TEXT]
HEADING_2 Heading 2.
HEADING_3 Heading 3.
HEADING_4 Heading 4.
HEADING_5 Heading 5.
HEADING_6 Heading 6.
HEADING_7 Heading 7.
HEADING_8 Heading 8.
HEADING_9 Heading 9.
HTML_ACRONYM HTML Acronym.
HTML_ADDRESS HTML Address.
HTML_CITE HTML Cite.
HTML_CODE HTML Code.
HTML_DFN HTML Definition.
HTML_KBD HTML Keyboard.
HTML_NORMAL Normal (Web).
HTML_PRE HTML Preformatted.
HTML_SAMP HTML Sample.
HTML_TT HTML Typewriter.
HTML_VAR HTML Variable.
HYPERLINK Hyperlink.
HYPERLINK_FOLLOWED Followed Hyperlink.
INDEX_1 Index 1.
INDEX_2 Index 2.
INDEX_3 Index 3.
INDEX_4 Index 4.
INDEX_5 Index 5.
INDEX_6 Index 6.
INDEX_7 Index 7.
INDEX_8 Index 8.
INDEX_9 Index 9.
INDEX_HEADING Index Heading.
INTENSE_EMPHASIS Intense Emphasis.
INTENSE_QUOTE Intense Quote.
INTENSE_REFERENCE Intense Reference.
LINE_NUMBER Line Number.
LIST List.
LIST_2 List 2.
LIST_3 List 3.
LIST_4 List 4.
LIST_5 List 5.
LIST_BULLET List Bullet.
LIST_BULLET_2 List Bullet 2.
LIST_BULLET_3 List Bullet 3.
LIST_BULLET_4 List Bullet 4.
LIST_BULLET_5 List Bullet 5.
LIST_CONTINUE List Continue.
LIST_CONTINUE_2 List Continue 2.
LIST_CONTINUE_3 List Continue 3.
LIST_CONTINUE_4 List Continue 4.
LIST_CONTINUE_5 List Continue 5.
LIST_NUMBER List Number.
LIST_NUMBER_2 List Number 2.
LIST_NUMBER_3 List Number 3.
LIST_NUMBER_4 List Number 4.
LIST_NUMBER_5 List Number 5.
LIST_PARAGRAPH List Paragraph.
MACRO_TEXT Macro Text.
MESSAGE_HEADER Message Header.
NAV_PANE Document Map.
NORMAL Normal.
NORMAL_INDENT Normal Indent.
NORMAL_OBJECT Normal (applied to an object).
NORMAL_TABLE Normal (applied within a table).
NOTE_HEADING Note Heading.
PAGE_NUMBER Page Number.
PLAIN_TEXT Plain Text.
QUOTE Quote.
SALUTATION Salutation.
SIGNATURE Signature.
STRONG Strong.
SUBTITLE Subtitle.
SUBTLE_EMPHASIS Subtle Emphasis.
3.9.5 WD_LINE_SPACING
Specifies a line spacing format to be applied to a paragraph.
Example:
from docx.enum.text import WD_LINE_SPACING
paragraph = document.add_paragraph()
paragraph.paragraph_format.line_spacing_rule = WD_LINE_SPACING.EXACTLY
3.9.6 WD_ORIENTATION
alias: WD_ORIENT
Specifies the page layout orientation.
Example:
from docx.enum.section import WD_ORIENT
section = document.sections[-1]
section.orientation = WD_ORIENT.LANDSCAPE
3.9.7 WD_SECTION_START
alias: WD_SECTION
Specifies the start type of a section break.
Example:
from docx.enum.section import WD_SECTION
section = document.sections[0]
section.start_type = WD_SECTION.NEW_PAGE
3.9.8 WD_STYLE_TYPE
Specifies one of the four style types: paragraph, character, list, or table.
Example:
from docx import Document
from docx.enum.style import WD_STYLE_TYPE
styles = Document().styles
assert styles['Normal'].type == WD_STYLE_TYPE.PARAGRAPH
3.9.9 WD_TABLE_ALIGNMENT
Specifies table justification type.
Example:
from docx.enum.table import WD_TABLE_ALIGNMENT
table = document.add_table(3, 3)
table.alignment = WD_TABLE_ALIGNMENT.CENTER
LEFT Left-aligned.
CENTER Center-aligned.
RIGHT Right-aligned.
3.9.10 WD_TABLE_DIRECTION
Specifies the direction in which an application orders cells in the specified table or row.
Example:
from docx.enum.table import WD_TABLE_DIRECTION
table = document.add_table(3, 3)
table.table_direction = WD_TABLE_DIRECTION.RTL
LTR The table or row is arranged with the first column in the leftmost position.
RTL The table or row is arranged with the first column in the rightmost position.
3.9.11 WD_UNDERLINE
Specifies the style of underline applied to a run of characters.
NONE No underline. This setting overrides any inherited underline value, so can be used to remove underline from
a run that inherits underlining from its containing paragraph. Note this is not the same as assigning None to
Run.underline. None is a valid assignment value, but causes the run to inherit its underline value. Assigning
WD_UNDERLINE.NONE causes underlining to be unconditionally turned off.
SINGLE A single line. Note that this setting is write-only in the sense that True (rather than
WD_UNDERLINE.SINGLE) is returned for a run having this setting.
WORDS Underline individual words only.
DOUBLE A double line.
DOTTED Dots.
THICK A single thick line.
DASH Dashes.
DOT_DASH Alternating dots and dashes.
DOT_DOT_DASH An alternating dot-dot-dash pattern.
WAVY A single wavy line.
DOTTED_HEAVY Heavy dots.
DASH_HEAVY Heavy dashes.
DOT_DASH_HEAVY Alternating heavy dots and heavy dashes.
DOT_DOT_DASH_HEAVY An alternating heavy dot-dot-dash pattern.
WAVY_HEAVY A heavy wavy line.
DASH_LONG Long dashes.
WAVY_DOUBLE A double wavy line.
DASH_LONG_HEAVY Long heavy dashes.
CHAPTER 4
Contributor Guide
4.1 Analysis
Documentation of studies undertaken in support of API and code design.
WordprocessingML supports a variety of paragraph formatting attributes to control layout characteristics such as
justification, indentation, line spacing, space before and after, and widow/orphan control.
Alignment (justification)
In Word, each paragraph has an alignment attribute that specifies how to justify the lines
of the paragraph when the paragraph is laid out on the page. Common values are left, right, centered, and justified.
Protocol Getting and setting paragraph alignment:
>>> paragraph = body.add_paragraph()
>>> paragraph.alignment
None
>>> paragraph.alignment = WD_ALIGN_PARAGRAPH.RIGHT
>>> paragraph.alignment
RIGHT (2)
>>> paragraph.alignment = None
>>> paragraph.alignment
None
XML Semantics
If the <w:jc> element is not present on a paragraph, the alignment value for that paragraph is
inherited from its style hierarchy. If the element is present, its value overrides any inherited value. From the API,
a value of None on the Paragraph.alignment property corresponds to no <w:jc> element being present. If
None is assigned to Paragraph.alignment, the <w:jc> element is removed.
Paragraph spacing
Spacing between subsequent paragraphs is controlled by the paragraph spacing attributes. Spacing
can be applied either before the paragraph, after it, or both. The concept is similar to that of padding or margin
in CSS. WordprocessingML supports paragraph spacing specified as either a length value or as a multiple of the line
height; however, only a length value is supported via the Word UI. Inter-paragraph spacing overlaps, such that the
rendered spacing between two paragraphs is the maximum of the space after the first paragraph and the space before
the second.
Protocol Getting and setting paragraph spacing:
>>> paragraph_format = document.styles['Normal'].paragraph_format
>>> paragraph_format.space_before
None
>>> paragraph_format.space_before = Pt(12)
>>> paragraph_format.space_before.pt
12.0
XML Semantics
Paragraph spacing is specified using the w:pPr/w:spacing element, which also controls line spacing. Spacing is
specified in twips.
If the w:spacing element is not present, paragraph spacing is inherited from the style hierarchy.
If not present in the style hierarchy, the paragraph will have no spacing.
If the w:spacing element is present but the specific attribute (e.g. w:before) is not, its value is inherited.
Specimen XML 12 pt space before, 0 after:
<w:pPr>
<w:spacing w:before="240" w:after="0"/>
</w:pPr>
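The twips-to-points conversion and the overlap rule for paragraph spacing can be sketched with the standard library. Both functions are hypothetical helpers; treating a missing (inherited) value as 0 in rendered_gap_pt is a simplifying assumption, since style inheritance is not resolved here.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
TWIPS_PER_PT = 20

PPR_XML = (
    '<w:pPr xmlns:w="%s">'
    '<w:spacing w:before="240" w:after="0"/>'
    "</w:pPr>" % W
)


def spacing_pt(pPr_xml, attr):
    """Return w:spacing/@w:<attr> in points, or None if it is absent."""
    spacing = ET.fromstring(pPr_xml).find("{%s}spacing" % W)
    if spacing is None:
        return None
    twips = spacing.get("{%s}%s" % (W, attr))
    return None if twips is None else int(twips) / TWIPS_PER_PT


def rendered_gap_pt(space_after_first, space_before_second):
    """Spacing overlaps: the gap is the larger of the two values.

    Assumption: None (inherited) is treated as 0 for this sketch.
    """
    return max(space_after_first or 0, space_before_second or 0)


before = spacing_pt(PPR_XML, "before")
after = spacing_pt(PPR_XML, "after")
print(before, after, rendered_gap_pt(6.0, before))
```

A paragraph with 6 pt after followed by one with 12 pt before is therefore separated by 12 pt, not 18 pt.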
Line spacing
Line spacing can be specified either as a specific length or as a multiple of the line height (font
size). Line spacing is specified by the combination of values in w:spacing/@w:line and w:spacing/@w:lineRule. The
ParagraphFormat.line_spacing property determines which method to use based on whether the assigned
value is an instance of Length.
Protocol Getting and setting line spacing:
>>> paragraph_format.line_spacing, paragraph_format.line_spacing_rule
(None, None)
>>> paragraph_format.line_spacing = Pt(18)
>>> paragraph_format.line_spacing, paragraph_format.line_spacing_rule
(228600, WD_LINE_SPACING.EXACTLY (4))
>>> paragraph_format.line_spacing = 1
>>> paragraph_format.line_spacing, paragraph_format.line_spacing_rule
(152400, WD_LINE_SPACING.SINGLE (0))
>>> paragraph_format.line_spacing = 0.9
>>> paragraph_format.line_spacing, paragraph_format.line_spacing_rule
(137160, WD_LINE_SPACING.MULTIPLE (5))
XML Semantics
Line spacing is specified by the combination of the values in w:spacing/@w:line and w:spacing/@w:lineRule.
w:spacing/@w:line is specified in twips. If @w:lineRule is auto (or missing), @w:line is interpreted as 240ths
of a line. For all other values of @w:lineRule, the value of @w:line is interpreted as a specific length in twips.
If the w:spacing element is not present, line spacing is inherited.
If @w:line is not present, line spacing is inherited.
If not present, @w:lineRule defaults to auto.
If not present in the style hierarchy, line spacing defaults to single spaced.
The atLeast value for @w:lineRule indicates the line spacing will be @w:line twips or single spaced, whichever
is greater.
Specimen XML 14 points:
<w:pPr>
<w:spacing w:line="280" w:lineRule="exact"/>
</w:pPr>
double-spaced:
<w:pPr>
<w:spacing w:line="480" w:lineRule="auto"/>
</w:pPr>
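The interpretation rules for @w:line and @w:lineRule can be expressed as a small stdlib-only function. This is a sketch of the semantics listed above, using a hypothetical helper name; it does not resolve inheritance from the style hierarchy.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def line_spacing(spacing_xml):
    """Interpret a <w:spacing> element.

    Returns None when w:line is absent (inherited), ('multiple', lines)
    when lineRule is auto or missing, else (lineRule, points).
    """
    spacing = ET.fromstring(spacing_xml)
    line = spacing.get("{%s}line" % W)
    if line is None:
        return None
    rule = spacing.get("{%s}lineRule" % W, "auto")
    if rule == "auto":
        return ("multiple", int(line) / 240)  # 240ths of a line
    return (rule, int(line) / 20)  # twips to points


double = line_spacing('<w:spacing xmlns:w="%s" w:line="480"/>' % W)
exact = line_spacing(
    '<w:spacing xmlns:w="%s" w:line="280" w:lineRule="exact"/>' % W
)
print(double, exact)
```

The same numeric w:line value therefore means very different things depending on the rule: 480 is two lines under auto but 24 points under exact.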
Indentation
Paragraph indentation is specified using the w:pPr/w:ind element. Left, right, first line, and hanging
indent can be specified. Indentation can be specified as a length or in hundredths of a character width.
Only length is supported by python-docx. Both first line indent and hanging indent are specified using the
ParagraphFormat.first_line_indent property. Assigning a positive value produces an indented first line.
A negative value produces a hanging indent.
Protocol Getting and setting indentation:
>>> paragraph_format.left_indent
None
>>> paragraph_format.right_indent
None
>>> paragraph_format.first_line_indent
None
>>> paragraph_format.left_indent = Pt(36)
>>> paragraph_format.left_indent.pt
36.0
>>> paragraph_format.right_indent = Inches(0.25)
>>> paragraph_format.right_indent.pt
18.0
>>> paragraph_format.first_line_indent = Pt(-18)
>>> paragraph_format.first_line_indent.pt
-18.0
XML Semantics
Indentation is specified by w:ind/@w:start, w:ind/@w:end, w:ind/@w:firstLine, and w:ind/@w:hanging.
w:firstLine and w:hanging are mutually exclusive; if both are specified, w:firstLine is ignored.
All four attributes are specified in twips.
w:start controls left indent for a left-to-right paragraph or right indent for a right-to-left paragraph. w:end
controls the other side. If mirrorIndents is specified, w:start controls the inside margin and w:end the outside.
Negative values are permitted and cause the text to move past the text margin.
If w:ind is not present, indentation is inherited.
Any omitted attributes are inherited.
If not present in the style hierarchy, indentation values default to zero.
Specimen XML 1 inch left, 0.5 inch (additional) first line, 0.5 inch right:
<w:pPr>
<w:ind w:start="1440" w:end="720" w:firstLine="720"/>
</w:pPr>
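How the four w:ind attributes map onto three indentation values can be sketched as follows. The helper is hypothetical and assumes a left-to-right paragraph without mirrorIndents, so that w:start is the left indent. It folds w:firstLine and w:hanging into one signed first-line value, with w:hanging taking precedence as stated above.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
TWIPS_PER_INCH = 1440


def indents_in_inches(ind_xml):
    """Return left, right and first-line indent in inches (None = inherited)."""
    ind = ET.fromstring(ind_xml)

    def twips(name):
        value = ind.get("{%s}%s" % (W, name))
        return None if value is None else int(value)

    hanging, first_line = twips("hanging"), twips("firstLine")
    if hanging is not None:
        # w:hanging wins over w:firstLine and is a negative first-line indent.
        first_line = -hanging
    values = {
        "left": twips("start"),   # assumes a left-to-right paragraph
        "right": twips("end"),
        "first_line": first_line,
    }
    return {
        key: None if value is None else value / TWIPS_PER_INCH
        for key, value in values.items()
    }


specimen = indents_in_inches(
    '<w:ind xmlns:w="%s" w:start="1440" w:end="720" w:firstLine="720"/>' % W
)
print(specimen)
```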
Page placement
There are a handful of page placement properties that control such things as keeping the lines of
a paragraph together on the same page, keeping a paragraph (such as a heading) on the same page as the subsequent
paragraph, and placing the paragraph at the top of a new page. Each of these is a tri-state boolean property where
None indicates inherit.
Protocol Getting and setting page placement properties:
>>> paragraph_format.keep_with_next
None
>>> paragraph_format.keep_together
None
>>> paragraph_format.page_break_before
None
>>> paragraph_format.widow_control
None
>>> paragraph_format.keep_with_next = True
>>> paragraph_format.keep_with_next
True
>>> paragraph_format.keep_together = False
>>> paragraph_format.keep_together
False
>>> paragraph_format.page_break_before = True
>>> paragraph_format.widow_control = None
XML Semantics
All four elements have On/Off semantics.
If not present, their value is inherited.
If not present in the style hierarchy, values default to False.
Specimen XML keep with next, keep together, no page break before, and widow/orphan control:
<w:pPr>
<w:keepNext/>
<w:keepLines/>
<w:pageBreakBefore w:val="0"/>
<w:widowControl/>
</w:pPr>
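The On/Off semantics can be sketched as a three-way read: element absent gives None, element present without w:val gives True, and otherwise w:val decides. The helper name is hypothetical, and the set of literals treated as true ("1", "true", "on") is an assumption not stated in the text above.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

PPR_XML = (
    '<w:pPr xmlns:w="%s">'
    "<w:keepNext/>"
    "<w:keepLines/>"
    '<w:pageBreakBefore w:val="0"/>'
    "<w:widowControl/>"
    "</w:pPr>" % W
)


def on_off(pPr, tag):
    """Return True, False, or None (inherited) for an on/off child of pPr."""
    elm = pPr.find("{%s}%s" % (W, tag))
    if elm is None:
        return None  # not present: value is inherited
    val = elm.get("{%s}val" % W)
    if val is None:
        return True  # present without w:val means on
    # Assumption: these literals mean on; anything else means off.
    return val in ("1", "true", "on")


pPr = ET.fromstring(PPR_XML)
flags = {
    tag: on_off(pPr, tag)
    for tag in ("keepNext", "keepLines", "pageBreakBefore", "widowControl")
}
print(flags)
```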
Enumerations
WD_LINE_SPACING
WD_PARAGRAPH_ALIGNMENT
Specimen XML A paragraph with inherited alignment:
<w:p>
<w:r>
<w:t>Inherited paragraph alignment.</w:t>
</w:r>
</w:p>
A right-aligned paragraph:
<w:p>
<w:pPr>
<w:jc w:val="right"/>
</w:pPr>
<w:r>
<w:t>Right-aligned paragraph.</w:t>
</w:r>
</w:p>
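The get, set, and remove behavior of the alignment property can be sketched against raw XML. The functions below are hypothetical and use only the standard library; qn here is a local helper, and the sketch appends <w:jc> at the end of <w:pPr> without enforcing the child order required by the schema.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(name):
    """Local helper: return the namespace-qualified tag for a w: name."""
    return "{%s}%s" % (W, name)


def get_alignment(p):
    """Return w:jc/@w:val, or None when no <w:jc> is present."""
    jc = p.find("%s/%s" % (qn("pPr"), qn("jc")))
    return None if jc is None else jc.get(qn("val"))


def set_alignment(p, val):
    """Set w:jc/@w:val; assigning None removes the <w:jc> element."""
    pPr = p.find(qn("pPr"))
    if val is None:
        jc = None if pPr is None else pPr.find(qn("jc"))
        if jc is not None:
            pPr.remove(jc)
        return
    if pPr is None:
        pPr = ET.Element(qn("pPr"))
        p.insert(0, pPr)  # pPr is the first child of the paragraph
    jc = pPr.find(qn("jc"))
    if jc is None:
        # Simplification: schema ordering of pPr children is not enforced.
        jc = ET.SubElement(pPr, qn("jc"))
    jc.set(qn("val"), val)


p = ET.fromstring(
    '<w:p xmlns:w="%s"><w:r><w:t>Some text.</w:t></w:r></w:p>' % W
)
initial = get_alignment(p)
set_alignment(p, "right")
after_set = get_alignment(p)
set_alignment(p, None)
after_clear = get_alignment(p)
print(initial, after_set, after_clear)
```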
Schema excerpt
<xsd:complexType name="CT_PPr">
<xsd:sequence>
<xsd:element name="pStyle" type="CT_String" minOccurs="0"/>
<xsd:element name="keepNext" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="keepLines" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="pageBreakBefore" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="framePr" type="CT_FramePr" minOccurs="0"/>
<xsd:element name="widowControl" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="numPr" type="CT_NumPr" minOccurs="0"/>
<xsd:element name="suppressLineNumbers" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="pBdr" type="CT_PBdr" minOccurs="0"/>
<xsd:element name="shd" type="CT_Shd" minOccurs="0"/>
<xsd:element name="tabs" type="CT_Tabs" minOccurs="0"/>
<xsd:element name="suppressAutoHyphens" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="kinsoku" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="wordWrap" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="overflowPunct" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="topLinePunct" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="autoSpaceDE" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="autoSpaceDN" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="bidi" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="adjustRightInd" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="snapToGrid" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="spacing" type="CT_Spacing" minOccurs="0"/>
<xsd:element name="ind" type="CT_Ind" minOccurs="0"/>
<xsd:element name="contextualSpacing" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="mirrorIndents" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="suppressOverlap" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="jc" type="CT_Jc" minOccurs="0"/>
<xsd:element name="textDirection" type="CT_TextDirection" minOccurs="0"/>
<xsd:element name="textAlignment" type="CT_TextAlignment" minOccurs="0"/>
<xsd:element name="textboxTightWrap" type="CT_TextboxTightWrap" minOccurs="0"/>
<xsd:element name="outlineLvl" type="CT_DecimalNumber" minOccurs="0"/>
<xsd:element name="divId" type="CT_DecimalNumber" minOccurs="0"/>
<xsd:element name="cnfStyle" type="CT_Cnf" minOccurs="0"/>
<xsd:element name="rPr" type="CT_ParaRPr" minOccurs="0"/>
<xsd:element name="sectPr" type="CT_SectPr" minOccurs="0"/>
<xsd:element name="pPrChange" type="CT_PPrChange" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_FramePr">
<xsd:attribute name="dropCap" type="ST_DropCap"/>
<xsd:attribute name="lines" type="ST_DecimalNumber"/>
<xsd:attribute name="w" type="s:ST_TwipsMeasure"/>
<xsd:attribute name="h" type="s:ST_TwipsMeasure"/>
<xsd:attribute name="vSpace" type="s:ST_TwipsMeasure"/>
<xsd:attribute name="hSpace" type="s:ST_TwipsMeasure"/>
<xsd:attribute name="wrap" type="ST_Wrap"/>
<xsd:attribute name="hAnchor" type="ST_HAnchor"/>
<xsd:attribute name="vAnchor" type="ST_VAnchor"/>
<xsd:attribute name="x" type="ST_SignedTwipsMeasure"/>
<xsd:attribute name="xAlign" type="s:ST_XAlign"/>
<xsd:attribute name="y" type="ST_SignedTwipsMeasure"/>
<xsd:attribute name="yAlign" type="s:ST_YAlign"/>
<xsd:attribute name="hRule" type="ST_HeightRule"/>
<xsd:attribute name="anchorLock" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_Ind">
<xsd:attribute name="start" type="ST_SignedTwipsMeasure"/>
<xsd:attribute name="startChars" type="ST_DecimalNumber"/>
<xsd:attribute name="end" type="ST_SignedTwipsMeasure"/>
<xsd:attribute name="endChars" type="ST_DecimalNumber"/>
<xsd:attribute name="left" type="ST_SignedTwipsMeasure"/>
<xsd:attribute name="leftChars" type="ST_DecimalNumber"/>
<xsd:attribute name="right" type="ST_SignedTwipsMeasure"/>
<xsd:attribute name="rightChars" type="ST_DecimalNumber"/>
<xsd:attribute name="hanging" type="s:ST_TwipsMeasure"/>
<xsd:attribute name="hangingChars" type="ST_DecimalNumber"/>
<xsd:attribute name="firstLine" type="s:ST_TwipsMeasure"/>
<xsd:attribute name="firstLineChars" type="ST_DecimalNumber"/>
</xsd:complexType>
<xsd:complexType name="CT_Jc">
<xsd:attribute name="val" type="ST_Jc" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_OnOff">
<xsd:attribute name="val" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_Spacing">
<xsd:attribute name="before" type="s:ST_TwipsMeasure"/>
<xsd:attribute name="beforeLines" type="ST_DecimalNumber"/>
<xsd:attribute name="beforeAutospacing" type="s:ST_OnOff"/>
<xsd:attribute name="after" type="s:ST_TwipsMeasure"/>
<xsd:attribute name="afterLines" type="ST_DecimalNumber"/>
<xsd:attribute name="afterAutospacing" type="s:ST_OnOff"/>
<xsd:attribute name="line" type="ST_SignedTwipsMeasure"/>
<xsd:attribute name="lineRule" type="ST_LineSpacingRule"/>
</xsd:complexType>
<xsd:complexType name="CT_String">
<xsd:attribute name="val" type="s:ST_String" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_Tabs">
<xsd:sequence>
<xsd:element name="tab" type="CT_TabStop" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
<!-- simple types -->
<xsd:simpleType name="ST_Jc">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="start"/>
<xsd:enumeration value="center"/>
<xsd:enumeration value="end"/>
<xsd:enumeration value="both"/>
<xsd:enumeration value="mediumKashida"/>
<xsd:enumeration value="distribute"/>
<xsd:enumeration value="numTab"/>
<xsd:enumeration value="highKashida"/>
<xsd:enumeration value="lowKashida"/>
<xsd:enumeration value="thaiDistribute"/>
<xsd:enumeration value="left"/>
<xsd:enumeration value="right"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_LineSpacingRule">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="auto"/> <!-- default -->
<xsd:enumeration value="exact"/>
<xsd:enumeration value="atLeast"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_OnOff">
<xsd:union memberTypes="xsd:boolean ST_OnOff1"/>
</xsd:simpleType>
<xsd:simpleType name="ST_OnOff1">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="on"/>
<xsd:enumeration value="off"/>
</xsd:restriction>
</xsd:simpleType>
Font
Word supports a rich variety of character formatting. Character formatting can be applied at various levels in the style
hierarchy. At the lowest level, it can be applied directly to a run of text content. Above that, it can be applied to
character, paragraph and table styles. It can also be applied to an abstract numbering definition. At the highest levels
it can be applied via a theme or document defaults.
Typeface name
Word allows multiple typefaces to be specified for character content in a single run. This allows different Unicode character ranges such as ASCII and Arabic to be used in a single run, each being rendered in the typeface specified for that range.
Up to eight distinct typefaces may be specified for a font. Four are used to specify a typeface for a distinct code point
range. These are:
w:ascii - used for the first 128 Unicode code points
w:cs - used for complex script code points
w:eastAsia - used for East Asian code points
w:hAnsi - standing for high ANSI, but effectively the catch-all for any code points not specified by one of the
other three.
The other four, w:asciiTheme, w:cstheme, w:eastAsiaTheme, and w:hAnsiTheme, are used to indirectly specify a
theme-defined font. This allows the typeface to be set centrally in the document. These four attributes have lower
precedence than the first four, so for example the value of w:asciiTheme is ignored if a w:ascii attribute is also present.
The typeface name used for a run is specified in the w:rPr/w:rFonts element. There are 8 attributes that in combination
specify the typeface to be used.
Protocol
Initially, only the base typeface name is supported by the API, using the name property. Its value is that of the w:rFonts/@w:ascii attribute, or None if that attribute is not present. Assignment to this property sets both the w:ascii and the w:hAnsi attribute to the assigned string, or removes them both if None is assigned:
>>> font = document.styles['Normal'].font
>>> font.name
None
>>> font.name = 'Arial'
>>> font.name
'Arial'
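The XML effect of this protocol can be sketched with the standard library alone. This is only an illustration of the behavior described above, not the python-docx implementation; the helper names w(), get_font_name() and set_font_name() are hypothetical.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'


def w(name):
    """Return `name` in the w: namespace, in ElementTree's {uri}name form."""
    return '{%s}%s' % (W, name)


def get_font_name(rPr):
    """Return the w:rFonts/@w:ascii value, or None if it is not present."""
    rFonts = rPr.find(w('rFonts'))
    return None if rFonts is None else rFonts.get(w('ascii'))


def set_font_name(rPr, name):
    """Set w:ascii and w:hAnsi to `name`, or remove both when `name` is None."""
    rFonts = rPr.find(w('rFonts'))
    if name is None:
        if rFonts is not None:
            rFonts.attrib.pop(w('ascii'), None)
            rFonts.attrib.pop(w('hAnsi'), None)
        return
    if rFonts is None:
        rFonts = ET.SubElement(rPr, w('rFonts'))
    rFonts.set(w('ascii'), name)
    rFonts.set(w('hAnsi'), name)


rPr = ET.Element(w('rPr'))
print(get_font_name(rPr))  # None
set_font_name(rPr, 'Arial')
print(get_font_name(rPr))  # Arial
```

The theme attributes are deliberately left untouched in this sketch, since the protocol above only mentions w:ascii and w:hAnsi.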
Boolean run properties
These are character formatting properties that are either on or off, such as bold, italic, and small caps. Certain of these properties are toggle properties that may cancel each other out if they appear more than once in the style hierarchy. See 17.7.3 for more details on toggle properties. They don't affect the API specified here.
spec        name
17.3.2.1    Bold
17.3.2.2    Complex Script Bold
17.3.2.5    Display All Characters as Capital Letters
17.3.2.7    Use Complex Script Formatting on Run
17.3.2.9    Double Strikethrough
17.3.2.13   Embossing
17.3.2.16   Italics
17.3.2.17   Complex Script Italics
17.3.2.18   Imprinting
17.3.2.21   Do Not Check Spelling or Grammar
17.3.2.22   Office Open XML Math
17.3.2.23   Display Character Outline
17.3.2.30   Right To Left Text
17.3.2.31   Shadow
17.3.2.33   Small Caps
17.3.2.34   Use Document Grid Settings For Inter-Character Spacing
17.3.2.36   Paragraph Mark is Always Hidden
17.3.2.37   Single Strikethrough
17.3.2.41   Hidden Text
17.3.2.44   Web Hidden Text
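Protocol
At the API level, each of the boolean run properties is a read/write tri-state property, having the possible values True, False, and None. None indicates the property is not specified directly on the run, so its effective value is inherited from the style hierarchy.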
The following interactive session demonstrates the protocol for querying and applying run-level properties:
>>> run = p.add_run()
>>> run.bold
None
>>> run.bold = True
>>> run.bold
True
>>> run.bold = False
>>> run.bold
False
>>> run.bold = None
>>> run.bold
None
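At the XML level, the three states can be read off a CT_OnOff child element such as w:b. The following is a minimal sketch using only the standard library; the helper name read_on_off() is hypothetical and the sketch is not the python-docx implementation.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
NS = {'w': W}


def read_on_off(rPr, tag):
    """Return True, False, or None for a CT_OnOff child such as w:b."""
    elm = rPr.find('w:' + tag, NS)
    if elm is None:
        return None  # element absent: nothing applied directly
    val = elm.get('{%s}val' % W)
    if val is None:
        return True  # <w:b/> with no w:val attribute means "on"
    # ST_OnOff accepts the xsd:boolean literals plus "on" and "off"
    return val in ('1', 'true', 'on')


rPr = ET.fromstring(
    '<w:rPr xmlns:w="%s"><w:b/><w:i w:val="off"/></w:rPr>' % W
)
print(read_on_off(rPr, 'b'))          # True
print(read_on_off(rPr, 'i'))          # False
print(read_on_off(rPr, 'smallCaps'))  # None
```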
This behavior allows these properties to be overridden (turned off) in inheriting styles. For example, consider a character style emphasized that sets bold on. Another style, strong, inherits from emphasized but should display in italic rather than bold. Setting bold off in strong has no effect because it is overridden by the bold in emphasized (I think). Because bold is a toggle property, however, setting bold on in strong causes the inherited value to be toggled to False, achieving the desired effect. See 17.7.3 for more details on toggle properties.
The following run properties are toggle properties:
element       spec        name
<b/>          17.3.2.1    Bold
<bCs/>        17.3.2.2    Complex Script Bold
<caps/>       17.3.2.5    Display All Characters as Capital Letters
<emboss/>     17.3.2.13   Embossing
<i/>          17.3.2.16   Italics
<iCs/>        17.3.2.17   Complex Script Italics
<imprint/>    17.3.2.18   Imprinting
<outline/>    17.3.2.23   Display Character Outline
<shadow/>     17.3.2.31   Shadow
<smallCaps/>  17.3.2.33   Small Caps
<strike/>     17.3.2.37   Single Strikethrough
<vanish/>     17.3.2.41   Hidden Text
Specimen XML
<w:r>
<w:rPr>
<w:b/>
<w:i/>
<w:smallCaps/>
<w:strike/>
<w:sz w:val="28"/>
<w:szCs w:val="28"/>
<w:u w:val="single"/>
</w:rPr>
<w:t>bold, italic, small caps, strike, 14 pt, and underline</w:t>
</w:r>
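The specimen pairs w:sz w:val="28" with text described as 14 pt because w:sz is expressed in half-points. A minimal sketch of that conversion, using only the standard library, follows; the helper name size_in_points() is hypothetical and only the bare-integer form of the value is handled.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'

specimen = (
    '<w:r xmlns:w="%s"><w:rPr>'
    '<w:b/><w:sz w:val="28"/><w:szCs w:val="28"/>'
    '</w:rPr><w:t>14 pt text</w:t></w:r>' % W
)


def size_in_points(r):
    """Return the run's font size in points, or None if w:sz is absent."""
    sz = r.find('w:rPr/w:sz', {'w': W})
    if sz is None:
        return None
    # w:val counts half-points; values with a unit suffix are not handled here
    return int(sz.get('{%s}val' % W)) / 2


print(size_in_points(ET.fromstring(specimen)))  # 14.0
```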
Schema excerpt
It appears the run properties may appear in any order and may appear multiple times each. It is not clear what the semantics of that would be or why one would want to do it, but it is something to note. Word seems to place them in the order below when it writes the file:
<xsd:complexType name="CT_RPr"> <!-- denormalized -->
<xsd:sequence>
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element name="rStyle" type="CT_String"/>
<xsd:element name="rFonts" type="CT_Fonts"/>
<xsd:element name="b" type="CT_OnOff"/>
<xsd:element name="bCs" type="CT_OnOff"/>
<xsd:element name="i" type="CT_OnOff"/>
<xsd:element name="iCs" type="CT_OnOff"/>
<xsd:element name="caps" type="CT_OnOff"/>
<xsd:element name="smallCaps" type="CT_OnOff"/>
<xsd:element name="strike" type="CT_OnOff"/>
<xsd:element name="dstrike" type="CT_OnOff"/>
<xsd:element name="outline" type="CT_OnOff"/>
<xsd:element name="shadow" type="CT_OnOff"/>
<xsd:element name="emboss" type="CT_OnOff"/>
<xsd:element name="imprint" type="CT_OnOff"/>
<xsd:element name="noProof" type="CT_OnOff"/>
<xsd:element name="snapToGrid" type="CT_OnOff"/>
<xsd:element name="vanish" type="CT_OnOff"/>
<xsd:element name="webHidden" type="CT_OnOff"/>
<xsd:element name="color" type="CT_Color"/>
<xsd:element name="spacing" type="CT_SignedTwipsMeasure"/>
<xsd:element name="w" type="CT_TextScale"/>
<xsd:element name="kern" type="CT_HpsMeasure"/>
<xsd:element name="position" type="CT_SignedHpsMeasure"/>
<xsd:element name="sz" type="CT_HpsMeasure"/>
<xsd:element name="szCs" type="CT_HpsMeasure"/>
<xsd:element name="highlight" type="CT_Highlight"/>
<xsd:element name="u" type="CT_Underline"/>
<xsd:element name="effect" type="CT_TextEffect"/>
<xsd:element name="bdr" type="CT_Border"/>
<xsd:element name="shd" type="CT_Shd"/>
<xsd:element name="fitText" type="CT_FitText"/>
<xsd:element name="vertAlign" type="CT_VerticalAlignRun"/>
<xsd:element name="rtl" type="CT_OnOff"/>
<xsd:element name="cs" type="CT_OnOff"/>
<xsd:element name="em" type="CT_Em"/>
<xsd:element name="lang" type="CT_Language"/>
<xsd:element name="eastAsianLayout" type="CT_EastAsianLayout"/>
<xsd:element name="specVanish" type="CT_OnOff"/>
<xsd:element name="oMath" type="CT_OnOff"/>
</xsd:choice>
<xsd:element name="rPrChange" type="CT_RPrChange" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_Fonts">
<xsd:attribute name="hint" type="ST_Hint"/>
<xsd:attribute name="ascii" type="s:ST_String"/>
<xsd:attribute name="hAnsi" type="s:ST_String"/>
<xsd:attribute name="eastAsia" type="s:ST_String"/>
<xsd:attribute name="cs" type="s:ST_String"/>
<xsd:attribute name="asciiTheme" type="ST_Theme"/>
<xsd:attribute name="hAnsiTheme" type="ST_Theme"/>
<xsd:attribute name="eastAsiaTheme" type="ST_Theme"/>
<xsd:attribute name="cstheme" type="ST_Theme"/>
</xsd:complexType>
<xsd:complexType name="CT_HpsMeasure">
<xsd:attribute name="val" type="ST_HpsMeasure" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_OnOff">
<xsd:attribute name="val" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_SignedHpsMeasure">
<xsd:attribute name="val" type="ST_SignedHpsMeasure" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_String">
<xsd:attribute name="val" type="s:ST_String" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_Underline">
<xsd:attribute name="val" type="ST_Underline"/>
<xsd:attribute name="color" type="ST_HexColor"/>
<xsd:attribute name="themeColor" type="ST_ThemeColor"/>
<xsd:attribute name="themeTint" type="ST_UcharHexNumber"/>
<xsd:attribute name="themeShade" type="ST_UcharHexNumber"/>
</xsd:complexType>
<xsd:complexType name="CT_VerticalAlignRun">
<xsd:attribute name="val" type="s:ST_VerticalAlignRun" use="required"/>
</xsd:complexType>
<!-- simple types -->
<xsd:simpleType name="ST_Hint">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="default"/>
<xsd:enumeration value="eastAsia"/>
<xsd:enumeration value="cs"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_HpsMeasure">
<xsd:union memberTypes="s:ST_UnsignedDecimalNumber
s:ST_PositiveUniversalMeasure"/>
</xsd:simpleType>
<xsd:simpleType name="ST_OnOff">
<xsd:union memberTypes="xsd:boolean ST_OnOff1"/>
</xsd:simpleType>
<xsd:simpleType name="ST_OnOff1">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="on"/>
<xsd:enumeration value="off"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_PositiveUniversalMeasure">
<xsd:restriction base="ST_UniversalMeasure">
<xsd:pattern value="[0-9]+(\.[0-9]+)?(mm|cm|in|pt|pc|pi)"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_SignedHpsMeasure">
<xsd:union memberTypes="xsd:integer s:ST_UniversalMeasure"/>
</xsd:simpleType>
<xsd:simpleType name="ST_Theme">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="majorEastAsia"/>
<xsd:enumeration value="majorBidi"/>
<xsd:enumeration value="majorAscii"/>
<xsd:enumeration value="majorHAnsi"/>
<xsd:enumeration value="minorEastAsia"/>
<xsd:enumeration value="minorBidi"/>
<xsd:enumeration value="minorAscii"/>
<xsd:enumeration value="minorHAnsi"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_Underline">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="single"/>
<xsd:enumeration value="words"/>
<xsd:enumeration value="double"/>
<xsd:enumeration value="thick"/>
<xsd:enumeration value="dotted"/>
<xsd:enumeration value="dottedHeavy"/>
<xsd:enumeration value="dash"/>
<xsd:enumeration value="dashedHeavy"/>
<xsd:enumeration value="dashLong"/>
<xsd:enumeration value="dashLongHeavy"/>
<xsd:enumeration value="dotDash"/>
<xsd:enumeration value="dashDotHeavy"/>
<xsd:enumeration value="dotDotDash"/>
<xsd:enumeration value="dashDotDotHeavy"/>
<xsd:enumeration value="wave"/>
<xsd:enumeration value="wavyHeavy"/>
<xsd:enumeration value="wavyDouble"/>
<xsd:enumeration value="none"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_UnsignedDecimalNumber">
<xsd:restriction base="xsd:unsignedLong"/>
</xsd:simpleType>
<xsd:simpleType name="ST_VerticalAlignRun">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="baseline"/>
<xsd:enumeration value="superscript"/>
<xsd:enumeration value="subscript"/>
</xsd:restriction>
</xsd:simpleType>
Font Color
Color, as a topic, extends beyond the Font object; font color is just the first place it has come up. Accordingly, it bears a little deeper thought than usual, since we'll want to reuse the same objects and protocol to specify color in the other contexts; it makes sense to craft a general solution that will bear the expected reuse.
There are three historical sources to draw from for this API.
1. The w:rPr/w:color element. This is used by default when applying color directly to text or when setting the
text color of a style. This corresponds to the Font.Color property (undocumented, unfortunately). This element
supports RGB colors, theme colors, and a tint or shade of a theme color.
2. The w:rPr/w14:textFill element. This is used by Word for fancy text like gradient and shadow effects. This
corresponds to the Font.Fill property.
3. The PowerPoint font color UI. This seems like a reasonable compromise between the prior two, allowing fairly direct access to common color options while holding the door open for the Font.Fill operations to be added later if required.
docx.dml.color.ColorFormat has a read-only type property and read/write rgb, theme_color, and brightness properties.
ColorFormat.type returns one of MSO_COLOR_TYPE.RGB, MSO_COLOR_TYPE.THEME, MSO_COLOR_TYPE.AUTO, or None, the latter indicating the font has no directly-applied color:
>>> font.color.type
None
ColorFormat.rgb returns an RGBColor object when type is MSO_COLOR_TYPE.RGB. It may also report an
RGBColor value when type is MSO_COLOR_TYPE.THEME, since an RGB color may also be present in that case.
According to the spec, the RGB color value is ignored when a theme color is specified, but Word writes the current
RGB value of the theme color along with the theme color name (e.g. accent1) when assigning a theme color; perhaps
as a convenient value for a file browser to use. The value of .type must be consulted to determine whether the RGB
value is operative or a best-guess:
>>> font.color.type
RGB (1)
>>> font.color.rgb
RGBColor(0x3f, 0x2c, 0x36)
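The rule for deciding which type applies can be sketched against the w:color element directly. The following uses only the standard library; the function name color_type() and its string labels are hypothetical stand-ins for the MSO_COLOR_TYPE members, and the precedence of w:themeColor over w:val follows the description above.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'


def color_type(rPr):
    """Classify the w:color child of a w:rPr element as a string label."""
    color = rPr.find('w:color', {'w': W})
    if color is None:
        return None  # no directly-applied color
    if color.get('{%s}themeColor' % W) is not None:
        return 'THEME'  # any w:val present is only a best-guess RGB value
    if color.get('{%s}val' % W) == 'auto':
        return 'AUTO'
    return 'RGB'


def make_rPr(inner):
    """Build a w:rPr element around the given child XML (test helper)."""
    return ET.fromstring('<w:rPr xmlns:w="%s">%s</w:rPr>' % (W, inner))


print(color_type(make_rPr('')))                           # None
print(color_type(make_rPr('<w:color w:val="3F2C36"/>')))  # RGB
print(color_type(make_rPr('<w:color w:val="auto"/>')))    # AUTO
print(color_type(make_rPr(
    '<w:color w:val="548DD4" w:themeColor="text2"/>')))   # THEME
```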
Assigning an RGBColor value to ColorFormat.rgb causes ColorFormat.type to become MSO_COLOR_TYPE.RGB:
>>> font.color.type
None
>>> font.color.rgb = RGBColor(0x3f, 0x2c, 0x36)
>>> font.color.type
RGB (1)
>>> font.color.rgb
RGBColor(0x3f, 0x2c, 0x36)
ColorFormat.theme_color returns a member of MSO_THEME_COLOR_INDEX when type is MSO_COLOR_TYPE.THEME:
>>> font.color.type
THEME (2)
>>> font.color.theme_color
ACCENT_1 (5)
Assigning a member of MSO_THEME_COLOR_INDEX to ColorFormat.theme_color causes ColorFormat.type to become MSO_COLOR_TYPE.THEME:
>>> font.color.type
RGB (1)
>>> font.color.theme_color = MSO_THEME_COLOR.ACCENT_2
>>> font.color.type
THEME (2)
>>> font.color.theme_color
ACCENT_2 (6)
The ColorFormat.brightness attribute can be used to select a tint or shade of a theme color. Assigning the
value 0.1 produces a color 10% brighter (a tint); assigning -0.1 produces a color 10% darker (a shade):
>>> font.color.type
None
>>> font.color.brightness
0.0
>>> font.color.brightness = 0.4
ValueError: not a theme color
>>> font.color.theme_color = MSO_THEME_COLOR.TEXT_1
>>> font.color.brightness = 0.4
>>> font.color.brightness
0.4
<w:r>
<w:rPr>
<w:color w:val="548DD4" w:themeColor="text2" w:themeTint="99"/>
</w:rPr>
<w:t>Theme color with 40% tint.</w:t>
</w:r>
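The specimen above pairs a 40% tint with w:themeTint="99". Since 0x99 is 153, or 60% of 255, the attribute appears to store the fraction of the original color that remains. The sketch below encodes that reading; it is an assumption inferred from this one specimen rather than a documented conversion, and brightness_to_attr() is a hypothetical helper name.

```python
def brightness_to_attr(brightness):
    """Map a brightness in [-1.0, 1.0] to an (attribute name, hex value) pair.

    Returns None for a brightness of zero, where neither attribute is needed.
    """
    if not -1.0 <= brightness <= 1.0:
        raise ValueError('brightness must be between -1.0 and 1.0')
    if brightness == 0:
        return None
    # positive values lighten (tint); negative values darken (shade)
    name = 'themeTint' if brightness > 0 else 'themeShade'
    value = round((1.0 - abs(brightness)) * 255)
    return name, '%02X' % value


print(brightness_to_attr(0.4))   # ('themeTint', '99')
print(brightness_to_attr(-0.2))  # ('themeShade', 'CC')
```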
Schema excerpt
<xsd:complexType name="CT_RPr"> <!-- denormalized -->
<xsd:sequence>
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element name="rStyle" type="CT_String"/>
<xsd:element name="rFonts" type="CT_Fonts"/>
<xsd:element name="b" type="CT_OnOff"/>
<xsd:element name="bCs" type="CT_OnOff"/>
<xsd:element name="i" type="CT_OnOff"/>
<xsd:element name="iCs" type="CT_OnOff"/>
<xsd:element name="caps" type="CT_OnOff"/>
<xsd:element name="smallCaps" type="CT_OnOff"/>
<xsd:element name="strike" type="CT_OnOff"/>
<xsd:element name="dstrike" type="CT_OnOff"/>
<xsd:element name="outline" type="CT_OnOff"/>
<xsd:element name="shadow" type="CT_OnOff"/>
<xsd:element name="emboss" type="CT_OnOff"/>
<xsd:element name="imprint" type="CT_OnOff"/>
<xsd:element name="noProof" type="CT_OnOff"/>
<xsd:element name="snapToGrid" type="CT_OnOff"/>
<xsd:element name="vanish" type="CT_OnOff"/>
<xsd:element name="webHidden" type="CT_OnOff"/>
<xsd:element name="color" type="CT_Color"/>
<xsd:element name="spacing" type="CT_SignedTwipsMeasure"/>
<xsd:element name="w" type="CT_TextScale"/>
<xsd:element name="kern" type="CT_HpsMeasure"/>
<xsd:element name="position" type="CT_SignedHpsMeasure"/>
<xsd:element name="sz" type="CT_HpsMeasure"/>
<xsd:element name="szCs" type="CT_HpsMeasure"/>
<xsd:element name="highlight" type="CT_Highlight"/>
<xsd:element name="u" type="CT_Underline"/>
<xsd:element name="effect" type="CT_TextEffect"/>
<xsd:element name="bdr" type="CT_Border"/>
<xsd:element name="shd" type="CT_Shd"/>
<xsd:element name="fitText" type="CT_FitText"/>
<xsd:element name="vertAlign" type="CT_VerticalAlignRun"/>
<xsd:element name="rtl" type="CT_OnOff"/>
<xsd:element name="cs" type="CT_OnOff"/>
<xsd:element name="em" type="CT_Em"/>
<xsd:element name="lang" type="CT_Language"/>
<xsd:element name="eastAsianLayout" type="CT_EastAsianLayout"/>
<xsd:element name="specVanish" type="CT_OnOff"/>
<xsd:element name="oMath" type="CT_OnOff"/>
</xsd:choice>
<xsd:element name="rPrChange" type="CT_RPrChange" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_Color">
<xsd:attribute name="val" type="ST_HexColor" use="required"/>
<xsd:attribute name="themeColor" type="ST_ThemeColor"/>
<xsd:attribute name="themeTint" type="ST_UcharHexNumber"/>
<xsd:attribute name="themeShade" type="ST_UcharHexNumber"/>
</xsd:complexType>
Underline
Enumerations
WdUnderline Enumeration on MSDN
Specimen XML
Baseline run:
<w:r>
<w:t>underlining determined by inheritance</w:t>
</w:r>
Single underline:
<w:r>
<w:rPr>
<w:u w:val="single"/>
</w:rPr>
<w:t>single underlined</w:t>
</w:r>
Double underline:
<w:r>
<w:rPr>
<w:u w:val="double"/>
</w:rPr>
<w:t>double underlined</w:t>
</w:r>
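One way to read these three specimens back is sketched below with the standard library. The helper name read_underline() and its return convention (None for inherited, False for no underline, True for single, otherwise the raw w:val string) are assumptions made for illustration, not the python-docx API.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'


def read_underline(r):
    """Return None, False, True, or the w:val string for a w:r element."""
    u = r.find('w:rPr/w:u', {'w': W})
    if u is None:
        return None  # underlining determined by inheritance
    val = u.get('{%s}val' % W)
    if val is None or val == 'none':
        return False  # element present but no underline requested
    if val == 'single':
        return True
    return val  # e.g. 'double', 'wave'


def make_run(rPr_inner):
    """Build a w:r element whose w:rPr holds the given child XML."""
    return ET.fromstring(
        '<w:r xmlns:w="%s"><w:rPr>%s</w:rPr></w:r>' % (W, rPr_inner)
    )


print(read_underline(make_run('')))                       # None
print(read_underline(make_run('<w:u w:val="single"/>')))  # True
print(read_underline(make_run('<w:u w:val="double"/>')))  # double
print(read_underline(make_run('<w:u/>')))                 # False
```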
Schema excerpt
Note that the w:val attribute on CT_Underline is optional. When it is not present, no underline appears on the run.
<xsd:complexType name="CT_R"> <!-- flattened for readability -->
<xsd:sequence>
<xsd:element name="rPr" type="CT_RPr" minOccurs="0"/>
<xsd:group ref="EG_RunInnerContent" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/>
<xsd:attribute name="rsidDel" type="ST_LongHexNumber"/>
<xsd:attribute name="rsidR" type="ST_LongHexNumber"/>
</xsd:complexType>
<xsd:complexType name="CT_RPr"> <!-- flattened for readability -->
<xsd:sequence>
<xsd:group ref="EG_RPrBase" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="rPrChange" type="CT_RPrChange" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:group name="EG_RPrBase">
<xsd:choice>
<xsd:element name="rStyle" type="CT_String"/>
<xsd:element name="b" type="CT_OnOff"/>
<xsd:element name="i" type="CT_OnOff"/>
<xsd:element name="color" type="CT_Color"/>
<xsd:element name="sz" type="CT_HpsMeasure"/>
<xsd:element name="u" type="CT_Underline"/>
<!-- 33 others -->
</xsd:choice>
</xsd:group>
<xsd:complexType name="CT_Underline">
<xsd:attribute name="val" type="ST_Underline"/>
<xsd:attribute name="color" type="ST_HexColor"/>
<xsd:attribute name="themeColor" type="ST_ThemeColor"/>
<xsd:attribute name="themeTint" type="ST_UcharHexNumber"/>
<xsd:attribute name="themeShade" type="ST_UcharHexNumber"/>
</xsd:complexType>
<xsd:simpleType name="ST_Underline">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="single"/>
<xsd:enumeration value="words"/>
<xsd:enumeration value="double"/>
<xsd:enumeration value="thick"/>
<xsd:enumeration value="dotted"/>
<xsd:enumeration value="dottedHeavy"/>
<xsd:enumeration value="dash"/>
<xsd:enumeration value="dashedHeavy"/>
<xsd:enumeration value="dashLong"/>
<xsd:enumeration value="dashLongHeavy"/>
<xsd:enumeration value="dotDash"/>
<xsd:enumeration value="dashDotHeavy"/>
<xsd:enumeration value="dotDotDash"/>
<xsd:enumeration value="dashDotDotHeavy"/>
<xsd:enumeration value="wave"/>
<xsd:enumeration value="wavyHeavy"/>
<xsd:enumeration value="wavyDouble"/>
<xsd:enumeration value="none"/>
</xsd:restriction>
</xsd:simpleType>
Run-level content
A run is the object most closely associated with inline content: text, pictures, and other items that are flowed between the block-item boundaries within a paragraph.
Main content child elements:
<w:t>
<w:br>
<w:drawing>
<w:tab>
<w:cr>
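To make the role of these child elements concrete, the sketch below walks a run and renders its text-bearing children as a plain string. It uses only the standard library; the function name run_text() and the choice to render every w:br or w:cr as a newline are simplifying assumptions.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'


def run_text(r):
    """Concatenate a run's text, rendering tabs and breaks as control characters."""
    parts = []
    for child in r:
        tag = child.tag.replace('{%s}' % W, '')
        if tag == 't':
            parts.append(child.text or '')
        elif tag == 'tab':
            parts.append('\t')
        elif tag in ('br', 'cr'):
            parts.append('\n')
        # w:rPr, w:drawing and other children contribute no text here
    return ''.join(parts)


r = ET.fromstring(
    '<w:r xmlns:w="%s"><w:t>Text before</w:t><w:br/>'
    '<w:t>and after line break</w:t></w:r>' % W
)
print(repr(run_text(r)))  # 'Text before\nand after line break'
```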
Schema excerpt
<xsd:complexType name="CT_R"> <!-- denormalized -->
<xsd:sequence>
<xsd:element name="rPr" type="CT_RPr" minOccurs="0"/>
<xsd:group ref="EG_RunInnerContent" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/>
<xsd:attribute name="rsidDel" type="ST_LongHexNumber"/>
<xsd:attribute name="rsidR" type="ST_LongHexNumber"/>
</xsd:complexType>
<xsd:group name="EG_RunInnerContent">
<xsd:choice>
<xsd:element name="t" type="CT_Text"/>
<xsd:element name="br" type="CT_Br"/>
<xsd:element name="cr" type="CT_Empty"/>
<xsd:element name="tab" type="CT_Empty"/>
<xsd:element name="ptab" type="CT_PTab"/>
<xsd:element name="sym" type="CT_Sym"/>
<xsd:element name="noBreakHyphen" type="CT_Empty"/>
<xsd:element name="softHyphen" type="CT_Empty"/>
<xsd:element name="fldChar" type="CT_FldChar"/>
<xsd:element name="drawing" type="CT_Drawing"/>
<xsd:element name="object" type="CT_Object"/>
<xsd:element name="footnoteReference" type="CT_FtnEdnRef"/>
<xsd:element name="footnoteRef" type="CT_Empty"/>
<xsd:element name="endnoteReference" type="CT_FtnEdnRef"/>
<xsd:element name="endnoteRef" type="CT_Empty"/>
<xsd:element name="separator" type="CT_Empty"/>
<xsd:element name="continuationSeparator" type="CT_Empty"/>
<xsd:element name="commentReference" type="CT_Markup"/>
<xsd:element name="annotationRef" type="CT_Empty"/>
<xsd:element name="contentPart" type="CT_Rel"/>
<xsd:element name="delText" type="CT_Text"/>
<xsd:element name="instrText" type="CT_Text"/>
<xsd:element name="delInstrText" type="CT_Text"/>
<xsd:element name="dayShort" type="CT_Empty"/>
<xsd:element name="monthShort" type="CT_Empty"/>
<xsd:element name="yearShort" type="CT_Empty"/>
<xsd:element name="dayLong" type="CT_Empty"/>
<xsd:element name="monthLong" type="CT_Empty"/>
<xsd:element name="yearLong" type="CT_Empty"/>
<xsd:element name="pgNum" type="CT_Empty"/>
<xsd:element name="pict" type="CT_Picture"/>
<xsd:element name="ruby" type="CT_Ruby"/>
<xsd:element name="lastRenderedPageBreak" type="CT_Empty"/>
</xsd:choice>
</xsd:group>
<xsd:complexType name="CT_Empty"/>
Breaks
Word supports a variety of breaks that interrupt the flow of text in the document:
line break
page break
column break
section break (new page, even page, odd page)
In addition, a page break can be forced by formatting a paragraph with the page break before setting.
This analysis is limited to line, page, and column breaks. A section break is implemented using a completely different
set of elements and is covered separately.
Candidate protocol: run.add_break()
The following interactive session demonstrates the protocol for adding a break:
>>> run = p.add_run()
>>> run.breaks
[]
>>> run.add_break() # by default adds WD_BREAK.LINE
>>> run.breaks
[<docx.text.Break object at 0x10a7c4f50>]
>>> run.breaks[0].type.__name__
WD_BREAK.LINE
>>> run.add_break(WD_BREAK.LINE)
>>> run.breaks
[<docx.text.Break object at 0x10a7c4f50>, <docx.text.Break object at 0x10a7c4f58>]
>>> run.add_break(WD_BREAK.PAGE)
>>> run.add_break(WD_BREAK.COLUMN)
>>> run.add_break(WD_BREAK.LINE_CLEAR_LEFT)
>>> run.add_break(WD_BREAK.LINE_CLEAR_RIGHT)
>>> run.add_break(WD_BREAK.TEXT_WRAPPING)
Enumeration WD_BREAK_TYPE
WD_BREAK.LINE
WD_BREAK.LINE_CLEAR_LEFT
WD_BREAK.LINE_CLEAR_RIGHT
WD_BREAK.TEXT_WRAPPING (e.g. LINE_CLEAR_ALL)
WD_BREAK.PAGE
WD_BREAK.COLUMN
WD_BREAK.SECTION_NEXT_PAGE
WD_BREAK.SECTION_CONTINUOUS
WD_BREAK.SECTION_EVEN_PAGE
WD_BREAK.SECTION_ODD_PAGE
Specimen XML
Line break
This XML is produced by Word after inserting a line feed with Shift-Enter:
<w:p>
<w:r>
<w:t>Text before</w:t>
</w:r>
<w:r>
<w:br/>
<w:t>and after line break</w:t>
</w:r>
</w:p>
Word loads this more straightforward generation just fine, although it changes it back on next save. I'm not sure of the advantage in creating a fresh run such that the <w:br/> element is the first child:
<w:p>
<w:r>
<w:t>Text before</w:t>
<w:br/>
<w:t>and after line break</w:t>
</w:r>
</w:p>
<w:lastRenderedPageBreak/>
<w:t>Text after an intra-run page break</w:t>
</w:r>
</w:p>
<w:p>
<w:r>
<w:t>following paragraph</w:t>
</w:r>
</w:p>
Schema excerpt
<xsd:complexType name="CT_R">
<xsd:sequence>
<xsd:group ref="EG_RPr" minOccurs="0"/>
<xsd:group ref="EG_RunInnerContent" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/>
<xsd:attribute name="rsidDel" type="ST_LongHexNumber"/>
<xsd:attribute name="rsidR" type="ST_LongHexNumber"/>
</xsd:complexType>
<xsd:group name="EG_RunInnerContent">
<xsd:choice>
<xsd:element name="br" type="CT_Br"/>
<xsd:element name="t" type="CT_Text"/>
<xsd:element name="contentPart" type="CT_Rel"/>
<xsd:element name="delText" type="CT_Text"/>
<xsd:element name="instrText" type="CT_Text"/>
<xsd:element name="delInstrText" type="CT_Text"/>
<xsd:element name="noBreakHyphen" type="CT_Empty"/>
<xsd:element name="softHyphen" type="CT_Empty"/>
<xsd:element name="dayShort" type="CT_Empty"/>
<xsd:element name="monthShort" type="CT_Empty"/>
<xsd:element name="yearShort" type="CT_Empty"/>
<xsd:element name="dayLong" type="CT_Empty"/>
<xsd:element name="monthLong" type="CT_Empty"/>
<xsd:element name="yearLong" type="CT_Empty"/>
<xsd:element name="annotationRef" type="CT_Empty"/>
<xsd:element name="footnoteRef" type="CT_Empty"/>
<xsd:element name="endnoteRef" type="CT_Empty"/>
<xsd:element name="separator" type="CT_Empty"/>
<xsd:element name="continuationSeparator" type="CT_Empty"/>
<xsd:element name="sym" type="CT_Sym"/>
<xsd:element name="pgNum" type="CT_Empty"/>
<xsd:element name="cr" type="CT_Empty"/>
<xsd:element name="tab" type="CT_Empty"/>
<xsd:element name="object" type="CT_Object"/>
<xsd:element name="pict" type="CT_Picture"/>
<xsd:element name="fldChar" type="CT_FldChar"/>
<xsd:element name="ruby" type="CT_Ruby"/>
<xsd:element name="footnoteReference" type="CT_FtnEdnRef"/>
<xsd:element name="endnoteReference" type="CT_FtnEdnRef"/>
<xsd:element name="commentReference" type="CT_Markup"/>
<xsd:element name="drawing" type="CT_Drawing"/>
<xsd:element name="ptab" type="CT_PTab"/>
<xsd:element name="lastRenderedPageBreak" type="CT_Empty"/>
</xsd:choice>
</xsd:group>
<xsd:complexType name="CT_Br">
<xsd:attribute name="type" type="ST_BrType"/>
<xsd:attribute name="clear" type="ST_BrClear"/>
</xsd:complexType>
<xsd:simpleType name="ST_BrType">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="page"/>
<xsd:enumeration value="column"/>
<xsd:enumeration value="textWrapping"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_BrClear">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="none"/>
<xsd:enumeration value="left"/>
<xsd:enumeration value="right"/>
<xsd:enumeration value="all"/>
</xsd:restriction>
</xsd:simpleType>
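The break kinds in the candidate protocol can be related to the CT_Br attributes in the schema above roughly as follows. This is a sketch using only the standard library: the table BREAK_ATTRS, the plain string keys standing in for the WD_BREAK members, and this add_break() function are all assumptions for illustration, not the library's implementation.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'

# break kind -> (w:type, w:clear); None means the attribute is omitted
BREAK_ATTRS = {
    'LINE': (None, None),
    'PAGE': ('page', None),
    'COLUMN': ('column', None),
    'LINE_CLEAR_LEFT': ('textWrapping', 'left'),
    'LINE_CLEAR_RIGHT': ('textWrapping', 'right'),
    'TEXT_WRAPPING': ('textWrapping', 'all'),
}


def add_break(r, kind='LINE'):
    """Append a w:br child of the requested kind to run element `r`."""
    type_, clear = BREAK_ATTRS[kind]
    br = ET.SubElement(r, '{%s}br' % W)
    if type_ is not None:
        br.set('{%s}type' % W, type_)
    if clear is not None:
        br.set('{%s}clear' % W, clear)
    return br


r = ET.Element('{%s}r' % W)
line_br = add_break(r)                     # plain <w:br/>, no attributes
page_br = add_break(r, 'PAGE')             # w:type="page"
left_br = add_break(r, 'LINE_CLEAR_LEFT')  # w:type="textWrapping" w:clear="left"
print(len(r))  # 3
```

Section breaks are left out of the table because, as noted earlier, they are implemented with a different set of elements.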
Resources
WdBreakType Enumeration on MSDN
Range.InsertBreak Method (Word) on MSDN
Relevant sections in the ISO Spec
17.18.3 ST_BrClear (Line Break Text Wrapping Restart Location)
Styles
Styles collection
>>> styles['undefined-style']
KeyError: no style with id or name 'undefined-style'
Styles.add_style():
>>> style = styles.add_style('Citation', WD_STYLE_TYPE.PARAGRAPH)
>>> style.name
'Citation'
>>> style.type
PARAGRAPH (1)
>>> style.builtin
False
Feature Notes
could add a default builtin style from known specs on first access via WD_BUILTIN_STYLE enumeration:
>>> style = document.styles['Heading1']
KeyError: no style with id or name 'Heading1'
>>> style = document.styles[WD_STYLE.HEADING_1]
>>> assert style == document.styles['Heading1']
Example XML
<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<w:styles
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
mc:Ignorable="w14"
>
<w:docDefaults>
<w:rPrDefault>
<w:rPr>
<w:rFonts w:asciiTheme="minorHAnsi" w:eastAsiaTheme="minorEastAsia"
w:hAnsiTheme="minorHAnsi" w:cstheme="minorBidi"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
<w:lang w:val="en-US" w:eastAsia="en-US" w:bidi="ar-SA"/>
</w:rPr>
</w:rPrDefault>
<w:pPrDefault/>
</w:docDefaults>
<w:latentStyles w:defLockedState="0" w:defUIPriority="99" w:defSemiHidden="1"
w:defUnhideWhenUsed="1" w:defQFormat="0" w:count="276">
<w:lsdException w:name="Normal" w:semiHidden="0" w:uiPriority="0"
w:unhideWhenUsed="0" w:qFormat="1"/>
<w:lsdException w:name="heading 1" w:semiHidden="0" w:uiPriority="9"
w:unhideWhenUsed="0" w:qFormat="1"/>
<w:lsdException w:name="heading 2" w:uiPriority="9" w:qFormat="1"/>
<w:lsdException w:name="Default Paragraph Font" w:uiPriority="1"/>
</w:latentStyles>
<w:style w:type="paragraph" w:default="1" w:styleId="Normal">
<w:name w:val="Normal"/>
<w:qFormat/>
</w:style>
<w:style w:type="character" w:default="1" w:styleId="DefaultParagraphFont">
<w:name w:val="Default Paragraph Font"/>
<w:uiPriority w:val="1"/>
<w:semiHidden/>
<w:unhideWhenUsed/>
</w:style>
<w:style w:type="table" w:default="1" w:styleId="TableNormal">
<w:name w:val="Normal Table"/>
<w:uiPriority w:val="99"/>
<w:semiHidden/>
<w:unhideWhenUsed/>
<w:tblPr>
<w:tblInd w:w="0" w:type="dxa"/>
<w:tblCellMar>
<w:top w:w="0" w:type="dxa"/>
<w:left w:w="108" w:type="dxa"/>
<w:bottom w:w="0" w:type="dxa"/>
<w:right w:w="108" w:type="dxa"/>
</w:tblCellMar>
</w:tblPr>
</w:style>
<w:style w:type="numbering" w:default="1" w:styleId="NoList">
<w:name w:val="No List"/>
<w:uiPriority w:val="99"/>
<w:semiHidden/>
<w:unhideWhenUsed/>
</w:style>
<w:style w:type="paragraph" w:customStyle="1" w:styleId="Foobar">
<w:name w:val="Foobar"/>
<w:qFormat/>
<w:rsid w:val="004B54E0"/>
</w:style>
</w:styles>
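The lookup by id or name shown in the Styles collection protocol can be sketched against XML like the example above. The following uses only the standard library and a trimmed copy of that XML; get_style() is a hypothetical helper, not the library's implementation.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
NS = {'w': W}

STYLES_XML = (
    '<w:styles xmlns:w="%s">'
    '<w:style w:type="paragraph" w:default="1" w:styleId="Normal">'
    '<w:name w:val="Normal"/></w:style>'
    '<w:style w:type="table" w:default="1" w:styleId="TableNormal">'
    '<w:name w:val="Normal Table"/></w:style>'
    '</w:styles>' % W
)


def get_style(styles, key):
    """Return the w:style element whose styleId or name equals `key`."""
    for style in styles.findall('w:style', NS):
        if style.get('{%s}styleId' % W) == key:
            return style
        name = style.find('w:name', NS)
        if name is not None and name.get('{%s}val' % W) == key:
            return style
    raise KeyError("no style with id or name '%s'" % key)


styles = ET.fromstring(STYLES_XML)
by_id = get_style(styles, 'TableNormal')
by_name = get_style(styles, 'Normal Table')
print(by_id is by_name)  # True
```

Looking up a key that matches neither attribute, such as 'undefined-style', raises KeyError in this sketch, mirroring the session shown earlier.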
Schema excerpt
<xsd:complexType name="CT_Styles">
<xsd:sequence>
<xsd:element name="docDefaults" type="CT_DocDefaults" minOccurs="0"/>
<xsd:element name="latentStyles" type="CT_LatentStyles" minOccurs="0"/>
<xsd:element name="style"
type="CT_Style"
minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_DocDefaults">
<xsd:sequence>
<xsd:element name="rPrDefault" type="CT_RPrDefault" minOccurs="0"/>
<xsd:element name="pPrDefault" type="CT_PPrDefault" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_LatentStyles">
<xsd:sequence>
<xsd:element name="lsdException" type="CT_LsdException" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="defLockedState" type="s:ST_OnOff"/>
<xsd:attribute name="defUIPriority" type="ST_DecimalNumber"/>
<xsd:attribute name="defSemiHidden" type="s:ST_OnOff"/>
<xsd:attribute name="defUnhideWhenUsed" type="s:ST_OnOff"/>
<xsd:attribute name="defQFormat" type="s:ST_OnOff"/>
<xsd:attribute name="count" type="ST_DecimalNumber"/>
</xsd:complexType>
<xsd:complexType name="CT_Style">
<xsd:sequence>
<xsd:element name="name" type="CT_String" minOccurs="0"/>
<xsd:element name="aliases" type="CT_String" minOccurs="0"/>
<xsd:element name="basedOn" type="CT_String" minOccurs="0"/>
<xsd:element name="next" type="CT_String" minOccurs="0"/>
<xsd:element name="link" type="CT_String" minOccurs="0"/>
<xsd:element name="autoRedefine" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="hidden" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="uiPriority" type="CT_DecimalNumber" minOccurs="0"/>
<xsd:element name="semiHidden" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="unhideWhenUsed" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="qFormat" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="locked" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personal" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personalCompose" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personalReply" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="rsid" type="CT_LongHexNumber" minOccurs="0"/>
<xsd:element name="pPr" type="CT_PPrGeneral" minOccurs="0"/>
<xsd:element name="rPr" type="CT_RPr" minOccurs="0"/>
<xsd:element name="tblPr" type="CT_TblPrBase" minOccurs="0"/>
<xsd:element name="trPr" type="CT_TrPr" minOccurs="0"/>
<xsd:element name="tcPr" type="CT_TcPr" minOccurs="0"/>
<xsd:element name="tblStylePr" type="CT_TblStylePr" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="type" type="ST_StyleType"/>
<xsd:attribute name="styleId" type="s:ST_String"/>
<xsd:attribute name="default" type="s:ST_OnOff"/>
<xsd:attribute name="customStyle" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_OnOff">
<xsd:attribute name="val" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_String">
<xsd:attribute name="val" type="s:ST_String" use="required"/>
</xsd:complexType>
<xsd:simpleType name="ST_OnOff">
<xsd:union memberTypes="xsd:boolean ST_OnOff1"/>
</xsd:simpleType>
<xsd:simpleType name="ST_OnOff1">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="on"/>
<xsd:enumeration value="off"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_StyleType">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="paragraph"/>
<xsd:enumeration value="character"/>
<xsd:enumeration value="table"/>
<xsd:enumeration value="numbering"/>
</xsd:restriction>
</xsd:simpleType>
Style objects
A style is one of four types: character, paragraph, table, or numbering. All style objects have behavioral properties
and formatting properties. The set of formatting properties varies depending on the style type. In general, formatting
properties are inherited along this hierarchy: character -> paragraph -> table. A numbering style has no formatting
properties and does not inherit.
Behavioral properties
There are six behavioral properties:
hidden: Style operates to assign formatting properties, but does not appear in the UI under any circumstances. Used
for internal styles assigned by an application that should not be under the control of an end-user.
priority: Determines the sort order of the style in sequences presented by the UI.
semi-hidden: The style is hidden from the so-called "main" user interface. In Word this means the recommended list
and the style gallery. The style still appears in the all-styles list.
unhide_when_used: Flag to the application to set semi-hidden False when the style is next used.
quick_style: Show the style in the style gallery when it is not hidden.
locked: Style is hidden and cannot be applied when document formatting protection is active.
hidden
The hidden attribute doesn't work on built-in styles and its behavior on custom styles is spotty. Skipping this
attribute for now. Will reconsider if someone requests it and can provide a specific use case.
Behavior
Scope. hidden doesn't work at all on the Normal or Heading 1 styles. It doesn't work on Salutation either.
There is no w:defHidden attribute on w:latentStyles, lending credence to the hypothesis that it is not enabled for
built-in styles. Hypothesis: doesn't work on built-in styles.
UI behavior. A custom style having w:hidden set True is hidden from the gallery and all styles pane lists. It does,
however, appear in the "Current style of selected text" box in the styles pane when the cursor is on a paragraph of that
style. The style can be modified by the user from this current style UI element. The user can assign a new style to a
paragraph having a hidden style.
priority
The priority attribute is the integer primary sort key determining the position of a style in a UI list. The
secondary sort is alphabetical by name. Negative values are valid, although they are not assigned by Word itself and
appear to be treated as 0.
Behavior
Default. Word behavior appears to default priority to 0 for custom styles. The spec indicates the effective
default value is conceptually infinity, such that the style appears at the end of the styles list, presumably alphabetically
among other styles having no priority assigned.
Candidate protocol
>>> style = document.styles['Foobar']
>>> style.priority
None
>>> style.priority = 7
>>> style.priority
7
>>> style.priority = -42
>>> style.priority
0
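The sort rule described for priority (primary key priority, secondary key name) can be sketched in plain Python. This is an illustration only and not python-docx code: the function name ui_sort_key and the sample data are hypothetical, treating a missing or negative priority as 0 follows the Word behavior observed above rather than the spec, and the case-insensitive name comparison is an assumption.

```python
def ui_sort_key(style):
    """Return a (priority, name) sort key for a (name, priority) pair.

    Assumption: a missing (None) or negative priority is treated as 0,
    matching the Word behavior observed above, not the spec default.
    """
    name, priority = style
    effective = 0 if priority is None or priority < 0 else priority
    # Assumption: the alphabetical sub-sort ignores case.
    return (effective, name.lower())


# Hypothetical sample data: (style name, priority)
styles = [
    ("Heading 1", 9),
    ("Foobar", None),     # custom style with no priority assigned
    ("Normal", 0),
    ("Citation", -42),    # negative values appear to be treated as 0
    ("Caption", 35),
]

ordered = [name for name, _ in sorted(styles, key=ui_sort_key)]
print(ordered)
```

Under these assumptions the three styles with an effective priority of 0 sort alphabetically ahead of Heading 1 and Caption.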
semi-hidden
The w:semiHidden element specifies visibility of the style in the so-called "main" user interface. For
Word, this means the style gallery and the recommended, styles-in-use, and in-current-document lists. The all-styles
list and current-style dropdown in the styles pane would then be considered part of an "advanced" user interface.
Behavior
Default. If the w:semiHidden element is omitted, its effective value is False. There is no inheritance of
this value.
Scope. Works on both built-in and custom styles.
Word behavior. Word does not use the @w:val attribute. It writes <w:semiHidden/> for True and omits the element
for False.
Candidate protocol
>>> style = document.styles['Foo']
>>> style.hidden
False
>>> style.hidden = True
>>> style.hidden
True
style.hidden = False:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
</w:style>
Alternate constructions should also report the proper value but not be used when writing XML:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:semiHidden w:val="0"/> <!-- style.hidden is False -->
</w:style>
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:semiHidden w:val="1"/> <!-- style.hidden is True -->
</w:style>
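The three constructions above (element omitted, bare element, element with w:val) can be read with one rule. The following is a minimal sketch using only the standard library rather than the python-docx oxml layer; the helper names on_off and parse_style are hypothetical, and the set of truthy strings is taken from the ST_OnOff definition in the schema excerpt.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def on_off(style_elm, child_name):
    """Return the boolean value of an on/off child such as w:semiHidden.

    Absent element -> False; present without w:val -> True; otherwise
    w:val is interpreted per ST_OnOff ("1", "true" and "on" are True).
    """
    child = style_elm.find("{%s}%s" % (W, child_name))
    if child is None:
        return False
    val = child.get("{%s}val" % W)
    if val is None:
        return True
    return val in ("1", "true", "on")


def parse_style(inner_xml):
    # Hypothetical helper: wrap a fragment so the w: prefix is declared.
    xml = '<w:style xmlns:w="%s" w:type="paragraph" w:styleId="Foo">%s</w:style>' % (
        W, inner_xml)
    return ET.fromstring(xml)


omitted = parse_style('<w:name w:val="Foo"/>')
bare = parse_style('<w:name w:val="Foo"/><w:semiHidden/>')
explicit_off = parse_style('<w:name w:val="Foo"/><w:semiHidden w:val="0"/>')
explicit_on = parse_style('<w:name w:val="Foo"/><w:semiHidden w:val="1"/>')

results = [on_off(s, "semiHidden") for s in (omitted, bare, explicit_off, explicit_on)]
print(results)
```

The same reader applies unchanged to w:unhideWhenUsed, w:qFormat and w:locked, since all four are CT_OnOff elements.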
unhide-when-used
The w:unhideWhenUsed element signals an application that this style should be made visible
the next time it is used.
Behavior
Default. If the w:unhideWhenUsed element is omitted, its effective value is False. There is no inheritance of this value.
Word behavior. The w:unhideWhenUsed element is not changed or removed when the style is next used. Only the
w:semiHidden element is affected, if present. Presumably this is so a style can be re-hidden, to be unhidden on the
subsequent use.
Note that this behavior in Word is only triggered by a user actually applying a style. Merely loading a document
having the style applied somewhere in its contents does not cause the w:semiHidden element to be removed.
Candidate protocol
>>> style = document.styles['Foo']
>>> style.unhide_when_used
False
>>> style.unhide_when_used = True
>>> style.unhide_when_used
True
style.unhide_when_used = False:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
</w:style>
Alternate constructions should also report the proper value but not be used when writing XML:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:unhideWhenUsed w:val="0"/> <!-- style.unhide_when_used is False -->
</w:style>
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:unhideWhenUsed w:val="1"/> <!-- style.unhide_when_used is True -->
</w:style>
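The Word behavior described for unhide-when-used (only semi-hidden is cleared, and only when a user applies the style) can be modeled in a few lines. This is a toy simulation of the observed behavior, not python-docx or Word code; the StyleFlags class and both functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class StyleFlags:
    """Hypothetical stand-in for two behavioral flags of a style."""
    semi_hidden: bool = False
    unhide_when_used: bool = False


def apply_style_in_ui(flags):
    """Model a user applying the style in Word.

    Only semi_hidden is cleared; unhide_when_used is left untouched.
    """
    if flags.unhide_when_used:
        flags.semi_hidden = False
    return flags


def load_document(flags):
    """Model loading a document that already uses the style: no change."""
    return flags


flags = StyleFlags(semi_hidden=True, unhide_when_used=True)

load_document(flags)
after_load = (flags.semi_hidden, flags.unhide_when_used)

apply_style_in_ui(flags)
after_apply = (flags.semi_hidden, flags.unhide_when_used)

print(after_load, after_apply)
```

Because unhide_when_used survives the first application, setting semi_hidden back to True later re-arms the same hide-until-used behavior.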
quick-style
The w:qFormat element specifies whether Word should display this style in the style gallery. In order to
appear in the gallery, this attribute must be True and hidden must be False.
Behavior
Default. If the w:qFormat element is omitted, its effective value is False. There is no inheritance of this
value.
Word behavior. If w:qFormat is True and the style is not hidden, it will appear in the gallery in the order specified
by w:uiPriority.
Candidate protocol
>>> style = document.styles['Foo']
>>> style.quick_style
False
>>> style.quick_style = True
>>> style.quick_style
True
style.quick_style = False:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
</w:style>
Alternate constructions should also report the proper value but not be used when writing XML:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:qFormat w:val="0"/> <!-- style.quick_style is False -->
</w:style>
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:qFormat w:val="1"/> <!-- style.quick_style is True -->
</w:style>
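The gallery rule for quick-style (quick_style True, hidden False, ordered by priority) can be expressed as a filter followed by a sort. This is a sketch over plain dictionaries, not python-docx objects; the key names and sample values are hypothetical, and a missing priority is assumed to behave as 0, as observed for custom styles.

```python
def gallery_styles(styles):
    """Return the names of styles shown in the gallery, in display order.

    `styles` is a list of dicts with hypothetical keys 'name', 'hidden',
    'quick_style' and 'priority'.
    """
    visible = [s for s in styles if s["quick_style"] and not s["hidden"]]
    # Assumption: a missing priority (None) sorts as 0.
    visible.sort(key=lambda s: (s["priority"] or 0, s["name"]))
    return [s["name"] for s in visible]


styles = [
    {"name": "Normal", "hidden": False, "quick_style": True, "priority": 0},
    {"name": "Heading 1", "hidden": False, "quick_style": True, "priority": 9},
    {"name": "Default Paragraph Font", "hidden": True, "quick_style": False, "priority": 1},
    {"name": "Foobar", "hidden": False, "quick_style": True, "priority": None},
    {"name": "Caption", "hidden": True, "quick_style": True, "priority": 35},
]

gallery = gallery_styles(styles)
print(gallery)
```

Caption is excluded even though its quick_style flag is set, because a hidden style never reaches the gallery.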
locked
The w:locked element specifies whether Word should prevent this style from being applied to content. This
behavior is only active if formatting protection is turned on.
Behavior
Default. If the w:locked element is omitted, its effective value is False. There is no inheritance of this
value.
Candidate protocol
>>> style = document.styles['Foo']
>>> style.locked
False
>>> style.locked = True
>>> style.locked
True
style.locked = False:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
</w:style>
Alternate constructions should also report the proper value but not be used when writing XML:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:locked w:val="0"/> <!-- style.locked is False -->
</w:style>
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:locked w:val="1"/> <!-- style.locked is True -->
</w:style>
delete():
>>> len(styles)
6
>>> style.delete()
>>> len(styles)
5
>>> styles['Citation']
KeyError: no style with id or name 'Citation'
Style.base_style:
>>> style = styles.add_style('Citation', WD_STYLE_TYPE.PARAGRAPH)
>>> style.base_style
None
>>> style.base_style = styles['Normal']
>>> style.base_style
<docx.styles.style._ParagraphStyle object at 0x10a7a9550>
>>> style.base_style.name
'Normal'
Example XML
<w:styles>
<!-- ... -->
<w:style w:type="paragraph" w:default="1" w:styleId="Normal">
<w:name w:val="Normal"/>
<w:qFormat/>
</w:style>
Schema excerpt
<xsd:complexType name="CT_Style">
<xsd:sequence>
<xsd:element name="name" type="CT_String" minOccurs="0"/>
<xsd:element name="aliases" type="CT_String" minOccurs="0"/>
<xsd:element name="basedOn" type="CT_String" minOccurs="0"/>
<xsd:element name="next" type="CT_String" minOccurs="0"/>
<xsd:element name="link" type="CT_String" minOccurs="0"/>
<xsd:element name="autoRedefine" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="hidden" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="uiPriority" type="CT_DecimalNumber" minOccurs="0"/>
<xsd:element name="semiHidden" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="unhideWhenUsed" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="qFormat" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="locked" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personal" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personalCompose" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personalReply" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="rsid" type="CT_LongHexNumber" minOccurs="0"/>
<xsd:element name="pPr" type="CT_PPrGeneral" minOccurs="0"/>
<xsd:element name="rPr" type="CT_RPr" minOccurs="0"/>
<xsd:element name="tblPr" type="CT_TblPrBase" minOccurs="0"/>
<xsd:element name="trPr" type="CT_TrPr" minOccurs="0"/>
<xsd:element name="tcPr" type="CT_TcPr" minOccurs="0"/>
<xsd:element name="tblStylePr" type="CT_TblStylePr" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="type" type="ST_StyleType"/>
<xsd:attribute name="styleId" type="s:ST_String"/>
<xsd:attribute name="default" type="s:ST_OnOff"/>
<xsd:attribute name="customStyle" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_OnOff">
<xsd:attribute name="val" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_String">
<xsd:attribute name="val" type="s:ST_String" use="required"/>
</xsd:complexType>
<xsd:simpleType name="ST_OnOff">
<xsd:union memberTypes="xsd:boolean ST_OnOff1"/>
</xsd:simpleType>
<xsd:simpleType name="ST_OnOff1">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="on"/>
<xsd:enumeration value="off"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_StyleType">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="paragraph"/>
<xsd:enumeration value="character"/>
<xsd:enumeration value="table"/>
<xsd:enumeration value="numbering"/>
</xsd:restriction>
</xsd:simpleType>
Paragraph Style
A paragraph style provides character formatting (font) as well as paragraph formatting properties. Character formatting is inherited from _CharacterStyle and is predominantly embodied in the font property. Likewise, most
paragraph-specific properties come from the ParagraphFormat object available on the paragraph_format
property.
A handful of other properties are specific to a paragraph style.
next_paragraph_style
The next_paragraph_style property provides access to the style that will automatically be
assigned by Word to a new paragraph inserted after a paragraph with this style. This property is most useful for a style
that would normally appear only once in a sequence, such as a heading.
The default is to use the same style for an inserted paragraph. This addresses the most common case; for example, a
body paragraph having Body Text style would normally be followed by a paragraph of the same style.
Expected usage
The priority use case for this property is to provide a working style that can be assigned to a
paragraph. The property will always provide a valid paragraph style, defaulting to the current style whenever a more
specific one cannot be determined.
While this obscures some specifics of the situation from the API, it addresses the expected most common use case.
Developers needing to detect, for example, missing styles can readily use the oxml layer to inspect the XML, and
further features can be added if those use cases turn out to be more common than expected.
Behavior
Default. The default next paragraph style is the same paragraph style.
The default is used whenever the next paragraph style is not specified or is invalid, including these conditions:
No w:next child element is present
A style having the styleId specified in w:next/@w:val is not present in the document.
The style specified in w:next/@w:val is not a paragraph style.
In all these cases the current style (self ) is returned.
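The fallback rules above can be sketched against raw XML. The following uses only the standard library and is not the python-docx implementation; the function next_paragraph_style_id, the helper qn as written here, and the sample styles (Dangling, WrongType, Emph) are hypothetical and exist only to exercise each condition.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical styles part covering each fallback condition.
STYLES_XML = """
<w:styles xmlns:w="%s">
  <w:style w:type="paragraph" w:styleId="Foo">
    <w:name w:val="Foo"/>
    <w:next w:val="Bar"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="Bar">
    <w:name w:val="Bar"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="Dangling">
    <w:name w:val="Dangling"/>
    <w:next w:val="Missing"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="WrongType">
    <w:name w:val="WrongType"/>
    <w:next w:val="Emph"/>
  </w:style>
  <w:style w:type="character" w:styleId="Emph">
    <w:name w:val="Emph"/>
  </w:style>
</w:styles>
""" % W


def qn(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return "{%s}%s" % (W, tag)


def next_paragraph_style_id(styles_elm, style_id):
    """Return the styleId of the next paragraph style, or style_id itself."""
    by_id = {s.get(qn("styleId")): s for s in styles_elm.findall("w:style", NS)}
    style = by_id[style_id]
    next_elm = style.find("w:next", NS)
    if next_elm is None:                      # no w:next child element
        return style_id
    target = by_id.get(next_elm.get(qn("val")))
    if target is None:                        # referenced style not present
        return style_id
    if target.get(qn("type")) != "paragraph":  # not a paragraph style
        return style_id
    return target.get(qn("styleId"))


styles = ET.fromstring(STYLES_XML)
resolved = {sid: next_paragraph_style_id(styles, sid)
            for sid in ("Foo", "Bar", "Dangling", "WrongType")}
print(resolved)
```

Only Foo resolves to a different style; the other three fall back to themselves, one for each condition listed above.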
Example XML
paragraph_style.next_paragraph_style is styles['Bar']:
<w:style w:type="paragraph" w:styleId="Foo">
<w:name w:val="Foo"/>
<w:next w:val="Bar"/>
</w:style>
Schema excerpt
<xsd:complexType name="CT_Style">
<xsd:sequence>
<xsd:element name="name" type="CT_String" minOccurs="0"/>
<xsd:element name="aliases" type="CT_String" minOccurs="0"/>
<xsd:element name="basedOn" type="CT_String" minOccurs="0"/>
<xsd:element name="next" type="CT_String" minOccurs="0"/>
<xsd:element name="link" type="CT_String" minOccurs="0"/>
<xsd:element name="autoRedefine" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="hidden" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="uiPriority" type="CT_DecimalNumber" minOccurs="0"/>
<xsd:element name="semiHidden" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="unhideWhenUsed" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="qFormat" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="locked" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personal" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personalCompose" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="personalReply" type="CT_OnOff" minOccurs="0"/>
<xsd:element name="rsid" type="CT_LongHexNumber" minOccurs="0"/>
<xsd:element name="pPr" type="CT_PPrGeneral" minOccurs="0"/>
<xsd:element name="rPr" type="CT_RPr" minOccurs="0"/>
<xsd:element name="tblPr" type="CT_TblPrBase" minOccurs="0"/>
<xsd:element name="trPr" type="CT_TrPr" minOccurs="0"/>
<xsd:element name="tcPr" type="CT_TcPr" minOccurs="0"/>
<xsd:element name="tblStylePr" type="CT_TblStylePr" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="type" type="ST_StyleType"/>
<xsd:attribute name="styleId" type="s:ST_String"/>
<xsd:attribute name="default" type="s:ST_OnOff"/>
<xsd:attribute name="customStyle" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_String">
<xsd:attribute name="val" type="s:ST_String" use="required"/>
</xsd:complexType>
Character Style
Word allows a set of run-level properties to be given a name. The set of properties is called a character style. All the
settings may be applied to a run in a single action by setting the style of the run.
Protocol
There are two call protocols related to character style: getting and setting the character style of a run, and
specifying a style when creating a run.
Get run style:
>>> run = p.add_run()
>>> run.style
<docx.styles.style._CharacterStyle object at 0x1053ab5d0>
>>> run.style.name
'Default Paragraph Font'
Assigning None to Run.style causes any applied character style to be removed. A run without a character style
inherits the default character style of the document:
>>> run.style = None
>>> run.style.name
'Default Paragraph Font'
A style that appears in the Word user interface (UI) with one or more spaces in its name, such as "Subtle Emphasis",
will generally have a style ID with those spaces removed. In this example, "Subtle Emphasis" becomes "SubtleEmphasis":
<w:p>
<w:r>
<w:rPr>
<w:rStyle w:val="SubtleEmphasis"/>
</w:rPr>
<w:t>a few words in Subtle Emphasis style</w:t>
</w:r>
</w:p>
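The name-to-ID convention just described amounts to a one-line transformation. The helper below is a hypothetical illustration of that heuristic and not a python-docx function; the checked pairs are taken from the example XML in this analysis, where "Normal Table" (style ID TableNormal) shows that the rule does not hold for every built-in style.

```python
def style_id_from_name(name):
    """Guess a style ID by removing spaces from the UI name (heuristic only)."""
    return name.replace(" ", "")


# (UI name, styleId) pairs as they appear in the example styles.xml.
known_pairs = [
    ("Default Paragraph Font", "DefaultParagraphFont"),
    ("No List", "NoList"),
    ("Normal Table", "TableNormal"),   # counterexample to the heuristic
]

matches = {name: style_id_from_name(name) == style_id
           for name, style_id in known_pairs}

print(style_id_from_name("Subtle Emphasis"))
print(matches)
```

Because of exceptions like the last pair, looking a style up by name is more dependable than deriving its ID.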
Schema excerpt
<xsd:complexType name="CT_R"> <!-- flattened for readability -->
<xsd:sequence>
<xsd:element name="rPr" type="CT_RPr" minOccurs="0"/>
<xsd:group
ref="EG_RunInnerContent" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/>
<xsd:attribute name="rsidDel" type="ST_LongHexNumber"/>
<xsd:attribute name="rsidR" type="ST_LongHexNumber"/>
</xsd:complexType>
Latent Styles
A latent style definition is a stub style definition specifying behavioral (UI display) attributes for a built-in style.
Latent style collection
The latent style collection for a document is accessed using the latent_styles property
on Styles:
>>> latent_styles = document.styles.latent_styles
>>> latent_styles
<docx.styles.LatentStyles object at 0x1045dd550>
Iteration. LatentStyles should support iteration of contained _LatentStyle objects in document order.
Latent style access. A latent style can be accessed by name using dictionary-style notation.
len(). LatentStyles supports len(), reporting the number of _LatentStyle objects it contains.
LatentStyles properties
default_priority
XML semantics. According to ISO 29500, the default value if the w:defUIPriority attribute is
omitted is 99. 99 is explicitly set in the default Word styles.xml, so will generally be what one finds.
Protocol:
>>> # return None if attribute is omitted
>>> latent_styles.default_priority
None
>>> # but expect it will almost always be explicitly 99
>>> latent_styles.default_priority
99
>>> latent_styles.default_priority = 42
>>> latent_styles.default_priority
42
load_count
XML semantics. No default is stated in the spec. Don't allow assignment of None.
Protocol:
>>> latent_styles.load_count
276
>>> latent_styles.load_count = 242
>>> latent_styles.load_count
242
Boolean properties
There are four boolean properties that all share the same protocol:
default_to_hidden
default_to_locked
default_to_quick_style
default_to_unhide_when_used
XML semantics. Defaults to False if the attribute is omitted. However, the attribute should always be written
explicitly on update.
Protocol:
>>> latent_styles.default_to_hidden
False
>>> latent_styles.default_to_hidden = True
>>> latent_styles.default_to_hidden
True
Specimen XML
The w:latentStyles element used in the default Word 2011 template:
<w:latentStyles w:defLockedState="0" w:defUIPriority="99" w:defSemiHidden="1"
w:defUnhideWhenUsed="1" w:defQFormat="0" w:count="276">
_LatentStyle properties
>>> latent_style = latent_styles['Normal']
>>> latent_style.name
'Normal'
>>> latent_style.priority
None
>>> latent_style.priority = 10
>>> latent_style.priority
10
>>> latent_style.locked
None
>>> latent_style.locked = True
>>> latent_style.locked
True
>>> latent_style.quick_style
None
>>> latent_style.quick_style = True
>>> latent_style.quick_style
True
uiPriority. The uiPriority attribute acts as a sort key for sequencing style names in the user interface. Both the
lists in the styles panel and the Style Gallery are sensitive to this setting. Its effective value is 0 if not specified.
semiHidden. The semiHidden attribute causes the style to be excluded from the recommended list. The notion
of "semi" in this context is that while the style is hidden from the recommended list, it still appears in the "All
Styles" list. This attribute is removed on first application of the style if an unhideWhenUsed attribute set True
is also present.
unhideWhenUsed. The unhideWhenUsed attribute causes any semiHidden attribute to be removed when the
style is first applied to content. Word does not remove the semiHidden attribute just because there exists an object
in the document having that style. The unhideWhenUsed attribute is not removed along with the semiHidden
attribute when the style is applied.
The semiHidden and unhideWhenUsed attributes operate in combination to produce hide-until-used behavior.
Hypothesis. The persistence of the unhideWhenUsed attribute after removing the semiHidden attribute on first
application of the style is necessary to produce appropriate behavior in style inheritance situations. In that case,
the semiHidden attribute may be explicitly set to False to override an inherited value. Or it could allow the
semiHidden attribute to be re-set to True later while preserving the hide-until-used behavior.
qFormat. The qFormat attribute specifies whether the style should appear in the Style Gallery when it appears
in the recommended list. A style will never appear in the gallery unless it also appears in the recommended list.
Latent style attributes are only operative for latent styles. Once a style is defined, the attributes of the definition
exclusively determine style behavior; no attributes are inherited from its corresponding latent style definition.
Specimen XML
<w:latentStyles w:defLockedState="0" w:defUIPriority="99" w:defSemiHidden="1"
w:defUnhideWhenUsed="1" w:defQFormat="0" w:count="276">
<w:lsdException w:name="Normal" w:semiHidden="0" w:uiPriority="0"
w:unhideWhenUsed="0" w:qFormat="1"/>
<w:lsdException w:name="heading 1" w:semiHidden="0" w:uiPriority="9"
w:unhideWhenUsed="0" w:qFormat="1"/>
<w:lsdException w:name="caption" w:uiPriority="35" w:qFormat="1"/>
<w:lsdException w:name="Default Paragraph Font" w:uiPriority="1"/>
<w:lsdException w:name="Bibliography" w:uiPriority="37"/>
<w:lsdException w:name="TOC Heading" w:uiPriority="39" w:qFormat="1"/>
</w:latentStyles>
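The way the w:latentStyles defaults combine with a w:lsdException can be sketched with the standard library. This is an illustration under stated assumptions and not python-docx code: the function effective_latent_attrs and the DEFAULTS mapping are hypothetical, the XML is a shortened copy of the specimen above with a namespace declaration added, and values are returned as raw attribute strings without conversion.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

LATENT_XML = """
<w:latentStyles xmlns:w="%s" w:defLockedState="0" w:defUIPriority="99"
    w:defSemiHidden="1" w:defUnhideWhenUsed="1" w:defQFormat="0" w:count="276">
  <w:lsdException w:name="Normal" w:semiHidden="0" w:uiPriority="0"
      w:unhideWhenUsed="0" w:qFormat="1"/>
  <w:lsdException w:name="caption" w:uiPriority="35" w:qFormat="1"/>
</w:latentStyles>
""" % W

# lsdException attribute -> corresponding default attribute on latentStyles
DEFAULTS = {
    "locked": "defLockedState",
    "uiPriority": "defUIPriority",
    "semiHidden": "defSemiHidden",
    "unhideWhenUsed": "defUnhideWhenUsed",
    "qFormat": "defQFormat",
}


def qn(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return "{%s}%s" % (W, tag)


def effective_latent_attrs(latent_styles, name):
    """Return raw attribute strings for `name`; exceptions override defaults."""
    attrs = {key: latent_styles.get(qn(default))
             for key, default in DEFAULTS.items()}
    for exc in latent_styles.findall(qn("lsdException")):
        if exc.get(qn("name")) == name:
            for key in DEFAULTS:
                value = exc.get(qn(key))
                if value is not None:
                    attrs[key] = value
    return attrs


latent_styles = ET.fromstring(LATENT_XML)
caption = effective_latent_attrs(latent_styles, "caption")
unlisted = effective_latent_attrs(latent_styles, "Body Text")
print(caption)
print(unlisted)
```

A style with no exception entry, such as the unlisted name used here, simply receives every default.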
Schema excerpt
<xsd:complexType name="CT_Styles">
<xsd:sequence>
<xsd:element name="docDefaults" type="CT_DocDefaults" minOccurs="0"/>
<xsd:element name="latentStyles" type="CT_LatentStyles" minOccurs="0"/>
<xsd:element name="style"
type="CT_Style"
minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_LatentStyles">
<xsd:sequence>
<xsd:element name="lsdException" type="CT_LsdException" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="defLockedState"
type="s:ST_OnOff"/>
<xsd:attribute name="defUIPriority"
type="ST_DecimalNumber"/>
<xsd:attribute name="defSemiHidden"
type="s:ST_OnOff"/>
<xsd:attribute name="defUnhideWhenUsed" type="s:ST_OnOff"/>
<xsd:attribute name="defQFormat"
type="s:ST_OnOff"/>
<xsd:attribute name="count"
type="ST_DecimalNumber"/>
</xsd:complexType>
<xsd:complexType name="CT_LsdException">
<xsd:attribute name="name"
type="s:ST_String"
use="required"/>
<xsd:attribute name="locked"
type="s:ST_OnOff"/>
<xsd:attribute name="uiPriority"
type="ST_DecimalNumber"/>
<xsd:attribute name="semiHidden"
type="s:ST_OnOff"/>
<xsd:attribute name="unhideWhenUsed" type="s:ST_OnOff"/>
<xsd:attribute name="qFormat"
type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_OnOff">
<xsd:attribute name="val" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_String">
<xsd:attribute name="val" type="s:ST_String" use="required"/>
</xsd:complexType>
<xsd:simpleType name="ST_OnOff">
<xsd:union memberTypes="xsd:boolean ST_OnOff1"/>
</xsd:simpleType>
<xsd:simpleType name="ST_OnOff1">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="on"/>
<xsd:enumeration value="off"/>
</xsd:restriction>
</xsd:simpleType>
Word supports the definition of styles to allow a group of formatting properties to be easily and consistently applied to
a paragraph, run, table, or numbering scheme, all at once. The mechanism is similar to how Cascading Style Sheets
(CSS) works with HTML.
Styles are defined in the styles.xml package part and are keyed to a paragraph, run, or table using the styleId
string.
Style visual behavior
Sort order. Built-in styles appear in order of the effective value of their uiPriority attribute. By default, a custom
style will not receive a uiPriority attribute, causing its effective value to default to 0. This will generally place
custom styles at the top of the sort order. A set of styles having the same uiPriority value will be sub-sorted in
alphabetical order.
If a uiPriority attribute is defined for a custom style, that style is interleaved with the built-in styles, according
to their uiPriority value. The uiPriority attribute takes a signed integer, and accepts negative numbers. Note
that Word does not allow the use of negative integers via its UI; rather it allows the uiPriority number of built-in
types to be increased to produce the desired sorting behavior.
Identification. A style is identified by its name, not its styleId attribute. The styleId is used only for internal
linking of an object like a paragraph to a style. The styleId may be changed by the application, and in fact is
routinely changed by Word on each save to be a transformation of the name.
Hypothesis. Word calculates the styleId by removing all spaces from the style name.
List membership. There are four style list options in the styles panel:
Recommended. The recommended list contains all latent and defined styles that have semiHidden ==
False.
Styles in Use. The styles-in-use list contains all styles that have been applied to content in the document
(implying they are defined) that also have semiHidden == False.
In Current Document. The in-current-document list contains all defined styles in the document having
semiHidden == False.
All Styles. The all-styles list contains all latent and defined styles in the document.
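The four membership rules above reduce to simple predicates over three facts about each style. The sketch below uses plain dictionaries rather than python-docx objects; the key names defined, in_use and semi_hidden and the sample data are hypothetical.

```python
def style_lists(styles):
    """Return the style names shown in each of the four styles-panel lists.

    Each style is a dict with hypothetical keys: 'name', 'defined'
    (has a w:style element), 'in_use' (applied to content) and
    'semi_hidden'.
    """
    def names(pred):
        return [s["name"] for s in styles if pred(s)]

    return {
        "Recommended": names(lambda s: not s["semi_hidden"]),
        "Styles in Use": names(lambda s: s["in_use"] and not s["semi_hidden"]),
        "In Current Document": names(lambda s: s["defined"] and not s["semi_hidden"]),
        "All Styles": names(lambda s: True),
    }


styles = [
    {"name": "Normal", "defined": True, "in_use": True, "semi_hidden": False},
    {"name": "Foobar", "defined": True, "in_use": False, "semi_hidden": False},
    {"name": "Default Paragraph Font", "defined": True, "in_use": True, "semi_hidden": True},
    # A latent style: known to Word but not defined in this document.
    {"name": "heading 1", "defined": False, "in_use": False, "semi_hidden": False},
]

lists = style_lists(styles)
for title, members in lists.items():
    print(title, members)
```

The semi-hidden style appears only under All Styles, and the latent style appears under Recommended and All Styles but in neither of the document-scoped lists.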
Definition of built-in style. When a built-in style is added to a document (upon first use), the value of each of
the locked, uiPriority and qFormat attributes from its latent style definition (the latentStyles attributes overridden
by those of any lsdException element) is used to override the corresponding value in the inserted style definition
from their built-in defaults.
Each built-in style has default attributes that can be revealed by setting the latentStyles/@count attribute to 0
and inspecting the style in the style manager. This may include default behavioral properties.
Anomaly. The style "No Spacing" does not appear in the recommended list even though its behavioral attributes
indicate it should. (Google indicates it may be a legacy style from Word 2003.)
Word has 267 built-in styles, listed here: http://www.thedoctools.com/downloads/DocTools_List_Of_Builtin_Style_English_Danish_German_French.pdf
Note that at least one other source has the number at 276 rather than 267.
Appearance in the Style Gallery. A style appears in the style gallery when: semiHidden == False and
qFormat == True
Glossary
built-in style: One of a set of standard styles known to Word, such as "Heading 1". Built-in styles are presented in
Word's style panel whether or not they are actually defined in the styles part.
latent style: A built-in style having no definition in a particular document is known as a latent style in that document.
style definition: A <w:style> element in the styles part that explicitly defines the attributes of a style.
recommended style list: A list of styles that appears in the styles toolbox or panel when "Recommended" is selected
from the "List:" dropdown box.
Word behavior
If no style having an assigned style id is defined in the styles part, the style application has no effect.
Word does not add a formatting definition (<w:style> element) for a built-in style until it is used.
Once present in the styles part, Word does not remove a built-in style definition if it is no longer applied to any content.
The definitions of all the styles ever used in a document accumulate in its styles.xml.
Related MS API (partial)
Document.Styles
Styles.Add, .Item, .Count, access by name, e.g. Styles("Foobar")
Style.BaseStyle
Style.Builtin
Style.Delete()
Style.Description
Style.Font
Style.Linked
Style.LinkStyle
Style.LinkToListTemplate()
Style.ListLevelNumber
Style.ListTemplate
Style.Locked
Style.NameLocal
Style.NameParagraphStyle
Style.NoSpaceBetweenParagraphsOfSameStyle
Style.ParagraphFormat
Style.Priority
Style.QuickStyle
Style.Shading
Style.Table(Style)
Style.Type
Style.UnhideWhenUsed
Style.Visibility
Enumerations
WdBuiltinStyle
Example XML
<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<w:styles
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
mc:Ignorable="w14"
>
<w:docDefaults>
<w:rPrDefault>
<w:rPr>
<w:rFonts w:asciiTheme="minorHAnsi" w:eastAsiaTheme="minorEastAsia"
w:hAnsiTheme="minorHAnsi" w:cstheme="minorBidi"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
<w:lang w:val="en-US" w:eastAsia="en-US" w:bidi="ar-SA"/>
</w:rPr>
</w:rPrDefault>
<w:pPrDefault/>
</w:docDefaults>
<w:latentStyles w:defLockedState="0" w:defUIPriority="99" w:defSemiHidden="1"
w:defUnhideWhenUsed="1" w:defQFormat="0" w:count="276">
<w:lsdException w:name="Normal" w:semiHidden="0" w:uiPriority="0"
w:unhideWhenUsed="0" w:qFormat="1"/>
<w:lsdException w:name="heading 1" w:semiHidden="0" w:uiPriority="9"
w:unhideWhenUsed="0" w:qFormat="1"/>
<w:lsdException w:name="heading 2" w:uiPriority="9" w:qFormat="1"/>
<w:lsdException w:name="Default Paragraph Font" w:uiPriority="1"/>
</w:latentStyles>
<w:style w:type="paragraph" w:default="1" w:styleId="Normal">
<w:name w:val="Normal"/>
<w:qFormat/>
</w:style>
<w:style w:type="character" w:default="1" w:styleId="DefaultParagraphFont">
<w:name w:val="Default Paragraph Font"/>
<w:uiPriority w:val="1"/>
<w:semiHidden/>
<w:unhideWhenUsed/>
</w:style>
<w:style w:type="table" w:default="1" w:styleId="TableNormal">
<w:name w:val="Normal Table"/>
<w:uiPriority w:val="99"/>
<w:semiHidden/>
<w:unhideWhenUsed/>
<w:tblPr>
<w:tblInd w:w="0" w:type="dxa"/>
<w:tblCellMar>
<w:top w:w="0" w:type="dxa"/>
<w:left w:w="108" w:type="dxa"/>
<w:bottom w:w="0" w:type="dxa"/>
<w:right w:w="108" w:type="dxa"/>
</w:tblCellMar>
</w:tblPr>
</w:style>
<w:style w:type="numbering" w:default="1" w:styleId="NoList">
<w:name w:val="No List"/>
<w:uiPriority w:val="99"/>
<w:semiHidden/>
<w:unhideWhenUsed/>
</w:style>
<w:style w:type="paragraph" w:customStyle="1" w:styleId="Foobar">
<w:name w:val="Foobar"/>
<w:qFormat/>
<w:rsid w:val="004B54E0"/>
</w:style>
</w:styles>
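As a rough illustration of how the styles part above can be inspected, the sketch below parses a cut-down copy of that XML with the standard library and reports each style's id, type, and whether it is a default or custom style. The function name `summarize_styles` is hypothetical, and comparing the on/off attributes to "1" is a simplification that only covers the form used in the example.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Cut-down copy of the example styles part (two styles only)
STYLES_XML = f"""
<w:styles xmlns:w="{W_NS}">
  <w:style w:type="paragraph" w:default="1" w:styleId="Normal">
    <w:name w:val="Normal"/>
    <w:qFormat/>
  </w:style>
  <w:style w:type="paragraph" w:customStyle="1" w:styleId="Foobar">
    <w:name w:val="Foobar"/>
    <w:qFormat/>
  </w:style>
</w:styles>
"""


def summarize_styles(styles_xml):
    """Return (styleId, type, is_default, is_custom) for each w:style."""
    root = ET.fromstring(styles_xml)
    rows = []
    for style in root.findall(f"{{{W_NS}}}style"):
        rows.append((
            style.get(f"{{{W_NS}}}styleId"),
            style.get(f"{{{W_NS}}}type"),
            # Simplification: only the "1" spelling of ST_OnOff is recognized
            style.get(f"{{{W_NS}}}default") == "1",
            style.get(f"{{{W_NS}}}customStyle") == "1",
        ))
    return rows


STYLE_SUMMARY = summarize_styles(STYLES_XML)
```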
Schema excerpt
<xsd:complexType name="CT_Styles">
<xsd:sequence>
<xsd:element name="docDefaults" type="CT_DocDefaults" minOccurs="0"/>
<xsd:element name="latentStyles" type="CT_LatentStyles" minOccurs="0"/>
<xsd:element name="style" type="CT_Style" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_DocDefaults">
<xsd:sequence>
<xsd:element name="rPrDefault" type="CT_RPrDefault" minOccurs="0"/>
<xsd:element name="pPrDefault" type="CT_PPrDefault" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_LatentStyles">
<xsd:sequence>
<xsd:element name="lsdException" type="CT_LsdException" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="defLockedState"    type="s:ST_OnOff"/>
<xsd:attribute name="defUIPriority"     type="ST_DecimalNumber"/>
<xsd:attribute name="defSemiHidden"     type="s:ST_OnOff"/>
<xsd:attribute name="defUnhideWhenUsed" type="s:ST_OnOff"/>
<xsd:attribute name="defQFormat"        type="s:ST_OnOff"/>
<xsd:attribute name="count"             type="ST_DecimalNumber"/>
</xsd:complexType>
<xsd:complexType name="CT_Style">
  <xsd:sequence>
    <xsd:element name="name"            type="CT_String"        minOccurs="0"/>
    <xsd:element name="aliases"         type="CT_String"        minOccurs="0"/>
    <xsd:element name="basedOn"         type="CT_String"        minOccurs="0"/>
    <xsd:element name="next"            type="CT_String"        minOccurs="0"/>
    <xsd:element name="link"            type="CT_String"        minOccurs="0"/>
    <xsd:element name="autoRedefine"    type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="hidden"          type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="uiPriority"      type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="semiHidden"      type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="unhideWhenUsed"  type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="qFormat"         type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="locked"          type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="personal"        type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="personalCompose" type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="personalReply"   type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="rsid"            type="CT_LongHexNumber" minOccurs="0"/>
    <xsd:element name="pPr"             type="CT_PPrGeneral"    minOccurs="0"/>
    <xsd:element name="rPr"             type="CT_RPr"           minOccurs="0"/>
    <xsd:element name="tblPr"           type="CT_TblPrBase"     minOccurs="0"/>
    <xsd:element name="trPr"            type="CT_TrPr"          minOccurs="0"/>
    <xsd:element name="tcPr"            type="CT_TcPr"          minOccurs="0"/>
    <xsd:element name="tblStylePr"      type="CT_TblStylePr"    minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
  <xsd:attribute name="type"        type="ST_StyleType"/>
  <xsd:attribute name="styleId"     type="s:ST_String"/>
  <xsd:attribute name="default"     type="s:ST_OnOff"/>
  <xsd:attribute name="customStyle" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_LsdException">
  <xsd:attribute name="name"           type="s:ST_String" use="required"/>
  <xsd:attribute name="locked"         type="s:ST_OnOff"/>
  <xsd:attribute name="uiPriority"     type="ST_DecimalNumber"/>
  <xsd:attribute name="semiHidden"     type="s:ST_OnOff"/>
  <xsd:attribute name="unhideWhenUsed" type="s:ST_OnOff"/>
  <xsd:attribute name="qFormat"        type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_OnOff">
<xsd:attribute name="val" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_String">
<xsd:attribute name="val" type="s:ST_String" use="required"/>
</xsd:complexType>
<xsd:simpleType name="ST_OnOff">
<xsd:union memberTypes="xsd:boolean ST_OnOff1"/>
</xsd:simpleType>
<xsd:simpleType name="ST_OnOff1">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="on"/>
<xsd:enumeration value="off"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_StyleType">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="paragraph"/>
<xsd:enumeration value="character"/>
<xsd:enumeration value="table"/>
<xsd:enumeration value="numbering"/>
</xsd:restriction>
</xsd:simpleType>
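The ST_OnOff union in the schema excerpt accepts both xsd:boolean spellings and the on/off pair, and elements of type CT_OnOff such as w:qFormat often appear with no w:val at all. A minimal sketch of reading such a value follows. The helper names are hypothetical, and two behaviors are assumptions of this sketch rather than statements from the analysis above: an element without w:val is read as true, and a missing element is read as false.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

_TRUE = {"1", "true", "on"}
_FALSE = {"0", "false", "off"}


def parse_on_off(value):
    """Convert an ST_OnOff lexical value to a bool."""
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"not an ST_OnOff value: {value!r}")


def on_off_child(parent, tag):
    """Read a CT_OnOff child element such as w:qFormat from `parent`.

    Assumptions of this sketch: a missing element means False and an
    element without a w:val attribute means True.
    """
    child = parent.find(f"{{{W_NS}}}{tag}")
    if child is None:
        return False
    val = child.get(f"{{{W_NS}}}val")
    return True if val is None else parse_on_off(val)
```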
Properties
15 properties are supported. All unicode values are limited to 255 characters (not bytes).
author (unicode) Note: named creator in spec. An entity primarily responsible for making the content of the
resource. (Dublin Core)
category (unicode) A categorization of the content of this package. Example values for this property might include:
Resume, Letter, Financial Forecast, Proposal, Technical Presentation, and so on. (Open Packaging Conventions)
comments (unicode) Note: named description in spec. An explanation of the content of the resource. Values might
include an abstract, table of contents, reference to a graphical representation of content, and a free-text account
of the content. (Dublin Core)
content_status (unicode) The status of the content. Values might include Draft, Reviewed, and Final. (Open
Packaging Conventions)
created (datetime) Date of creation of the resource. (Dublin Core)
identifier (unicode) An unambiguous reference to the resource within a given context. (Dublin Core)
keywords (unicode) A delimited set of keywords to support searching and indexing. This is typically a list of terms
that are not available elsewhere in the properties. (Open Packaging Conventions)
language (unicode) The language of the intellectual content of the resource. (Dublin Core)
last_modified_by (unicode) The user who performed the last modification. The identification is environment-specific. Examples include a name, email address, or employee ID. It is recommended that this value be as
concise as possible. (Open Packaging Conventions)
last_printed (datetime) The date and time of the last printing. (Open Packaging Conventions)
modified (datetime) Date on which the resource was changed. (Dublin Core)
revision (int) The revision number. This value might indicate the number of saves or revisions, provided the application updates it after each revision. (Open Packaging Conventions)
subject (unicode) The topic of the content of the resource. (Dublin Core)
title (unicode) The name given to the resource. (Dublin Core)
version (unicode) The version designator. This value is set by the user or by the application. (Open Packaging
Conventions)
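To make the naming notes above concrete (author is stored as creator, comments as description), here is a small sketch that reads a few text properties from a core-properties part using only the standard library. The sample XML, the `ELEMENT_FOR` mapping, and both helper functions are hypothetical; returning an empty string for a missing property is a choice made for this sketch.

```python
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Property name -> qualified element name (subset, for illustration)
ELEMENT_FOR = {
    "author": "dc:creator",
    "comments": "dc:description",
    "title": "dc:title",
    "category": "cp:category",
    "last_modified_by": "cp:lastModifiedBy",
}

# Hypothetical sample part, not taken from a real document
SAMPLE = f"""
<cp:coreProperties xmlns:cp="{NS['cp']}" xmlns:dc="{NS['dc']}">
  <dc:creator>Hypothetical Author</dc:creator>
  <dc:description>Draft notes</dc:description>
  <cp:category>Proposal</cp:category>
</cp:coreProperties>
"""


def read_text_property(root, prop_name):
    """Return the text of a core property, or '' when it is absent."""
    elm = root.find(ELEMENT_FOR[prop_name], NS)
    if elm is None or elm.text is None:
        return ""
    return elm.text


def check_length(value, limit=255):
    """Enforce the 255-character limit on text property values."""
    if len(value) > limit:
        raise ValueError(f"value exceeds {limit} characters")
    return value


ROOT = ET.fromstring(SAMPLE)
AUTHOR = read_text_property(ROOT, "author")
```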
Specimen XML
Schema Excerpt
<xs:schema
targetNamespace="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
xmlns="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/"
elementFormDefault="qualified"
blockDefault="#all">
<xs:import
namespace="http://purl.org/dc/elements/1.1/"
schemaLocation="http://dublincore.org/schemas/xmls/qdc/2003/04/02/dc.xsd"/>
<xs:import
namespace="http://purl.org/dc/terms/"
schemaLocation="http://dublincore.org/schemas/xmls/qdc/2003/04/02/dcterms.xsd"/>
<xs:import
id="xml"
namespace="http://www.w3.org/XML/1998/namespace"/>
<xs:element name="coreProperties" type="CT_CoreProperties"/>
<xs:complexType name="CT_CoreProperties">
  <xs:all>
    <xs:element name="category"       type="xs:string"   minOccurs="0"/>
    <xs:element name="contentStatus"  type="xs:string"   minOccurs="0"/>
    <xs:element ref="dcterms:created"                    minOccurs="0"/>
    <xs:element ref="dc:creator"                         minOccurs="0"/>
    <xs:element ref="dc:description"                     minOccurs="0"/>
    <xs:element ref="dc:identifier"                      minOccurs="0"/>
    <xs:element name="keywords"       type="CT_Keywords" minOccurs="0"/>
    <xs:element ref="dc:language"                        minOccurs="0"/>
    <xs:element name="lastModifiedBy" type="xs:string"   minOccurs="0"/>
    <xs:element name="lastPrinted"    type="xs:dateTime" minOccurs="0"/>
    <xs:element ref="dcterms:modified"                   minOccurs="0"/>
    <xs:element name="revision"       type="xs:string"   minOccurs="0"/>
    <xs:element ref="dc:subject"                         minOccurs="0"/>
    <xs:element ref="dc:title"                           minOccurs="0"/>
    <xs:element name="version"        type="xs:string"   minOccurs="0"/>
  </xs:all>
</xs:complexType>
Diagrams like the one below are used to depict tables in this analysis. Horizontal spans are depicted as a continuous
horizontal cell without vertical dividers within the span. Vertical spans are depicted as a vertical sequence of cells of
the same width where continuation cells are separated by a dashed top border and contain a caret (^) to symbolize
the continuation of the cell above. Cell addresses are depicted at the column and row grid lines. This is conceptually
convenient as it reuses the notion of list indices (and slices) and makes certain operations more intuitive to specify.
The merged cell A below has top, left, bottom, and right values of 0, 0, 2, and 2 respectively:
\ 0   1   2   3
0 +---+---+---+
  | A     |   |
1 + - - - +---+
  | ^     |   |
2 +---+---+---+
  |   |   |   |
3 +---+---+---+
True
>>> table.columns[1].cells[1] == middle_cell
True
table = document.add_table(3, 3)
a = table.cell(0, 0)
b = table.cell(1, 1)
A = a.merge(b)
\ 0   1   2   3
0 +---+---+---+        +---+---+---+
  | a |   |   |        | A     |   |
1 +---+---+---+        + - - - +---+
  |   | b |   |  -->   | ^     |   |
2 +---+---+---+        +---+---+---+
  |   |   |   |        |   |   |   |
3 +---+---+---+        +---+---+---+
A cell is accessed by its layout grid position regardless of any spans that may be present. A grid address that falls in
a span returns the top-leftmost cell in that span. This means a span has as many addresses as layout grid cells it spans.
For example, the merged cell A above can be addressed as (0, 0), (0, 1), (1, 0), or (1, 1). This addressing scheme leads
to desirable access behaviors when spans are present in the table.
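The addressing rule can be pictured with a small stand-alone model in which every layout grid address maps to the origin address of the cell occupying it. This is only an illustrative sketch; `build_grid` and its span tuples are hypothetical and are not how the library stores tables.

```python
def build_grid(n_rows, n_cols, spans):
    """Map each grid address to the origin address of its cell.

    `spans` is a list of (top, left, bottom, right) extents measured on
    the grid lines, so bottom and right are exclusive, like slice indices.
    """
    grid = {(r, c): (r, c) for r in range(n_rows) for c in range(n_cols)}
    for top, left, bottom, right in spans:
        for r in range(top, bottom):
            for c in range(left, right):
                # Every address inside the span resolves to its top-left cell
                grid[(r, c)] = (top, left)
    return grid


# 3 x 3 table whose top-left 2 x 2 region is merged into cell A
GRID = build_grid(3, 3, spans=[(0, 0, 2, 2)])
ADDRESSES_OF_A = sorted(addr for addr, origin in GRID.items() if origin == (0, 0))
ROW_0 = [GRID[(0, c)] for c in range(3)]
```

In this model a row always yields one entry per grid column, and the same origin simply appears more than once where a span is present.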
The length of Row.cells is always equal to the number of grid columns, regardless of any spans that are present.
Likewise, the length of Column.cells is always equal to the number of table rows, regardless of any spans.
>>> table = document.add_table(2, 3)
>>> row = table.rows[0]
>>> len(row.cells)
3
>>> row.cells[0] == row.cells[1]
False
>>> a, b = row.cells[:2]
>>> a.merge(b)
>>> len(row.cells)
3
>>> row.cells[0] == row.cells[1]
True
\ 0   1   2   3
0 +---+---+---+        +---+---+---+
  | a | b |   |        | A     |   |
1 +---+---+---+  -->   +---+---+---+
  |   |   |   |        |   |   |   |
2 +---+---+---+        +---+---+---+
When two or more cells are merged, any existing content is concatenated and placed in the resulting merged cell.
Content from each original cell is separated from that in the prior original cell by a paragraph mark. An original cell
having no content is skipped in the concatenation process. In Python, the procedure would look roughly like this:
merged_cell_text = '\n'.join(
    cell.text for cell in original_cells if cell.text
)
Merging four cells with content 'a', 'b', '', and 'd' respectively results in a merged cell having text 'a\nb\nd'.
Cell size behavior on merge
Cell width and height, if present, are added when cells are merged:
>>> a, b = row.cells[:2]
>>> a.width.inches, b.width.inches
(1.0, 1.0)
>>> A = a.merge(b)
>>> A.width.inches
2.0
Collapsing a column. When all cells in a grid column share the same w:gridSpan specification, the spanned
columns can be collapsed into a single column by removing the w:gridSpan elements.
Word behavior
Row and Column access in the MS API just plain breaks when the table is not uniform. Table.Rows(n) and
Cell.Row raise EnvironmentError when a table contains a vertical span, and Table.Columns(n) and Cell.Column
unconditionally raise EnvironmentError when the table contains a horizontal span. We can do better.
Table.Cell(n, m) works on any non-uniform table, although it uses a visual grid that greatly complicates access.
It raises an error for n or m out of visual range, and provides no way other than try/except to determine what that
visual range is, since Row.Count and Column.Count are unavailable.
In a merge operation, the text of the continuation cells is appended to that of the origin cell as separate paragraph(s).
If a merge range contains previously merged cells, the range must completely enclose the merged cells.
Word resizes a table (adds rows) when a cell is referenced by an out-of-bounds row index. If the column
identifier is out of bounds, an exception is raised. This behavior will not be implemented in python-docx.
Glossary
layout grid The regular two-dimensional matrix of rows and columns that determines the layout of cells in the table.
The grid is primarily defined by the w:gridCol elements that define the layout columns for the table. Each row
essentially duplicates that layout for an additional row, although its height can differ from other rows. Every
actual cell in the table must begin and end on a layout grid line, whether the cell is merged or not.
span The single combined cell occupying the area of a set of merged cells.
skipped cell The WordprocessingML (WML) spec allows for skipped cells, where a layout cell location contains
no actual cell. I can't find a way to make a table like this using the Word UI and haven't experimented yet to see
whether Word will load one constructed by hand in the XML.
uniform table A table in which each cell corresponds exactly to a layout cell. A uniform table contains no spans or
skipped cells.
non-uniform table A table that contains one or more spans, such that not every cell corresponds to a single layout
cell. I suppose it would apply when there was one or more skipped cells too, but in this analysis the term is only
used to indicate a table with one or more spans.
uniform cell A cell not part of a span, occupying a single cell in the layout grid.
origin cell The top-leftmost cell in a span. Contrast with continuation cell.
continuation cell A layout cell that has been subsumed into a span. A continuation cell is mostly an abstract concept,
although an actual w:tc element will always exist in the XML for each continuation cell in a vertical span.
Understanding merge XML intuitively
A key insight is that merged cells always look like the diagram below. Horizontal spans are accomplished with a
single w:tc element in each row, using the gridSpan element to span additional grid columns. Vertical spans are
accomplished with an identical cell in each continuation row, having the same gridSpan value, and having vMerge set
to continue (the default). These vertical continuation cells are depicted in the diagrams below with a dashed top border
and a caret (^) in the left-most grid column to symbolize the continuation of the cell above:
\ 0   1   2   3
0 +---+---+---+
  | A     |   |
1 + - - - +---+
  | ^     |   |
2 +---+---+---+
  |   |   |   |
3 +---+---+---+
The table depicted above corresponds to this XML (minimized for clarity):
<w:tbl>
<w:tblGrid>
<w:gridCol/>
<w:gridCol/>
<w:gridCol/>
</w:tblGrid>
<w:tr>
<w:tc>
<w:tcPr>
<w:gridSpan w:val="2"/>
<w:vMerge w:val="restart"/>
</w:tcPr>
</w:tc>
<w:tc/>
</w:tr>
<w:tr>
<w:tc>
<w:tcPr>
<w:gridSpan w:val="2"/>
<w:vMerge/>
</w:tcPr>
</w:tc>
<w:tc/>
</w:tr>
<w:tr>
<w:tc/>
<w:tc/>
<w:tc/>
</w:tr>
</w:tbl>
XML Semantics
In a horizontal merge, the w:gridSpan element (a child of the cell's w:tcPr element) indicates the number of grid
columns the cell should span. Only the leftmost cell is preserved; the remaining cells in the merge are deleted.
For merging vertically, the w:vMerge table cell property of the uppermost cell of the column is set to the value
restart of type w:ST_Merge. The lower cells included in the vertical merge must have the w:vMerge
element present in their cell property (w:tcPr) element. Its value should be set to continue, although it is not
necessary to define it explicitly, as it is the default value. A vertical merge ends as soon as a cell's w:tcPr element
lacks the w:vMerge element. As with the w:gridSpan element, the w:vMerge elements are only required
when the table's layout is not uniform across its different columns. When it is uniform, only the topmost cell is kept; the
other lower cells in the merged area are deleted along with their w:vMerge elements, and the w:trHeight table
row property is used to specify the combined height of the merged cells.
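The semantics described above can be read back from the XML with a few lines of standard-library code. The sketch below is illustrative only; `grid_span` and `v_merge` are hypothetical helper names, and the sample is the minimized 3 x 3 table with a merged 2 x 2 region (grid definition omitted).

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return f"{{{W_NS}}}{tag}"


def grid_span(tc):
    """Number of grid columns a w:tc occupies; 1 when w:gridSpan is absent."""
    elm = tc.find(f"{w('tcPr')}/{w('gridSpan')}")
    return 1 if elm is None else int(elm.get(w("val")))


def v_merge(tc):
    """Return None, 'restart' or 'continue' for a w:tc element."""
    elm = tc.find(f"{w('tcPr')}/{w('vMerge')}")
    if elm is None:
        return None
    # A w:vMerge element without w:val defaults to 'continue'
    return elm.get(w("val"), "continue")


TBL_XML = f"""
<w:tbl xmlns:w="{W_NS}">
  <w:tr>
    <w:tc><w:tcPr><w:gridSpan w:val="2"/><w:vMerge w:val="restart"/></w:tcPr></w:tc>
    <w:tc/>
  </w:tr>
  <w:tr>
    <w:tc><w:tcPr><w:gridSpan w:val="2"/><w:vMerge/></w:tcPr></w:tc>
    <w:tc/>
  </w:tr>
  <w:tr><w:tc/><w:tc/><w:tc/></w:tr>
</w:tbl>
"""

TBL = ET.fromstring(TBL_XML)
CELL_INFO = [
    [(grid_span(tc), v_merge(tc)) for tc in tr.findall(w("tc"))]
    for tr in TBL.findall(w("tr"))
]
```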
len() implementation for Row.cells and Column.cells
Each Row and Column object provides access to the collection of cells it contains. The length of these cell collections
is unaffected by the presence of merged cells.
len() always bases its count on the layout grid, as though there were no merged cells.
len(Table.columns) is the number of w:gridCol elements, representing the number of grid columns,
without regard to the presence of merged cells in the table.
len(Table.rows) is the number of w:tr elements, regardless of any merged cells that may be present in the
table.
len(Row.cells) is the number of grid columns, regardless of whether any cells in the row are merged.
len(Column.cells) is the number of rows in the table, regardless of whether any cells in the column are
merged.
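The counting rules above reduce to counting w:gridCol and w:tr elements. A minimal sketch with the standard library follows; the function names are hypothetical and the sample XML is the minimized merged table used earlier in this analysis.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return f"{{{W_NS}}}{tag}"


def column_count(tbl):
    """len(Table.columns) and len(Row.cells): number of w:gridCol elements."""
    return len(tbl.findall(f"{w('tblGrid')}/{w('gridCol')}"))


def row_count(tbl):
    """len(Table.rows) and len(Column.cells): number of w:tr elements."""
    return len(tbl.findall(w("tr")))


TBL_XML = f"""
<w:tbl xmlns:w="{W_NS}">
  <w:tblGrid><w:gridCol/><w:gridCol/><w:gridCol/></w:tblGrid>
  <w:tr>
    <w:tc><w:tcPr><w:gridSpan w:val="2"/><w:vMerge w:val="restart"/></w:tcPr></w:tc>
    <w:tc/>
  </w:tr>
  <w:tr>
    <w:tc><w:tcPr><w:gridSpan w:val="2"/><w:vMerge/></w:tcPr></w:tc>
    <w:tc/>
  </w:tr>
  <w:tr><w:tc/><w:tc/><w:tc/></w:tr>
</w:tbl>
"""

TBL = ET.fromstring(TBL_XML)
N_COLS = column_count(TBL)
N_ROWS = row_count(TBL)
# The first row holds only two w:tc elements, yet its cell count is N_COLS
TC_IN_FIRST_ROW = len(TBL.find(w("tr")).findall(w("tc")))
```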
Merging a cell already containing a span
One or both of the diagonal corner cells in a merge operation may itself be a merged cell, as long as the specified
region is rectangular.
For example:
\   0   1   2   3
  +---+---+---+---+        +---+---+---+---+
0 | a     | b |   |        | a\nb\nC   |   |
  + - - - +---+---+        + - - - - - +---+
1 | ^     | C |   |        | ^         |   |
  +---+---+---+---+  -->   +---+---+---+---+
2 |   |   |   |   |        |   |   |   |   |
  +---+---+---+---+        +---+---+---+---+
3 |   |   |   |   |        |   |   |   |   |
  +---+---+---+---+        +---+---+---+---+
or:
    0   1   2   3   4
  +---+---+---+---+---+        +---+---+---+---+---+
0 | a     | b | c |   |        | abcD          |   |
  + - - - +---+---+---+        + - - - - - - - +---+
1 | ^     | D     |   |        | ^             |   |
  +---+---+---+---+---+  -->   +---+---+---+---+---+
2 |   |   |   |   |   |        |   |   |   |   |   |
  +---+ - - - +---+---+        +---+---+---+---+---+
3 |   |   |   |   |   |        |   |   |   |   |   |
  +---+---+---+---+---+        +---+---+---+---+---+
  0   1   2   3   4
0 +---+---+---+---+
  |   |   |   |   |
1 +---+---+---+---+
  |   | a |   |   |
2 +---+---+---+---+
  | b         |   |
3 +---+---+---+---+
  |   |   |   |   |
4 +---+---+---+---+
a.merge(b)
General algorithm
find top-left and target width, height
for each tr in target height, tc.grow_right(target_width)
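The first step of the outline above, finding the top-left corner and the target width and height, can be sketched as follows. Cells are represented here by their (top, left, bottom, right) extents on the layout grid, as in the diagrams. The function name `merge_region` and this tuple representation are assumptions made for illustration.

```python
def merge_region(a, b):
    """Return (top, left, width, height) of the rectangle bounded by two cells.

    Each cell is given as (top, left, bottom, right) grid-line extents.
    """
    top = min(a[0], b[0])
    left = min(a[1], b[1])
    bottom = max(a[2], b[2])
    right = max(a[3], b[3])
    return top, left, right - left, bottom - top


# a is the cell at (0, 0) and b the cell at (1, 1) of a 3 x 3 table
TOP, LEFT, WIDTH, HEIGHT = merge_region((0, 0, 1, 1), (1, 1, 2, 2))

# Rows whose cell at column LEFT would be grown rightward to WIDTH columns
ROWS_TO_GROW = list(range(TOP, TOP + HEIGHT))
```

The sketch does not check that the region is rectangular with respect to existing spans; that validation would be a separate step.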
Specimen XML
A 3 x 3 table where an area defined by the 2 x 2 top-left cells has been merged, demonstrating the combined use of the
w:gridSpan as well as the w:vMerge elements, as produced by Word:
<w:tbl>
  <w:tblPr>
    <w:tblW w:w="0" w:type="auto" />
  </w:tblPr>
  <w:tblGrid>
    <w:gridCol w:w="3192" />
    <w:gridCol w:w="3192" />
    <w:gridCol w:w="3192" />
  </w:tblGrid>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="6384" w:type="dxa" />
        <w:gridSpan w:val="2" />
        <w:vMerge w:val="restart" />
      </w:tcPr>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="3192" w:type="dxa" />
      </w:tcPr>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="6384" w:type="dxa" />
        <w:gridSpan w:val="2" />
        <w:vMerge />
      </w:tcPr>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="3192" w:type="dxa" />
      </w:tcPr>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="3192" w:type="dxa" />
      </w:tcPr>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="3192" w:type="dxa" />
      </w:tcPr>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="3192" w:type="dxa" />
      </w:tcPr>
    </w:tc>
  </w:tr>
</w:tbl>
Schema excerpt
<xsd:complexType name="CT_Tc">  <!-- denormalized -->
  <xsd:sequence>
    <xsd:element name="tcPr" type="CT_TcPr" minOccurs="0"/>
    <xsd:choice minOccurs="1" maxOccurs="unbounded">
      <xsd:element name="p"                           type="CT_P"/>
      <xsd:element name="tbl"                         type="CT_Tbl"/>
      <xsd:element name="customXml"                   type="CT_CustomXmlBlock"/>
      <xsd:element name="sdt"                         type="CT_SdtBlock"/>
      <xsd:element name="proofErr"                    type="CT_ProofErr"/>
      <xsd:element name="permStart"                   type="CT_PermStart"/>
      <xsd:element name="permEnd"                     type="CT_Perm"/>
      <xsd:element name="ins"                         type="CT_RunTrackChange"/>
      <xsd:element name="del"                         type="CT_RunTrackChange"/>
      <xsd:element name="moveFrom"                    type="CT_RunTrackChange"/>
      <xsd:element name="moveTo"                      type="CT_RunTrackChange"/>
      <xsd:element ref="m:oMathPara"                  type="CT_OMathPara"/>
      <xsd:element ref="m:oMath"                      type="CT_OMath"/>
      <xsd:element name="bookmarkStart"               type="CT_Bookmark"/>
      <xsd:element name="bookmarkEnd"                 type="CT_MarkupRange"/>
      <xsd:element name="moveFromRangeStart"          type="CT_MoveBookmark"/>
      <xsd:element name="moveFromRangeEnd"            type="CT_MarkupRange"/>
      <xsd:element name="moveToRangeStart"            type="CT_MoveBookmark"/>
      <xsd:element name="moveToRangeEnd"              type="CT_MarkupRange"/>
      <xsd:element name="commentRangeStart"           type="CT_MarkupRange"/>
      <xsd:element name="commentRangeEnd"             type="CT_MarkupRange"/>
      <xsd:element name="customXmlInsRangeStart"      type="CT_TrackChange"/>
      <xsd:element name="customXmlInsRangeEnd"        type="CT_Markup"/>
      <xsd:element name="customXmlDelRangeStart"      type="CT_TrackChange"/>
      <xsd:element name="customXmlDelRangeEnd"        type="CT_Markup"/>
      <xsd:element name="customXmlMoveFromRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlMoveFromRangeEnd"   type="CT_Markup"/>
      <xsd:element name="customXmlMoveToRangeStart"   type="CT_TrackChange"/>
      <xsd:element name="customXmlMoveToRangeEnd"     type="CT_Markup"/>
      <xsd:element name="altChunk"                    type="CT_AltChunk"/>
    </xsd:choice>
  </xsd:sequence>
  <xsd:attribute name="id" type="s:ST_String" use="optional"/>
</xsd:complexType>
<xsd:complexType name="CT_TcPr">  <!-- denormalized -->
  <xsd:sequence>
    <xsd:element name="cnfStyle"      type="CT_Cnf"           minOccurs="0"/>
    <xsd:element name="tcW"           type="CT_TblWidth"      minOccurs="0"/>
    <xsd:element name="gridSpan"      type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="hMerge"        type="CT_HMerge"        minOccurs="0"/>
    <xsd:element name="vMerge"        type="CT_VMerge"        minOccurs="0"/>
    <xsd:element name="tcBorders"     type="CT_TcBorders"     minOccurs="0"/>
    <xsd:element name="shd"           type="CT_Shd"           minOccurs="0"/>
    <xsd:element name="noWrap"        type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="tcMar"         type="CT_TcMar"         minOccurs="0"/>
    <xsd:element name="textDirection" type="CT_TextDirection" minOccurs="0"/>
    <xsd:element name="tcFitText"     type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="vAlign"        type="CT_VerticalJc"    minOccurs="0"/>
    <xsd:element name="hideMark"      type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="headers"       type="CT_Headers"       minOccurs="0"/>
    <xsd:choice minOccurs="0">
      <xsd:element name="cellIns"   type="CT_TrackChange"/>
      <xsd:element name="cellDel"   type="CT_TrackChange"/>
      <xsd:element name="cellMerge" type="CT_CellMergeTrackChange"/>
    </xsd:choice>
    <xsd:element name="tcPrChange"    type="CT_TcPrChange"    minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_DecimalNumber">
<xsd:attribute name="val" type="ST_DecimalNumber" use="required"/>
</xsd:complexType>
<xsd:simpleType name="ST_DecimalNumber">
<xsd:restriction base="xsd:integer"/>
</xsd:simpleType>
<xsd:complexType name="CT_VMerge">
<xsd:attribute name="val" type="ST_Merge"/>
</xsd:complexType>
<xsd:complexType name="CT_HMerge">
<xsd:attribute name="val" type="ST_Merge"/>
</xsd:complexType>
<xsd:simpleType name="ST_Merge">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="continue"/>
<xsd:enumeration value="restart"/>
</xsd:restriction>
</xsd:simpleType>
Open Issues
Does Word allow skipped cells at the beginning of a row (w:gridBefore element)? These are described in the
spec, but I don't see a way in the Word UI to create such a table.
Resources
  </w:tblPr>
  <w:tblGrid>
    <w:gridCol w:w="4788"/>
    <w:gridCol w:w="4788"/>
  </w:tblGrid>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <w:tcW w:type="dxa" w:w="4788"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:type="dxa" w:w="4788"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <w:tcW w:type="dxa" w:w="4788"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:type="dxa" w:w="4788"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
  </w:tr>
</w:tbl>
Schema Definitions
<xsd:complexType name="CT_Tbl">  <!-- denormalized -->
  <xsd:sequence>
    <xsd:group ref="EG_RangeMarkupElements" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="tblPr"   type="CT_TblPr"/>
    <xsd:element name="tblGrid" type="CT_TblGrid"/>
    <xsd:choice minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="tr"        type="CT_Row"/>
      <xsd:element name="customXml" type="CT_CustomXmlRow"/>
      <xsd:element name="sdt"       type="CT_SdtRow"/>
      <xsd:group ref="EG_RunLevelElts" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:choice>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_TblPr">
  <xsd:sequence>
    <xsd:element name="tblStyle"            type="CT_String"        minOccurs="0"/>
    <xsd:element name="tblpPr"              type="CT_TblPPr"        minOccurs="0"/>
    <xsd:element name="tblOverlap"          type="CT_TblOverlap"    minOccurs="0"/>
    <xsd:element name="bidiVisual"          type="CT_OnOff"         minOccurs="0"/>
    <xsd:element name="tblStyleRowBandSize" type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="tblStyleColBandSize" type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="tblW"                type="CT_TblWidth"      minOccurs="0"/>
    <xsd:element name="jc"                  type="CT_JcTable"       minOccurs="0"/>
    <xsd:element name="tblCellSpacing"      type="CT_TblWidth"      minOccurs="0"/>
    <xsd:element name="tblInd"              type="CT_TblWidth"      minOccurs="0"/>
    <xsd:element name="tblBorders"          type="CT_TblBorders"    minOccurs="0"/>
    <xsd:element name="shd"                 type="CT_Shd"           minOccurs="0"/>
    <xsd:element name="tblLayout"           type="CT_TblLayoutType" minOccurs="0"/>
    <xsd:element name="tblCellMar"          type="CT_TblCellMar"    minOccurs="0"/>
    <xsd:element name="tblLook"             type="CT_TblLook"       minOccurs="0"/>
    <xsd:element name="tblCaption"          type="CT_String"        minOccurs="0"/>
    <xsd:element name="tblDescription"      type="CT_String"        minOccurs="0"/>
    <xsd:element name="tblPrChange"         type="CT_TblPrChange"   minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
    <xsd:element name="moveTo"                      type="CT_RunTrackChange"/>
    <xsd:element ref="m:oMathPara"                  type="CT_OMathPara"/>
    <xsd:element ref="m:oMath"                      type="CT_OMath"/>
    <xsd:element name="bookmarkStart"               type="CT_Bookmark"/>
    <xsd:element name="bookmarkEnd"                 type="CT_MarkupRange"/>
    <xsd:element name="moveFromRangeStart"          type="CT_MoveBookmark"/>
    <xsd:element name="moveFromRangeEnd"            type="CT_MarkupRange"/>
    <xsd:element name="moveToRangeStart"            type="CT_MoveBookmark"/>
    <xsd:element name="moveToRangeEnd"              type="CT_MarkupRange"/>
    <xsd:element name="commentRangeStart"           type="CT_MarkupRange"/>
    <xsd:element name="commentRangeEnd"             type="CT_MarkupRange"/>
    <xsd:element name="customXmlInsRangeStart"      type="CT_TrackChange"/>
    <xsd:element name="customXmlInsRangeEnd"        type="CT_Markup"/>
    <xsd:element name="customXmlDelRangeStart"      type="CT_TrackChange"/>
    <xsd:element name="customXmlDelRangeEnd"        type="CT_Markup"/>
    <xsd:element name="customXmlMoveFromRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlMoveFromRangeEnd"   type="CT_Markup"/>
    <xsd:element name="customXmlMoveToRangeStart"   type="CT_TrackChange"/>
    <xsd:element name="customXmlMoveToRangeEnd"     type="CT_Markup"/>
  </xsd:choice>
</xsd:group>
<xsd:group name="EG_RangeMarkupElements">
  <xsd:choice>
    <xsd:element name="bookmarkStart"               type="CT_Bookmark"/>
    <xsd:element name="bookmarkEnd"                 type="CT_MarkupRange"/>
    <xsd:element name="moveFromRangeStart"          type="CT_MoveBookmark"/>
    <xsd:element name="moveFromRangeEnd"            type="CT_MarkupRange"/>
    <xsd:element name="moveToRangeStart"            type="CT_MoveBookmark"/>
    <xsd:element name="moveToRangeEnd"              type="CT_MarkupRange"/>
    <xsd:element name="commentRangeStart"           type="CT_MarkupRange"/>
    <xsd:element name="commentRangeEnd"             type="CT_MarkupRange"/>
    <xsd:element name="customXmlInsRangeStart"      type="CT_TrackChange"/>
    <xsd:element name="customXmlInsRangeEnd"        type="CT_Markup"/>
    <xsd:element name="customXmlDelRangeStart"      type="CT_TrackChange"/>
    <xsd:element name="customXmlDelRangeEnd"        type="CT_Markup"/>
    <xsd:element name="customXmlMoveFromRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlMoveFromRangeEnd"   type="CT_Markup"/>
    <xsd:element name="customXmlMoveToRangeStart"   type="CT_TrackChange"/>
    <xsd:element name="customXmlMoveToRangeEnd"     type="CT_Markup"/>
  </xsd:choice>
</xsd:group>
<xsd:simpleType name="ST_TwipsMeasure">
<xsd:union memberTypes="ST_UnsignedDecimalNumber ST_PositiveUniversalMeasure"/>
</xsd:simpleType>
<xsd:simpleType name="ST_UnsignedDecimalNumber">
<xsd:restriction base="xsd:unsignedLong"/>
</xsd:simpleType>
<xsd:simpleType name="ST_PositiveUniversalMeasure">
<xsd:restriction base="ST_UniversalMeasure">
<xsd:pattern value="[0-9]+(\.[0-9]+)?(mm|cm|in|pt|pc|pi)"/>
</xsd:restriction>
</xsd:simpleType>
Resources
Word allows a table to be aligned between the page margins either left, right, or center.
The read/write Table.alignment property specifies the alignment for a table:
>>> table = document.add_table(rows=2, cols=2)
>>> table.alignment
None
>>> table.alignment = WD_TABLE_ALIGNMENT.RIGHT
>>> table.alignment
RIGHT (2)
Autofit
Word has two algorithms for laying out a table, fixed-width or autofit. The default is autofit. Word will adjust column
widths in an autofit table based on cell contents. A fixed-width table retains its column widths regardless of the
contents. Either algorithm will adjust column widths proportionately when total table width exceeds page width.
The read/write Table.allow_autofit property specifies which algorithm is used:
>>> table = document.add_table(rows=2, cols=2)
>>> table.allow_autofit
True
>>> table.allow_autofit = False
>>> table.allow_autofit
False
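At the XML level, the layout algorithm corresponds to the w:tblLayout element listed in the schema definitions, whose type attribute is either fixed or autofit. The sketch below shows one way the property could be read with the standard library. It is not the library's implementation: the function name is hypothetical, and treating a missing element or attribute as autofit is an assumption based on the statement that autofit is the default.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return f"{{{W_NS}}}{tag}"


def allow_autofit(tbl):
    """True unless w:tblPr/w:tblLayout explicitly specifies 'fixed'."""
    layout = tbl.find(f"{w('tblPr')}/{w('tblLayout')}")
    if layout is None:
        # Assumption: no w:tblLayout element means the autofit default applies
        return True
    return layout.get(w("type"), "autofit") != "fixed"


FIXED_TBL = ET.fromstring(
    f'<w:tbl xmlns:w="{W_NS}"><w:tblPr><w:tblLayout w:type="fixed"/></w:tblPr></w:tbl>'
)
DEFAULT_TBL = ET.fromstring(f'<w:tbl xmlns:w="{W_NS}"><w:tblPr/></w:tbl>')
```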
Specimen XML
        <w:tcW w:type="dxa" w:w="4788"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:type="dxa" w:w="4788"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <w:tcW w:type="dxa" w:w="4788"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:type="dxa" w:w="4788"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
  </w:tr>
</w:tbl>
Layout behavior
Auto-layout causes actual column widths to be both unpredictable and unstable. Changes to the content can make the
table layout shift.
Semantics of CT_TblWidth element
e.g. tcW:
<w:tcW w:w="42.4mm"/>
<w:tcW w:w="1800" w:type="dxa"/>
<w:tcW w:w="20%" w:type="pct"/>
<w:tcW w:w="0" w:type="auto"/>
<w:tcW w:type="nil"/>
ST_MeasurementOrPercent
 |
 +-- ST_DecimalNumberOrPercent
 |    |
 |    +-- ST_UnqualifiedPercentage
 |    |    |
 |    |    +-- XsdInteger e.g. '1440'
 |    |
 |    +-- ST_Percentage e.g. '-07.43%'
 |
 +-- ST_UniversalMeasure e.g. '-04.34mm'
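The type hierarchy above can be exercised with a small classifier that decides which member of the union a given w:w value belongs to. This is a sketch only: the function name is hypothetical, the percentage and measure patterns are the ones given in the schema definitions, and the integer pattern is a simplification of xsd:integer.

```python
import re

# Patterns for ST_Percentage and ST_UniversalMeasure as given in the schema
_PERCENT = re.compile(r"-?[0-9]+(\.[0-9]+)?%")
_MEASURE = re.compile(r"-?[0-9]+(\.[0-9]+)?(mm|cm|in|pt|pc|pi)")
# Simplified stand-in for xsd:integer
_INTEGER = re.compile(r"[+-]?[0-9]+")


def classify_width(value):
    """Name the simple type that a w:w attribute value matches."""
    if _INTEGER.fullmatch(value):
        return "ST_UnqualifiedPercentage"
    if _PERCENT.fullmatch(value):
        return "ST_Percentage"
    if _MEASURE.fullmatch(value):
        return "ST_UniversalMeasure"
    raise ValueError(f"not an ST_MeasurementOrPercent value: {value!r}")


KINDS = [classify_width(v) for v in ("1440", "-07.43%", "-04.34mm")]
```

How a bare integer is interpreted (for example as twentieths of a point or as fiftieths of a percent) depends on the accompanying w:type attribute, which this sketch does not examine.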
Schema Definitions
<xsd:complexType name="CT_Tbl">  <!-- denormalized -->
  <xsd:sequence>
    <xsd:group ref="EG_RangeMarkupElements" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="tblPr"   type="CT_TblPr"/>
    <xsd:element name="tblGrid" type="CT_TblGrid"/>
    <xsd:choice minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="tr"        type="CT_Row"/>
      <xsd:element name="customXml" type="CT_CustomXmlRow"/>
      <xsd:element name="sdt"       type="CT_SdtRow"/>
      <xsd:group ref="EG_RunLevelElts" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:choice>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_TblPr">  <!-- denormalized -->
  <xsd:sequence>
    <xsd:element name="tblStyle" type="CT_String" minOccurs="0"/>
    <xsd:element name="tblpPr" type="CT_TblPPr" minOccurs="0"/>
    <xsd:element name="tblOverlap" type="CT_TblOverlap" minOccurs="0"/>
    <xsd:element name="bidiVisual" type="CT_OnOff" minOccurs="0"/>
    <xsd:element name="tblStyleRowBandSize" type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="tblStyleColBandSize" type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="tblW" type="CT_TblWidth" minOccurs="0"/>
    <xsd:element name="jc" type="CT_JcTable" minOccurs="0"/>
    <xsd:element name="tblCellSpacing" type="CT_TblWidth" minOccurs="0"/>
    <xsd:element name="tblInd" type="CT_TblWidth" minOccurs="0"/>
    <xsd:element name="tblBorders" type="CT_TblBorders" minOccurs="0"/>
    <xsd:element name="shd" type="CT_Shd" minOccurs="0"/>
    <xsd:element name="tblLayout" type="CT_TblLayoutType" minOccurs="0"/>
    <xsd:element name="tblCellMar" type="CT_TblCellMar" minOccurs="0"/>
    <xsd:element name="tblLook" type="CT_TblLook" minOccurs="0"/>
    <xsd:element name="tblCaption" type="CT_String" minOccurs="0"/>
    <xsd:element name="tblDescription" type="CT_String" minOccurs="0"/>
    <xsd:element name="tblPrChange" type="CT_TblPrChange" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
</xsd:simpleType>
<!-- table width ------------------------------------- -->
<xsd:complexType name="CT_TblWidth">
<xsd:attribute name="w" type="ST_MeasurementOrPercent"/>
<xsd:attribute name="type" type="ST_TblWidth"/>
</xsd:complexType>
<xsd:simpleType name="ST_MeasurementOrPercent">
<xsd:union memberTypes="ST_DecimalNumberOrPercent s:ST_UniversalMeasure"/>
</xsd:simpleType>
<xsd:simpleType name="ST_DecimalNumberOrPercent">
<xsd:union memberTypes="ST_UnqualifiedPercentage s:ST_Percentage"/>
</xsd:simpleType>
<xsd:simpleType name="ST_UniversalMeasure">
<xsd:restriction base="xsd:string">
<xsd:pattern value="-?[0-9]+(\.[0-9]+)?(mm|cm|in|pt|pc|pi)"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_UnqualifiedPercentage">
<xsd:restriction base="xsd:integer"/>
</xsd:simpleType>
<xsd:simpleType name="ST_Percentage">
<xsd:restriction base="xsd:string">
<xsd:pattern value="-?[0-9]+(\.[0-9]+)?%"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_TblWidth">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="nil"/>
<xsd:enumeration value="pct"/>
<xsd:enumeration value="dxa"/>
<xsd:enumeration value="auto"/>
</xsd:restriction>
</xsd:simpleType>
<!-- table layout ------------------------------------ -->
<xsd:complexType name="CT_TblLayoutType">
<xsd:attribute name="type" type="ST_TblLayoutType"/>
</xsd:complexType>
<xsd:simpleType name="ST_TblLayoutType">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="fixed"/>
<xsd:enumeration value="autofit"/>
</xsd:restriction>
</xsd:simpleType>
<!-- table look -------------------------------------- -->
<xsd:complexType name="CT_TblLook">
  <xsd:attribute name="firstRow" type="s:ST_OnOff"/>
  <xsd:attribute name="lastRow" type="s:ST_OnOff"/>
  <xsd:attribute name="firstColumn" type="s:ST_OnOff"/>
  <xsd:attribute name="lastColumn" type="s:ST_OnOff"/>
  <xsd:attribute name="noHBand" type="s:ST_OnOff"/>
  <xsd:attribute name="noVBand" type="s:ST_OnOff"/>
  <xsd:attribute name="val" type="ST_ShortHexNumber"/>
</xsd:complexType>
Table Cell
All content in a table is contained in a cell. A cell also has several properties affecting its size, appearance, and how
the content it contains is formatted.
MS API - Partial Summary
Merge()
Split()
Borders
BottomPadding (and Left, Right, Top)
Column
ColumnIndex
FitText
Height
HeightRule (one of WdRowHeightRule enumeration)
Preferred Width
Row
RowIndex
Shading
Tables
VerticalAlignment
Width
WordWrap
Specimen XML
<w:tc>
<w:tcPr>
<w:tcW w:w="7038" w:type="dxa"/>
</w:tcPr>
<w:p>
<w:pPr>
<w:pStyle w:val="ListBullet"/>
</w:pPr>
<w:r>
<w:t>Amy earned her BA in American Studies</w:t>
</w:r>
</w:p>
</w:tc>
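As an illustration of reading a cell's width from XML like the specimen above, the following sketch uses only the standard library rather than python-docx. The specimen is abbreviated (the paragraph style is omitted), `cell_width_inches` is a hypothetical helper, and the conversion assumes `dxa` values are twentieths of a point (1440 per inch).

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
TWIPS_PER_INCH = 1440  # dxa units are twentieths of a point, 72 points per inch

# Abbreviated form of the specimen cell, with the namespace declared locally.
TC_XML = (
    '<w:tc xmlns:w="%s">'
    '<w:tcPr><w:tcW w:w="7038" w:type="dxa"/></w:tcPr>'
    "<w:p><w:r><w:t>Amy earned her BA in American Studies</w:t></w:r></w:p>"
    "</w:tc>" % W_NS
)


def cell_width_inches(tc):
    """Return the width of a w:tc element in inches, or None if not a dxa width."""
    tcW = tc.find("{%s}tcPr/{%s}tcW" % (W_NS, W_NS))
    if tcW is None or tcW.get("{%s}type" % W_NS) != "dxa":
        return None
    return int(tcW.get("{%s}w" % W_NS)) / TWIPS_PER_INCH


tc = ET.fromstring(TC_XML)
width = cell_width_inches(tc)
text = "".join(t.text for t in tc.iter("{%s}t" % W_NS))
print(width, text)
```

Widths of type `pct`, `auto`, or `nil` are deliberately left unconverted in this sketch, since they do not describe an absolute length.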
Schema Definitions
<xsd:complexType name="CT_Tc">  <!-- denormalized -->
  <xsd:sequence>
    <xsd:element name="tcPr" type="CT_TcPr" minOccurs="0"/>
    <xsd:choice minOccurs="1" maxOccurs="unbounded">
      <xsd:element name="p" type="CT_P"/>
      <xsd:element name="tbl" type="CT_Tbl"/>
      <xsd:element name="customXml" type="CT_CustomXmlBlock"/>
      <xsd:element name="sdt" type="CT_SdtBlock"/>
      <xsd:element name="proofErr" type="CT_ProofErr"/>
      <xsd:element name="permStart" type="CT_PermStart"/>
      <xsd:element name="permEnd" type="CT_Perm"/>
      <xsd:element name="ins" type="CT_RunTrackChange"/>
      <xsd:element name="del" type="CT_RunTrackChange"/>
      <xsd:element name="moveFrom" type="CT_RunTrackChange"/>
      <xsd:element name="moveTo" type="CT_RunTrackChange"/>
      <xsd:element ref="m:oMathPara" type="CT_OMathPara"/>
      <xsd:element ref="m:oMath" type="CT_OMath"/>
      <xsd:element name="bookmarkStart" type="CT_Bookmark"/>
      <xsd:element name="bookmarkEnd" type="CT_MarkupRange"/>
      <xsd:element name="moveFromRangeStart" type="CT_MoveBookmark"/>
      <xsd:element name="moveFromRangeEnd" type="CT_MarkupRange"/>
      <xsd:element name="moveToRangeStart" type="CT_MoveBookmark"/>
      <xsd:element name="moveToRangeEnd" type="CT_MarkupRange"/>
      <xsd:element name="commentRangeStart" type="CT_MarkupRange"/>
      <xsd:element name="commentRangeEnd" type="CT_MarkupRange"/>
      <xsd:element name="customXmlInsRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlInsRangeEnd" type="CT_Markup"/>
      <xsd:element name="customXmlDelRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlDelRangeEnd" type="CT_Markup"/>
      <xsd:element name="customXmlMoveFromRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlMoveFromRangeEnd" type="CT_Markup"/>
      <xsd:element name="customXmlMoveToRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlMoveToRangeEnd" type="CT_Markup"/>
      <xsd:element name="altChunk" type="CT_AltChunk"/>
    </xsd:choice>
  </xsd:sequence>
  <xsd:attribute name="id" type="s:ST_String" use="optional"/>
</xsd:complexType>
<xsd:complexType name="CT_TcPr">
  <xsd:sequence>
    <xsd:element name="cnfStyle" type="CT_Cnf" minOccurs="0"/>
    <xsd:element name="tcW" type="CT_TblWidth" minOccurs="0"/>
    <xsd:element name="gridSpan" type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="hMerge" type="CT_HMerge" minOccurs="0"/>
    <xsd:element name="vMerge" type="CT_VMerge" minOccurs="0"/>
    <xsd:element name="tcBorders" type="CT_TcBorders" minOccurs="0"/>
    <xsd:element name="shd" type="CT_Shd" minOccurs="0"/>
    <xsd:element name="noWrap" type="CT_OnOff" minOccurs="0"/>
    <xsd:element name="tcMar" type="CT_TcMar" minOccurs="0"/>
    <xsd:element name="textDirection" type="CT_TextDirection" minOccurs="0"/>
    <xsd:element name="tcFitText" type="CT_OnOff" minOccurs="0"/>
    <xsd:element name="vAlign" type="CT_VerticalJc" minOccurs="0"/>
    <xsd:element name="hideMark" type="CT_OnOff" minOccurs="0"/>
    <xsd:element name="headers" type="CT_Headers" minOccurs="0"/>
    <xsd:choice minOccurs="0">
      <xsd:element name="cellIns" type="CT_TrackChange"/>
      <xsd:element name="cellDel" type="CT_TrackChange"/>
      <xsd:element name="cellMerge" type="CT_CellMergeTrackChange"/>
    </xsd:choice>
    <xsd:element name="tcPrChange" type="CT_TcPrChange" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_TblWidth">
<xsd:attribute name="w" type="ST_MeasurementOrPercent"/>
<xsd:attribute name="type" type="ST_TblWidth"/>
</xsd:complexType>
<xsd:simpleType name="ST_MeasurementOrPercent">
<xsd:union memberTypes="ST_DecimalNumberOrPercent s:ST_UniversalMeasure"/>
</xsd:simpleType>
<xsd:simpleType name="ST_DecimalNumberOrPercent">
<xsd:union memberTypes="ST_UnqualifiedPercentage s:ST_Percentage"/>
</xsd:simpleType>
<xsd:simpleType name="ST_UniversalMeasure">
<xsd:restriction base="xsd:string">
<xsd:pattern value="-?[0-9]+(\.[0-9]+)?(mm|cm|in|pt|pc|pi)"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_UnqualifiedPercentage">
<xsd:restriction base="xsd:integer"/>
</xsd:simpleType>
<xsd:simpleType name="ST_Percentage">
<xsd:restriction base="xsd:string">
<xsd:pattern value="-?[0-9]+(\.[0-9]+)?%"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ST_TblWidth">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="nil"/>
<xsd:enumeration value="pct"/>
<xsd:enumeration value="dxa"/>
<xsd:enumeration value="auto"/>
</xsd:restriction>
</xsd:simpleType>
Numbering Part
... having to do with numbering sequences for ordered lists, etc. ...
Schema excerpt
<xsd:complexType name="CT_Numbering">
  <xsd:sequence>
    <xsd:element name="numPicBullet" type="CT_NumPicBullet" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="abstractNum" type="CT_AbstractNum" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="num" type="CT_Num" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="numIdMacAtCleanup" type="CT_DecimalNumber" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_Num">
  <xsd:sequence>
    <xsd:element name="abstractNumId" type="CT_DecimalNumber"/>
    <xsd:element name="lvlOverride" type="CT_NumLvl" minOccurs="0" maxOccurs="9"/>
  </xsd:sequence>
  <xsd:attribute name="numId" type="ST_DecimalNumber" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_NumLvl">
  <xsd:sequence>
    <xsd:element name="startOverride" type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="lvl" type="CT_Lvl" minOccurs="0"/>
  </xsd:sequence>
  <xsd:attribute name="ilvl" type="ST_DecimalNumber" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_NumPr">
  <xsd:sequence>
    <xsd:element name="ilvl" type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="numId" type="CT_DecimalNumber" minOccurs="0"/>
    <xsd:element name="numberingChange" type="CT_TrackChangeNumbering" minOccurs="0"/>
    <xsd:element name="ins" type="CT_TrackChange" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_DecimalNumber">
<xsd:attribute name="val" type="ST_DecimalNumber" use="required"/>
</xsd:complexType>
<xsd:simpleType name="ST_DecimalNumber">
<xsd:restriction base="xsd:integer"/>
</xsd:simpleType>
Sections
Word supports the notion of a section, a span of content having distinct page layout settings. This is how, for example,
a document can contain some pages in portrait layout and others in landscape. Section breaks are implemented
completely differently from line, page, and column breaks. A section break adds a <w:pPr><w:sectPr> element to the
last paragraph of the section it ends, whereas a line, page, or column break inserts a <w:br> element in a run.
The last section in a document is specified by a <w:sectPr> element appearing as the last child of the <w:body>
element. While this element is optional, it appears that Word creates it for all files. Since most files have only a single
section, the most common case is where this is the only <w:sectPr> element.
Additional sections are specified by a w:p/w:pPr/w:sectPr element in the last paragraph of the section. Any
content in that paragraph is part of the section defined by its <w:sectPr> element. The subsequent section begins
with the following paragraph.
A paragraph containing a section break (<w:sectPr> element) does not produce a paragraph mark (¶) glyph in the
Word UI.
The section break indicator/double-line appears directly after the text of the paragraph in which the <w:sectPr>
appears. If the section break paragraph has no text, the indicator line appears immediately after the paragraph
mark of the prior paragraph.
Before and after analysis
Baseline document containing two paragraphs:
<w:body>
<w:p>
<w:r>
<w:t>Paragraph 1</w:t>
</w:r>
</w:p>
<w:p>
<w:r>
<w:t>Paragraph 2</w:t>
</w:r>
</w:p>
<w:sectPr>
<w:pgSz w:w="12240" w:h="15840"/>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" w:gutter="0"/>
<w:cols w:space="720"/>
<w:docGrid w:linePitch="360"/>
</w:sectPr>
</w:body>
</w:r>
</w:p>
<w:p/>
<w:p>
<w:r>
<w:t>Paragraph 2</w:t>
</w:r>
</w:p>
<w:sectPr w:rsidR="00F039D0" w:rsidSect="006006E7">
<w:type w:val="oddPage"/>
<w:pgSz w:w="12240" w:h="15840"/>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" w:gutter="0"/>
<w:cols w:space="720"/>
<w:docGrid w:linePitch="360"/>
</w:sectPr>
</w:body>
The UI shows an empty paragraph mark in the first position of the new next page. The section break indicator appears
directly after the Paragraph 1 text, with no intervening paragraph mark.
Even-page section break inserted before first character in Paragraph 2:
<w:body>
<w:p>
<w:r>
<w:t>Paragraph 1</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:sectPr>
<w:type w:val="oddPage"/>
<w:pgSz w:w="12240" w:h="15840"/>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" w:gutter="0"/>
<w:cols w:space="720"/>
<w:docGrid w:linePitch="360"/>
</w:sectPr>
</w:pPr>
</w:p>
<w:p>
<w:r>
<w:lastRenderedPageBreak/>
<w:t>Paragraph 2</w:t>
</w:r>
</w:p>
<w:sectPr>
<w:type w:val="evenPage"/>
<w:pgSz w:w="12240" w:h="15840"/>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" w:gutter="0"/>
<w:cols w:space="720"/>
<w:docGrid w:linePitch="360"/>
</w:sectPr>
</w:body>
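The rule described in this section (each non-final section is defined by a `w:p/w:pPr/w:sectPr` element, and the final section by the `<w:sectPr>` that is the last child of `<w:body>`) can be sketched with the standard library. This is only an illustration under those assumptions, not python-docx code: `qn` and `section_types` are hypothetical helpers, and the XML is a trimmed version of the specimen above.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Trimmed version of the two-section specimen: page size and margins omitted.
BODY_XML = """
<w:body xmlns:w="%s">
  <w:p><w:r><w:t>Paragraph 1</w:t></w:r></w:p>
  <w:p><w:pPr><w:sectPr><w:type w:val="oddPage"/></w:sectPr></w:pPr></w:p>
  <w:p><w:r><w:t>Paragraph 2</w:t></w:r></w:p>
  <w:sectPr><w:type w:val="evenPage"/></w:sectPr>
</w:body>
""" % W


def qn(tag):
    """Expand a 'w:'-prefixed tag name to Clark notation (hypothetical helper)."""
    return "{%s}%s" % (W, tag.split(":")[1])


def section_types(body):
    """Return the w:type value of each section in document order.

    None is reported for a section whose w:sectPr has no w:type child.
    """
    # Sections other than the last: w:sectPr inside a paragraph's w:pPr.
    sect_prs = body.findall("%s/%s/%s" % (qn("w:p"), qn("w:pPr"), qn("w:sectPr")))
    # Last section: w:sectPr as a direct child of w:body, when present.
    last = body.find(qn("w:sectPr"))
    if last is not None:
        sect_prs.append(last)
    types = []
    for sect_pr in sect_prs:
        type_elm = sect_pr.find(qn("w:type"))
        types.append(None if type_elm is None else type_elm.get(qn("w:val")))
    return types


body = ET.fromstring(BODY_XML.strip())
print(section_types(body))
```

Applied to the baseline document, which has only the body-level `<w:sectPr>` and no `<w:type>` child, the same function would report a single section with type None.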
Enumerations
    <xsd:element name="sectPrChange" type="CT_SectPrChange" minOccurs="0"/>
  </xsd:sequence>
  <xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidDel" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidR" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidSect" type="ST_LongHexNumber"/>
</xsd:complexType>
<xsd:complexType name="CT_HdrFtrRef">
<xsd:attribute ref="r:id" use="required"/>
<xsd:attribute name="type" type="ST_HdrFtr" use="required"/>
</xsd:complexType>
<xsd:simpleType name="ST_HdrFtr">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="even"/>
<xsd:enumeration value="default"/>
<xsd:enumeration value="first"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="CT_SectType">
<xsd:attribute name="val" type="ST_SectionMark"/>
</xsd:complexType>
<xsd:simpleType name="ST_SectionMark">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="nextPage"/>
<xsd:enumeration value="nextColumn"/>
<xsd:enumeration value="continuous"/>
<xsd:enumeration value="evenPage"/>
<xsd:enumeration value="oddPage"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="CT_PageSz">
  <xsd:attribute name="w" type="s:ST_TwipsMeasure"/>
  <xsd:attribute name="h" type="s:ST_TwipsMeasure"/>
  <xsd:attribute name="orient" type="ST_PageOrientation"/>
  <xsd:attribute name="code" type="ST_DecimalNumber"/>
</xsd:complexType>
<xsd:simpleType name="ST_PageOrientation">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="portrait"/>
<xsd:enumeration value="landscape"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="CT_PageMar">
  <xsd:attribute name="top" type="ST_SignedTwipsMeasure" use="required"/>
  <xsd:attribute name="right" type="s:ST_TwipsMeasure" use="required"/>
  <xsd:attribute name="bottom" type="ST_SignedTwipsMeasure" use="required"/>
  <xsd:attribute name="left" type="s:ST_TwipsMeasure" use="required"/>
  <xsd:attribute name="header" type="s:ST_TwipsMeasure" use="required"/>
  <xsd:attribute name="footer" type="s:ST_TwipsMeasure" use="required"/>
  <xsd:attribute name="gutter" type="s:ST_TwipsMeasure" use="required"/>
</xsd:complexType>
<xsd:simpleType name="ST_SignedTwipsMeasure">
<xsd:union memberTypes="xsd:integer s:ST_UniversalMeasure"/>
</xsd:simpleType>
<xsd:complexType name="CT_Columns">
  <xsd:sequence minOccurs="0">
    <xsd:element name="col" type="CT_Column" maxOccurs="45"/>
  </xsd:sequence>
  <xsd:attribute name="equalWidth" type="s:ST_OnOff"/>
  <xsd:attribute name="space" type="s:ST_TwipsMeasure"/>
  <xsd:attribute name="num" type="ST_DecimalNumber"/>
  <xsd:attribute name="sep" type="s:ST_OnOff"/>
</xsd:complexType>
<xsd:complexType name="CT_Column">
  <xsd:attribute name="w" type="s:ST_TwipsMeasure"/>
  <xsd:attribute name="space" type="s:ST_TwipsMeasure"/>
</xsd:complexType>
A graphical object that appears in a Word document is known as a shape. A shape can be inline or floating. An inline
shape appears on a text baseline as though it were a character glyph and affects the line height. A floating shape
appears at an arbitrary location on the document and text may wrap around it. Several types of shape can be placed,
including a picture, a chart, and a drawing canvas.
The graphical object itself is placed in a container, and it is the container that determines the placement of the graphic.
The same graphical object can be placed inline or floating by changing its container. The graphic itself is unaffected.
MS API
Access to shapes is provided by the Shapes and InlineShapes properties on the Document object.
The API for a floating shape overlaps that for an inline shape, but there are substantial differences. The following
properties are some of those common to both:
Fill
Glow
HasChart
HasSmartArt
Height
Shadow
Hyperlink
PictureFormat (providing brightness, color, crop, transparency, contrast)
Type (Chart, LockedCanvas, Picture, SmartArt, etc.)
Width
Resources
Word allows a graphical object to be placed into a document as an inline object. An inline shape appears as a
<w:drawing> element as a child of a <w:r> element.
Candidate protocol: inline shape access
The following interactive session illustrates the protocol for accessing an inline shape:
>>> inline_shapes = document.body.inline_shapes
>>> inline_shape = inline_shapes[0]
>>> assert inline_shape.type == MSO_SHAPE_TYPE.PICTURE
Resources
The Shapes and InlineShapes properties on Document hold references to things like pictures in the MS API.
Height and Width
Borders
Shadow
Hyperlink
PictureFormat (providing brightness, color, crop, transparency, contrast)
ScaleHeight and ScaleWidth
HasChart
HasSmartArt
Type (Chart, LockedCanvas, Picture, SmartArt, etc.)
Spec references
This XML represents my best guess of the minimal inline shape container that Word will load:
<w:r>
<w:drawing>
<wp:inline>
<wp:extent cx="914400" cy="914400"/>
<wp:docPr id="1" name="Picture 1"/>
<a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<!-- might not have to put anything here for a start -->
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing>
</w:r>
Specimen XML
A CT_Drawing (<w:drawing>) element can appear in a run, as a peer of, for example, a <w:t> element. This
element contains a DrawingML object. WordprocessingML drawings are discussed in section 20.4 of the ISO/IEC
spec.
This XML represents an inline shape inserted in a paragraph by itself. The particulars of the graphical object
itself are redacted:
<w:p>
<w:r>
<w:rPr>
<w:noProof/>
</w:rPr>
<w:drawing>
<wp:inline distT="0" distB="0" distL="0" distR="0" wp14:anchorId="1BDE1558" wp14:editId="31E593
<wp:extent cx="859536" cy="343814"/>
<wp:effectExtent l="0" t="0" r="4445" b="12065"/>
<wp:docPr id="1" name="Picture 1"/>
<wp:cNvGraphicFramePr>
<a:graphicFrameLocks xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" noChan
</wp:cNvGraphicFramePr>
<a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<!-- graphical object, such as pic:pic, goes here -->
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing>
</w:r>
</w:p>
Schema definitions
<xsd:complexType name="CT_Drawing">
<xsd:choice minOccurs="1" maxOccurs="unbounded">
<xsd:element ref="wp:anchor" minOccurs="0"/>
<xsd:element ref="wp:inline" minOccurs="0"/>
</xsd:choice>
</xsd:complexType>
<xsd:complexType name="CT_Inline">
  <xsd:sequence>
    <xsd:element name="extent" type="a:CT_PositiveSize2D"/>
    <xsd:element name="effectExtent" type="CT_EffectExtent" minOccurs="0"/>
    <xsd:element name="docPr" type="a:CT_NonVisualDrawingProps"/>
    <xsd:element name="cNvGraphicFramePr" type="a:CT_NonVisualGraphicFrameProperties" minOccurs="0"/>
    <xsd:element name="graphic" type="CT_GraphicalObject"/>
  </xsd:sequence>
  <xsd:attribute name="distT" type="ST_WrapDistance"/>
  <xsd:attribute name="distB" type="ST_WrapDistance"/>
  <xsd:attribute name="distL" type="ST_WrapDistance"/>
  <xsd:attribute name="distR" type="ST_WrapDistance"/>
</xsd:complexType>
<xsd:complexType name="CT_PositiveSize2D">
<xsd:attribute name="cx" type="ST_PositiveCoordinate" use="required"/>
<xsd:attribute name="cy" type="ST_PositiveCoordinate" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_EffectExtent">
  <xsd:attribute name="l" type="a:ST_Coordinate" use="required"/>
  <xsd:attribute name="t" type="a:ST_Coordinate" use="required"/>
  <xsd:attribute name="r" type="a:ST_Coordinate" use="required"/>
  <xsd:attribute name="b" type="a:ST_Coordinate" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_NonVisualDrawingProps">
  <xsd:sequence>
    <xsd:element name="hlinkClick" type="CT_Hyperlink" minOccurs="0"/>
    <xsd:element name="hlinkHover" type="CT_Hyperlink" minOccurs="0"/>
    <xsd:element name="extLst" type="CT_OfficeArtExtensionList" minOccurs="0"/>
  </xsd:sequence>
  <xsd:attribute name="id" type="ST_DrawingElementId" use="required"/>
  <xsd:attribute name="name" type="xsd:string" use="required"/>
  <xsd:attribute name="descr" type="xsd:string" default=""/>
  <xsd:attribute name="hidden" type="xsd:boolean" default="false"/>
  <xsd:attribute name="title" type="xsd:string" default=""/>
</xsd:complexType>
<xsd:complexType name="CT_NonVisualGraphicFrameProperties">
  <xsd:sequence>
    <xsd:element name="graphicFrameLocks" type="CT_GraphicalObjectFrameLocking" minOccurs="0"/>
    <xsd:element name="extLst" type="CT_OfficeArtExtensionList" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_GraphicalObject">
<xsd:sequence>
<xsd:element name="graphicData" type="CT_GraphicalObjectData"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_GraphicalObjectData">
<xsd:sequence>
<xsd:any minOccurs="0" maxOccurs="unbounded" processContents="strict"/>
</xsd:sequence>
<xsd:attribute name="uri" type="xsd:token" use="required"/>
</xsd:complexType>
The position of an inline shape is completely determined by the text it is inline with; however, its dimensions can be
specified. For some shape types, both the contained shape and the shape container specify a width and height. In the
case of a picture, the dimensions of the inline shape (container) determine the display size, while the dimensions of the
pic element determine the original size of the image.
Candidate protocol: inline shape access
The following interactive session illustrates the protocol for accessing and changing the size of an inline shape:
>>> inline_shape = inline_shapes[0]
>>> assert inline_shape.type == MSO_SHAPE_TYPE.PICTURE
>>> inline_shape.width
914400
>>> inline_shape.height
457200
>>> inline_shape.width = 457200
>>> inline_shape.height = 228600
>>> inline_shape.width, inline_shape.height
(457200, 228600)
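The widths and heights in the session above are in English Metric Units (EMU), the unit used by the `cx` and `cy` attributes of `<wp:extent>`; there are 914400 EMU to the inch. The conversion can be sketched as follows. The function names are hypothetical and are not part of the candidate protocol.

```python
EMU_PER_INCH = 914400
EMU_PER_CM = 360000


def emu_to_inches(emu):
    """Convert a length in EMU to inches."""
    return emu / EMU_PER_INCH


def inches_to_emu(inches):
    """Convert a length in inches to the nearest whole number of EMU."""
    return int(round(inches * EMU_PER_INCH))


# The shape in the session starts at 1 in x 0.5 in and is halved in each dimension.
print(emu_to_inches(914400), emu_to_inches(457200))
print(inches_to_emu(0.5), inches_to_emu(0.25))
```

Keeping EMU as plain integers in the XML and converting only at the API boundary avoids accumulating rounding error when a size is read and written back.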
Resources
<w:p>
<w:r>
<w:rPr>
<w:noProof/>
</w:rPr>
<w:drawing>
<wp:inline distT="0" distB="0" distL="0" distR="0" wp14:anchorId="1BDE1558" wp14:editId="31E593
<wp:extent cx="859536" cy="343814"/>
<wp:effectExtent l="0" t="0" r="4445" b="12065"/>
<wp:docPr id="1" name="Picture 1"/>
<wp:cNvGraphicFramePr>
<a:graphicFrameLocks xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" noChan
</wp:cNvGraphicFramePr>
<a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:nvPicPr>
<pic:cNvPr id="1" name="python-powered.png"/>
<pic:cNvPicPr/>
</pic:nvPicPr>
<pic:blipFill>
<a:blip r:embed="rId7">
<a:alphaModFix/>
<a:extLst>
<a:ext uri="{28A0092B-C50C-407E-A947-70E740481C1C}">
<a14:useLocalDpi xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/ma
</a:ext>
</a:extLst>
</a:blip>
<a:stretch>
<a:fillRect/>
</a:stretch>
</pic:blipFill>
<pic:spPr>
<a:xfrm>
<a:off x="0" y="0"/>
<a:ext cx="859536" cy="343814"/>
</a:xfrm>
<a:prstGeom prst="rect">
<a:avLst/>
</a:prstGeom>
</pic:spPr>
</pic:pic>
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing>
</w:r>
</w:p>
Picture
Overview
Word allows a picture to be placed in a graphical object container, either an inline shape or a floating shape.
Candidate protocol
>>> run = paragraph.add_run()
>>> inline_shape = run.add_inline_picture(file_like_image, MIME_type=None)
>>> inline_shape.width = width
>>> inline_shape.height = height
Minimal XML
This XML represents the working hypothesis of the minimum XML that must be inserted to add a working picture to
a document:
<pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:nvPicPr>
<pic:cNvPr id="1" name="python-powered.png"/>
<pic:cNvPicPr/>
</pic:nvPicPr>
<pic:blipFill>
<a:blip r:embed="rId7"/>
<a:stretch>
<a:fillRect/>
</a:stretch>
</pic:blipFill>
<pic:spPr>
<a:xfrm>
<a:off x="0" y="0"/>
<a:ext cx="859536" cy="343814"/>
</a:xfrm>
<a:prstGeom prst="rect"/>
</pic:spPr>
</pic:pic>
Required parameters:
unique DrawingML object id (document-wide, pretty sure it's just the part)
name, either filename or generic if file-like object.
rId for rel to image part
size (cx, cy)
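The four required parameters can be seen at work by filling them into the minimal XML above. This is a sketch under stated assumptions, not the python-docx implementation: `new_pic_xml` is a hypothetical function, and the `a:` and `r:` namespace declarations are added to the root element only so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
PIC_NS = "http://schemas.openxmlformats.org/drawingml/2006/picture"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Template mirroring the minimal pic:pic XML, one placeholder per parameter.
PIC_TMPL = (
    '<pic:pic xmlns:pic="{pic}" xmlns:a="{a}" xmlns:r="{r}">'
    "<pic:nvPicPr>"
    '<pic:cNvPr id="{id}" name="{name}"/>'
    "<pic:cNvPicPr/>"
    "</pic:nvPicPr>"
    "<pic:blipFill>"
    '<a:blip r:embed="{rId}"/>'
    "<a:stretch><a:fillRect/></a:stretch>"
    "</pic:blipFill>"
    "<pic:spPr>"
    '<a:xfrm><a:off x="0" y="0"/><a:ext cx="{cx}" cy="{cy}"/></a:xfrm>'
    '<a:prstGeom prst="rect"/>'
    "</pic:spPr>"
    "</pic:pic>"
)


def new_pic_xml(pic_id, name, rId, cx, cy):
    """Return minimal pic:pic XML for the given id, name, rId, and size in EMU."""
    quote = {'"': "&quot;"}  # also escape double quotes inside attribute values
    return PIC_TMPL.format(
        pic=PIC_NS, a=A_NS, r=R_NS,
        id=int(pic_id), name=escape(name, quote), rId=escape(rId, quote),
        cx=int(cx), cy=int(cy),
    )


pic = ET.fromstring(new_pic_xml(1, "python-powered.png", "rId7", 859536, 343814))
ext = pic.find(".//{%s}ext" % A_NS)
print(ext.get("cx"), ext.get("cy"))
```

Parsing the result back, as in the last lines, is a cheap check that the generated fragment is well-formed before it is inserted into a `<a:graphicData>` element.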
Specimen XML
Schema definitions
<xsd:element name="pic" type="CT_Picture"/>
<xsd:complexType name="CT_Picture">
  <xsd:sequence>
    <xsd:element name="nvPicPr" type="CT_PictureNonVisual"/>
    <xsd:element name="blipFill" type="a:CT_BlipFillProperties"/>
    <xsd:element name="spPr" type="a:CT_ShapeProperties"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_PictureNonVisual">
  <xsd:sequence>
    <xsd:element name="cNvPr" type="a:CT_NonVisualDrawingProps"/>
    <xsd:element name="cNvPicPr" type="a:CT_NonVisualPictureProperties"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_BlipFillProperties">
  <xsd:sequence>
    <xsd:element name="blip" type="CT_Blip" minOccurs="0"/>
    <xsd:element name="srcRect" type="CT_RelativeRect" minOccurs="0"/>
    <xsd:choice minOccurs="0">
      <xsd:element name="tile" type="CT_TileInfoProperties"/>
      <xsd:element name="stretch" type="CT_StretchInfoProperties"/>
    </xsd:choice>
  </xsd:sequence>
  <xsd:attribute name="dpi" type="xsd:unsignedInt"/>
  <xsd:attribute name="rotWithShape" type="xsd:boolean"/>
</xsd:complexType>
<xsd:complexType name="CT_ShapeProperties">
  <xsd:sequence>
    <xsd:element name="xfrm" type="CT_Transform2D" minOccurs="0"/>
    <xsd:group ref="EG_Geometry" minOccurs="0"/>
    <xsd:group ref="EG_FillProperties" minOccurs="0"/>
    <xsd:element name="ln" type="CT_LineProperties" minOccurs="0"/>
    <xsd:group ref="EG_EffectProperties" minOccurs="0"/>
    <xsd:element name="scene3d" type="CT_Scene3D" minOccurs="0"/>
    <xsd:element name="sp3d" type="CT_Shape3D" minOccurs="0"/>
    <xsd:element name="extLst" type="CT_OfficeArtExtensionList" minOccurs="0"/>
  </xsd:sequence>
  <xsd:attribute name="bwMode" type="ST_BlackWhiteMode"/>
</xsd:complexType>
<xsd:element name="extLst" type="CT_OfficeArtExtensionList" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="preferRelativeResize" type="xsd:boolean" default="true"/>
</xsd:complexType>
<xsd:complexType name="CT_Point2D">
<xsd:attribute name="x" type="ST_Coordinate" use="required"/>
<xsd:attribute name="y" type="ST_Coordinate" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_PositiveSize2D">
<xsd:attribute name="cx" type="ST_PositiveCoordinate" use="required"/>
<xsd:attribute name="cy" type="ST_PositiveCoordinate" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_PresetGeometry2D">
<xsd:sequence>
<xsd:element name="avLst" type="CT_GeomGuideList" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="prst" type="ST_ShapeType" use="required"/>
</xsd:complexType>
<xsd:complexType name="CT_RelativeRect">
  <xsd:attribute name="l" type="ST_Percentage" default="0%"/>
  <xsd:attribute name="t" type="ST_Percentage" default="0%"/>
  <xsd:attribute name="r" type="ST_Percentage" default="0%"/>
  <xsd:attribute name="b" type="ST_Percentage" default="0%"/>
</xsd:complexType>
<xsd:complexType name="CT_StretchInfoProperties">
<xsd:sequence>
<xsd:element name="fillRect" type="CT_RelativeRect" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="CT_Transform2D">
  <xsd:sequence>
    <xsd:element name="off" type="CT_Point2D" minOccurs="0"/>
    <xsd:element name="ext" type="CT_PositiveSize2D" minOccurs="0"/>
  </xsd:sequence>
  <xsd:attribute name="rot" type="ST_Angle" default="0"/>
  <xsd:attribute name="flipH" type="xsd:boolean" default="false"/>
  <xsd:attribute name="flipV" type="xsd:boolean" default="false"/>
</xsd:complexType>
<xsd:group name="EG_FillModeProperties">
<xsd:choice>
<xsd:element name="tile" type="CT_TileInfoProperties"/>
<xsd:element name="stretch" type="CT_StretchInfoProperties"/>
</xsd:choice>
</xsd:group>
<xsd:group name="EG_Geometry">
<xsd:choice>
<xsd:element name="custGeom" type="CT_CustomGeometry2D"/>
<xsd:element name="prstGeom" type="CT_PresetGeometry2D"/>
</xsd:choice>
</xsd:group>
Schema Name     CT_Document
Spec Name       Document
Tag(s)          w:document
Namespace       wordprocessingml (wml.xsd)
Spec Section    17.2.3
Spec text
This element specifies the contents of a main document part in a WordprocessingML document.
Consider the basic structure of the main document part in a basic WordprocessingML document, as follows:
<w:document>
<w:body>
<w:p/>
</w:body>
</w:document>
All of the contents of the main document part are contained beneath the document element.
Schema excerpt
<xsd:complexType name="CT_Document">
  <xsd:sequence>
    <xsd:element name="background" type="CT_Background" minOccurs="0"/>
    <xsd:element name="body" type="CT_Body" minOccurs="0" maxOccurs="1"/>
  </xsd:sequence>
  <xsd:attribute name="conformance" type="s:ST_ConformanceClass"/>
</xsd:complexType>
<xsd:simpleType name="ST_ConformanceClass">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="strict"/>
<xsd:enumeration value="transitional"/>
</xsd:restriction>
</xsd:simpleType>
<xsd:complexType name="CT_Background">
<xsd:sequence>
<xsd:sequence maxOccurs="unbounded">
<xsd:any processContents="lax" namespace="urn:schemas-microsoft-com:vml"
minOccurs="0" maxOccurs="unbounded"/>
<xsd:any processContents="lax" namespace="urn:schemas-microsoft-com:office:office"
minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:element name="drawing" type="CT_Drawing" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="color" type="ST_HexColor" use="optional"/>
CT_Body
Schema Name     CT_Body
Spec Name       Document Body
Tag(s)          w:body
Namespace       wordprocessingml (wml.xsd)
Spec Section    17.2.2
Spec text
This element specifies the contents of the body of the document, that is, the main document editing surface.
The document body contains what is referred to as block-level markup: markup which can exist as a
sibling element to paragraphs in a WordprocessingML document.
Example: Consider a document with a single paragraph in the main document story. This document would
require the following WordprocessingML in its main document part:
<w:document>
<w:body>
<w:p/>
</w:body>
</w:document>
The fact that the paragraph is inside the body element makes it part of the main document story.
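To make the idea of block-level content concrete, the sketch below lists the paragraph and table children of the body in document order using only the standard library. It is an illustration, not python-docx code: `block_items` is a hypothetical helper, and the empty `<w:tbl/>` is a placeholder that would not be schema-valid in a real document.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Placeholder document: two paragraphs around a table, then the section properties.
DOCUMENT_XML = (
    '<w:document xmlns:w="%s">'
    "<w:body><w:p/><w:tbl/><w:p/><w:sectPr/></w:body>"
    "</w:document>" % W
)


def block_items(document):
    """Return the local names of w:p and w:tbl children of w:body, in order."""
    body = document.find("{%s}body" % W)
    if body is None:  # w:body is optional (minOccurs="0") in CT_Document
        return []
    wanted = {"{%s}p" % W, "{%s}tbl" % W}
    return [child.tag.split("}")[1] for child in body if child.tag in wanted]


document = ET.fromstring(DOCUMENT_XML)
print(block_items(document))
```

The trailing `<w:sectPr>` is skipped because it describes the last section rather than contributing content to the main document story.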
Schema excerpt
<xsd:complexType name="CT_Body">
  <xsd:sequence>
    <xsd:choice minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="p" type="CT_P"/>
      <xsd:element name="tbl" type="CT_Tbl"/>
      <xsd:element name="customXml" type="CT_CustomXmlBlock"/>
      <xsd:element name="sdt" type="CT_SdtBlock"/>
      <xsd:element name="proofErr" type="CT_ProofErr"/>
      <xsd:element name="permStart" type="CT_PermStart"/>
      <xsd:element name="permEnd" type="CT_Perm"/>
      <xsd:element name="ins" type="CT_RunTrackChange"/>
      <xsd:element name="del" type="CT_RunTrackChange"/>
      <xsd:element name="moveFrom" type="CT_RunTrackChange"/>
      <xsd:element name="moveTo" type="CT_RunTrackChange"/>
      <xsd:element ref="m:oMathPara" type="CT_OMathPara"/>
      <xsd:element ref="m:oMath" type="CT_OMath"/>
      <xsd:element name="bookmarkStart" type="CT_Bookmark"/>
      <xsd:element name="bookmarkEnd" type="CT_MarkupRange"/>
      <xsd:element name="moveFromRangeStart" type="CT_MoveBookmark"/>
      <xsd:element name="moveFromRangeEnd" type="CT_MarkupRange"/>
      <xsd:element name="moveToRangeStart" type="CT_MoveBookmark"/>
      <xsd:element name="moveToRangeEnd" type="CT_MarkupRange"/>
      <xsd:element name="commentRangeStart" type="CT_MarkupRange"/>
      <xsd:element name="commentRangeEnd" type="CT_MarkupRange"/>
      <xsd:element name="customXmlInsRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlInsRangeEnd" type="CT_Markup"/>
      <xsd:element name="customXmlDelRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlDelRangeEnd" type="CT_Markup"/>
      <xsd:element name="customXmlMoveFromRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlMoveFromRangeEnd" type="CT_Markup"/>
      <xsd:element name="customXmlMoveToRangeStart" type="CT_TrackChange"/>
      <xsd:element name="customXmlMoveToRangeEnd" type="CT_Markup"/>
      <xsd:element name="altChunk" type="CT_AltChunk"/>
    </xsd:choice>
    <xsd:element name="sectPr" type="CT_SectPr" minOccurs="0" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="CT_Body">
  <xsd:sequence>
    <xsd:group ref="EG_BlockLevelElts" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="sectPr" type="CT_SectPr" minOccurs="0" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="CT_SectPr">
  <xsd:sequence>
    <xsd:group ref="EG_HdrFtrReferences" minOccurs="0" maxOccurs="6"/>
    <xsd:group ref="EG_SectPrContents" minOccurs="0"/>
    <xsd:element name="sectPrChange" type="CT_SectPrChange" minOccurs="0"/>
  </xsd:sequence>
  <xsd:attributeGroup ref="AG_SectPrAttributes"/>
</xsd:complexType>

<xsd:group name="EG_BlockLevelElts">
  <xsd:choice>
    <xsd:group ref="EG_BlockLevelChunkElts"/>
    <xsd:element name="altChunk" type="CT_AltChunk"/>
  </xsd:choice>
</xsd:group>

<xsd:group name="EG_BlockLevelChunkElts">
  <xsd:choice>
    <xsd:group ref="EG_ContentBlockContent"/>
  </xsd:choice>
</xsd:group>

<xsd:group name="EG_ContentBlockContent">
  <xsd:choice>
    <xsd:element name="customXml" type="CT_CustomXmlBlock"/>
    <xsd:element name="sdt" type="CT_SdtBlock"/>
    <xsd:element name="p" type="CT_P"/>
    <xsd:element name="tbl" type="CT_Tbl"/>
    <xsd:group ref="EG_RunLevelElts"/>
  </xsd:choice>
</xsd:group>

<xsd:group name="EG_RunLevelElts">
  <xsd:choice>
    <xsd:element name="proofErr" type="CT_ProofErr"/>
    <xsd:element name="permStart" type="CT_PermStart"/>
    <xsd:element name="permEnd" type="CT_Perm"/>
    <xsd:element name="ins" type="CT_RunTrackChange"/>
    <xsd:element name="del" type="CT_RunTrackChange"/>
    <xsd:element name="moveFrom" type="CT_RunTrackChange"/>
    <xsd:element name="moveTo" type="CT_RunTrackChange"/>
    <xsd:group ref="EG_MathContent"/>
    <xsd:group ref="EG_RangeMarkupElements"/>
  </xsd:choice>
</xsd:group>

<xsd:group name="EG_MathContent">
  <xsd:choice>
    <xsd:element ref="m:oMathPara" type="CT_OMathPara"/>
    <xsd:element ref="m:oMath" type="CT_OMath"/>
  </xsd:choice>
</xsd:group>

<xsd:group name="EG_RangeMarkupElements">
  <xsd:choice>
    <xsd:element name="bookmarkStart" type="CT_Bookmark"/>
    <xsd:element name="bookmarkEnd" type="CT_MarkupRange"/>
    <xsd:element name="moveFromRangeStart" type="CT_MoveBookmark"/>
    <xsd:element name="moveFromRangeEnd" type="CT_MarkupRange"/>
    <xsd:element name="moveToRangeStart" type="CT_MoveBookmark"/>
    <xsd:element name="moveToRangeEnd" type="CT_MarkupRange"/>
    <xsd:element name="commentRangeStart" type="CT_MarkupRange"/>
    <xsd:element name="commentRangeEnd" type="CT_MarkupRange"/>
    <xsd:element name="customXmlInsRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlInsRangeEnd" type="CT_Markup"/>
    <xsd:element name="customXmlDelRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlDelRangeEnd" type="CT_Markup"/>
    <xsd:element name="customXmlMoveFromRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlMoveFromRangeEnd" type="CT_Markup"/>
    <xsd:element name="customXmlMoveToRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlMoveToRangeEnd" type="CT_Markup"/>
  </xsd:choice>
</xsd:group>
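The CT_Body content model above amounts to "any number of block-level elements, then at most one trailing w:sectPr". The following is a minimal sketch of a check for the sectPr part of that rule, using only the standard library. The function names `w()`, `body_from()` and `sectPr_placement_ok()` are hypothetical and are not part of python-docx, and the check deliberately ignores whether the other children belong to the block-level choice group.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(local):
    """Hypothetical helper: Clark-notation tag name in the w: namespace."""
    return "{%s}%s" % (W_NS, local)


def body_from(inner_xml):
    """Wrap `inner_xml` in a w:body element and parse it (test convenience)."""
    return ET.fromstring('<w:body xmlns:w="%s">%s</w:body>' % (W_NS, inner_xml))


def sectPr_placement_ok(body):
    """Return True if `body` has at most one w:sectPr child and it is last.

    This mirrors the xsd:sequence in CT_Body: the repeating choice comes
    first, then an optional sectPr (minOccurs="0", maxOccurs="1").
    """
    children = list(body)
    positions = [i for i, child in enumerate(children) if child.tag == w("sectPr")]
    if not positions:
        return True  # sectPr is optional
    return len(positions) == 1 and positions[0] == len(children) - 1


print(sectPr_placement_ok(body_from("<w:p/><w:tbl/><w:sectPr/>")))
print(sectPr_placement_ok(body_from("<w:sectPr/><w:p/>")))
```

A full validation would need the schema itself; this sketch only illustrates why code that appends block-level content has to insert it before any existing `w:sectPr` rather than at the very end of the body.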
CT_P
Spec Name: Paragraph
Tag(s): w:p
Namespace: wordprocessingml (wml.xsd)
Spec Section: 17.3.1.22
Schema excerpt
<xsd:complexType name="CT_P">
  <xsd:sequence>
    <xsd:element name="pPr" type="CT_PPr" minOccurs="0"/>
    <xsd:group ref="EG_PContent" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
  <xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidR" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidDel" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidP" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidRDefault" type="ST_LongHexNumber"/>
</xsd:complexType>

<xsd:group name="EG_PContent"> <!-- denormalized -->
  <xsd:choice>
    <xsd:element name="r" type="CT_R"/>
    <xsd:element name="hyperlink" type="CT_Hyperlink"/>
    <xsd:element name="fldSimple" type="CT_SimpleField"/>
    <xsd:element name="sdt" type="CT_SdtRun"/>
    <xsd:element name="customXml" type="CT_CustomXmlRun"/>
    <xsd:element name="smartTag" type="CT_SmartTagRun"/>
    <xsd:element name="dir" type="CT_DirContentRun"/>
    <xsd:element name="bdo" type="CT_BdoContentRun"/>
    <xsd:element name="subDoc" type="CT_Rel"/>
    <xsd:group ref="EG_RunLevelElts"/>
  </xsd:choice>
</xsd:group>

<xsd:group name="EG_RunLevelElts">
  <xsd:choice>
    <xsd:element name="proofErr" type="CT_ProofErr"/>
    <xsd:element name="permStart" type="CT_PermStart"/>
    <xsd:element name="permEnd" type="CT_Perm"/>
    <xsd:element name="bookmarkStart" type="CT_Bookmark"/>
    <xsd:element name="bookmarkEnd" type="CT_MarkupRange"/>
    <xsd:element name="moveFromRangeStart" type="CT_MoveBookmark"/>
    <xsd:element name="moveFromRangeEnd" type="CT_MarkupRange"/>
    <xsd:element name="moveToRangeStart" type="CT_MoveBookmark"/>
    <xsd:element name="moveToRangeEnd" type="CT_MarkupRange"/>
    <xsd:element name="commentRangeStart" type="CT_MarkupRange"/>
    <xsd:element name="commentRangeEnd" type="CT_MarkupRange"/>
    <xsd:element name="customXmlInsRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlInsRangeEnd" type="CT_Markup"/>
    <xsd:element name="customXmlDelRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlDelRangeEnd" type="CT_Markup"/>
    <xsd:element name="customXmlMoveFromRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlMoveFromRangeEnd" type="CT_Markup"/>
    <xsd:element name="customXmlMoveToRangeStart" type="CT_TrackChange"/>
    <xsd:element name="customXmlMoveToRangeEnd" type="CT_Markup"/>
    <xsd:element name="ins" type="CT_RunTrackChange"/>
    <xsd:element name="del" type="CT_RunTrackChange"/>
    <xsd:element name="moveFrom" type="CT_RunTrackChange"/>
    <xsd:element name="moveTo" type="CT_RunTrackChange"/>
    <xsd:group ref="EG_MathContent" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:choice>
</xsd:group>

<xsd:complexType name="CT_R">
  <xsd:sequence>
    <xsd:group ref="EG_RPr" minOccurs="0"/>
    <xsd:group ref="EG_RunInnerContent" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
  <xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidDel" type="ST_LongHexNumber"/>
  <xsd:attribute name="rsidR" type="ST_LongHexNumber"/>
</xsd:complexType>

<xsd:group name="EG_RunInnerContent">
  <xsd:choice>
    <xsd:element name="t" type="CT_Text"/>
    <xsd:element name="tab" type="CT_Empty"/>
    <xsd:element name="br" type="CT_Br"/>
    <xsd:element name="cr" type="CT_Empty"/>
    <xsd:element name="sym" type="CT_Sym"/>
    <xsd:element name="ptab" type="CT_PTab"/>
    <xsd:element name="softHyphen" type="CT_Empty"/>
    <xsd:element name="contentPart" type="CT_Rel"/>
    <xsd:element name="noBreakHyphen" type="CT_Empty"/>
    <xsd:element name="fldChar" type="CT_FldChar"/>
    <xsd:element name="instrText" type="CT_Text"/>
    <xsd:element name="dayShort" type="CT_Empty"/>
    <xsd:element name="monthShort" type="CT_Empty"/>
    <xsd:element name="yearShort" type="CT_Empty"/>
    <xsd:element name="dayLong" type="CT_Empty"/>
    <xsd:element name="monthLong" type="CT_Empty"/>
    <xsd:element name="yearLong" type="CT_Empty"/>
    <xsd:element name="annotationRef" type="CT_Empty"/>
    <xsd:element name="footnoteReference" type="CT_FtnEdnRef"/>
    <xsd:element name="footnoteRef" type="CT_Empty"/>
    <xsd:element name="endnoteReference" type="CT_FtnEdnRef"/>
    <xsd:element name="endnoteRef" type="CT_Empty"/>
    <xsd:element name="commentReference" type="CT_Markup"/>
    <xsd:element name="separator" type="CT_Empty"/>
    <xsd:element name="continuationSeparator" type="CT_Empty"/>
    <xsd:element name="pgNum" type="CT_Empty"/>
    <xsd:element name="object" type="CT_Object"/>
    <xsd:element name="pict" type="CT_Picture"/>
    <xsd:element name="ruby" type="CT_Ruby"/>
    <xsd:element name="drawing" type="CT_Drawing"/>
    <xsd:element name="delText" type="CT_Text"/>
    <xsd:element name="delInstrText" type="CT_Text"/>
    <xsd:element name="lastRenderedPageBreak" type="CT_Empty"/>
  </xsd:choice>
</xsd:group>

<xsd:complexType name="CT_Text">
  <xsd:simpleContent>
    <xsd:extension base="s:ST_String">
      <xsd:attribute ref="xml:space" use="optional"/>
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>
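To make the CT_P, CT_R and CT_Text excerpts more concrete, here is a rough sketch of writing and reading run text with the standard library's ElementTree. It is not python-docx's implementation: the helper names (`w`, `add_text`, `run_text`, `paragraph_text`) are hypothetical, and the mapping of `w:tab` to a tab character and of `w:br`/`w:cr` to a newline is an assumption made for illustration. The optional `xml:space` attribute from CT_Text is set to `preserve` whenever the text starts or ends with whitespace, since such whitespace may otherwise be dropped by a consumer.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# The xml: prefix is always bound to this namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def w(local):
    """Hypothetical helper: Clark-notation tag name in the w: namespace."""
    return "{%s}%s" % (W_NS, local)


def add_text(run, text):
    """Append a w:t child to `run`, flagging edge whitespace for preservation."""
    t = ET.SubElement(run, w("t"))
    t.text = text
    if text != text.strip():
        t.set(XML_SPACE, "preserve")
    return t


def run_text(run):
    """Concatenate a textual equivalent for the children of a w:r element."""
    parts = []
    for child in run:
        if child.tag == w("t"):
            parts.append(child.text or "")
        elif child.tag == w("tab"):
            parts.append("\t")
        elif child.tag in (w("br"), w("cr")):
            # Assumption: every break is rendered as a newline, whatever its type.
            parts.append("\n")
        # Other EG_RunInnerContent members (drawing, fldChar, ...) carry no text here.
    return "".join(parts)


def paragraph_text(p):
    """Join the text of the w:r elements that are direct children of `p`."""
    return "".join(run_text(r) for r in p.findall(w("r")))


# Usage: build <w:p><w:r><w:t>Hello </w:t><w:tab/><w:t>world</w:t></w:r></w:p>
p = ET.Element(w("p"))
r = ET.SubElement(p, w("r"))
first = add_text(r, "Hello ")
ET.SubElement(r, w("tab"))
second = add_text(r, "world")
print(repr(paragraph_text(p)))
```

Runs nested inside other EG_PContent members such as `w:hyperlink` are skipped by `paragraph_text()` in this sketch, because only direct `w:r` children are visited.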
Index
Symbols
A
add_break() (docx.text.run.Run method), 43
add_column() (docx.table.Table method), 46
add_heading() (docx.document.Document method), 33
add_latent_style() (docx.styles.latent.LatentStyles method), 40
add_page_break() (docx.document.Document method), 33
add_paragraph() (docx.document.Document method), 33
add_paragraph() (docx.table._Cell method), 47
add_picture() (docx.document.Document method), 33
add_picture() (docx.text.run.Run method), 43
add_row() (docx.table.Table method), 46
add_run() (docx.text.paragraph.Paragraph method), 41
add_section() (docx.document.Document method), 33
add_style() (docx.styles.styles.Styles method), 35
add_tab() (docx.text.run.Run method), 43
add_table() (docx.document.Document method), 34
add_table() (docx.table._Cell method), 47
add_text() (docx.text.run.Run method), 43
alignment (docx.table.Table attribute), 46
alignment (docx.text.paragraph.Paragraph attribute), 41
alignment (docx.text.parfmt.ParagraphFormat attribute), 42
all_caps (docx.text.run.Font attribute), 44
author (docx.opc.coreprops.CoreProperties attribute), 34
autofit (docx.table.Table attribute), 46
C
category (docx.opc.coreprops.CoreProperties attribute), 35
cell() (docx.table.Table method), 46
cells (docx.table._Column attribute), 48
cells (docx.table._Row attribute), 47
clear() (docx.text.paragraph.Paragraph method), 41
clear() (docx.text.run.Run method), 43
Cm (class in docx.shared), 51
cm (docx.shared.Length attribute), 51
color (docx.text.run.Font attribute), 44
ColorFormat (class in docx.dml.color), 50
column_cells() (docx.table.Table method), 46
columns (docx.table.Table attribute), 46
comments (docx.opc.coreprops.CoreProperties attribute), 35
complex_script (docx.text.run.Font attribute), 44
content_status (docx.opc.coreprops.CoreProperties attribute), 35
core_properties (docx.document.Document attribute), 34
CoreProperties (class in docx.opc.coreprops), 34
created (docx.opc.coreprops.CoreProperties attribute), 35
cs_bold (docx.text.run.Font attribute), 44
cs_italic (docx.text.run.Font attribute), 44
E
element (docx.styles.latent._LatentStyle attribute), 40
element (docx.styles.latent.LatentStyles attribute), 40
element (docx.styles.style.BaseStyle attribute), 36
element (docx.styles.styles.Styles attribute), 36
emboss (docx.text.run.Font attribute), 44
Emu (class in docx.shared), 51
emu (docx.shared.Length attribute), 51
K
keep_together (docx.text.parfmt.ParagraphFormat attribute), 42
keep_with_next (docx.text.parfmt.ParagraphFormat attribute), 42
keywords (docx.opc.coreprops.CoreProperties attribute), 35
O
orientation (docx.section.Section attribute), 49
outline (docx.text.run.Font attribute), 45
P
page_break_before (docx.text.parfmt.ParagraphFormat attribute), 42
page_height (docx.section.Section attribute), 49
page_width (docx.section.Section attribute), 49
Paragraph (class in docx.text.paragraph), 41
paragraph_format (docx.styles.style._ParagraphStyle attribute), 38
paragraph_format (docx.styles.style._TableStyle attribute), 39
paragraph_format (docx.text.paragraph.Paragraph attribute), 41
ParagraphFormat (class in docx.text.parfmt), 42
paragraphs (docx.document.Document attribute), 34
paragraphs (docx.table._Cell attribute), 47
part (docx.document.Document attribute), 34
priority (docx.styles.latent._LatentStyle attribute), 41
priority (docx.styles.style._CharacterStyle attribute), 37
priority (docx.styles.style._ParagraphStyle attribute), 38
priority (docx.styles.style._TableStyle attribute), 39
priority (docx.styles.style.BaseStyle attribute), 36
Pt (class in docx.shared), 51
pt (docx.shared.Length attribute), 51
Q
quick_style (docx.styles.latent._LatentStyle attribute), 41
quick_style (docx.styles.style._CharacterStyle attribute), 37
quick_style (docx.styles.style._ParagraphStyle attribute), 38
quick_style (docx.styles.style._TableStyle attribute), 39
quick_style (docx.styles.style.BaseStyle attribute), 36
T
Table (class in docx.table), 46
table (docx.table._Column attribute), 48
table (docx.table._Columns attribute), 48
table (docx.table._Row attribute), 48
table (docx.table._Rows attribute), 48
table_direction (docx.table.Table attribute), 47
tables (docx.document.Document attribute), 34
tables (docx.table._Cell attribute), 47
text (docx.table._Cell attribute), 47
text (docx.text.paragraph.Paragraph attribute), 42
text (docx.text.run.Run attribute), 44
theme_color (docx.dml.color.ColorFormat attribute), 50
title (docx.opc.coreprops.CoreProperties attribute), 35
U
underline (docx.text.run.Font attribute), 46
underline (docx.text.run.Run attribute), 44
unhide_when_used (docx.styles.latent._LatentStyle attribute), 41
unhide_when_used (docx.styles.style._CharacterStyle attribute), 37
unhide_when_used (docx.styles.style._ParagraphStyle attribute), 38
unhide_when_used (docx.styles.style._TableStyle attribute), 39
unhide_when_used (docx.styles.style.BaseStyle attribute), 36
V
version (docx.opc.coreprops.CoreProperties attribute), 35
W
web_hidden (docx.text.run.Font attribute), 46
widow_control (docx.text.parfmt.ParagraphFormat attribute), 43
width (docx.shape.InlineShape attribute), 50
width (docx.table._Cell attribute), 47
width (docx.table._Column attribute), 48