[TAG] linux command to read .odt ?

J. Bakshi j.bakshi at unlimitedmail.org
Tue Jun 9 19:41:52 MSD 2009


Dear all,

Just as catdoc reads .doc files, is there any command to read .odt files
from the command line? I did a lot of googling but could not find any
command like catdoc.

On the other hand, I have found that .odt is actually stored in zip
format. So I ran unzip on an .odt file and it successfully
extracted a lot of files, including "content.xml", which actually holds
the content :-)
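
The zip observation above can be checked without unpacking anything to disk. A minimal sketch, assuming Python with only its standard library is available; the function name list_odt_members is made up for illustration:

```python
import zipfile


def list_odt_members(path):
    """Return the names of the files stored inside an .odt archive.

    `path` may be a filename or an open binary file object.
    """
    # An .odt file is an ordinary zip archive, so zipfile can open it directly.
    with zipfile.ZipFile(path) as odt:
        return odt.namelist()
```

Calling list_odt_members("test.odt") on a document like the one described here should return a list that includes "content.xml" (test.odt is a placeholder filename).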

Is there any tool which can extract the plain text from this .xml file?

Please suggest.


The content.xml looks like this:

````````````````````````````````````
<office:document-content office:version="1.2">
<office:scripts/>
<office:font-face-decls>
<style:font-face style:name="Times New Roman" svg:font-family="'Times New Roman'" style:font-family-generic="roman" style:font-pitch="variable"/>
<style:font-face style:name="Arial" svg:font-family="Arial" style:font-family-generic="swiss" style:font-pitch="variable"/>
<style:font-face style:name="Arial1" svg:font-family="Arial" style:font-family-generic="system" style:font-pitch="variable"/>
</office:font-face-decls>
<office:automatic-styles/>
<office:body>
<office:text>
<text:sequence-decls>
<text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
<text:sequence-decl text:display-outline-level="0" text:name="Table"/>
<text:sequence-decl text:display-outline-level="0" text:name="Text"/>
<text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
</text:sequence-decls>
<text:p text:style-name="Standard">This is a test </text:p>
</office:text>
</office:body>
</office:document-content>
``````````````````````````````````````````````````````

Note the line that holds the actual content:

````````````````````````````````````
<text:p text:style-name="Standard">This is a test </text:p>
`````````````````````````````````````````````````````
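
One possible way to pull that text out, sketched in Python with only the standard library. This is an illustration under assumptions, not a tested replacement for catdoc: it assumes the real content.xml declares the usual OpenDocument "text" namespace (the excerpt above omits the xmlns declarations), and the function name odt_to_text is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI behind the "text:" prefix in OpenDocument files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
PARAGRAPH_TAGS = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}


def odt_to_text(path):
    """Return the paragraphs and headings of an .odt file as plain text."""
    # Read content.xml straight out of the zip archive, without unpacking it.
    with zipfile.ZipFile(path) as odt:
        root = ET.fromstring(odt.read("content.xml"))

    lines = []
    # Walk all elements in document order and keep <text:p> and <text:h>.
    for elem in root.iter():
        if elem.tag in PARAGRAPH_TAGS:
            # itertext() also collects text inside nested elements
            # such as <text:span>.
            lines.append("".join(elem.itertext()))
    return "\n".join(lines)
```

For a document like the one above, print(odt_to_text("test.odt")) should print "This is a test " (test.odt is a placeholder filename). The sketch drops tab, line-break and repeated-space elements, and a paragraph nested inside another paragraph (for example in a text frame) would appear twice, so treat it as a starting point only.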



